informatique:ai_coding:gpu_bench
Differences
Below are the differences between two revisions of the page.
| informatique:ai_coding:gpu_bench [02/12/2025 20:06] – [GPU Bench] cyrille | informatique:ai_coding:gpu_bench [03/12/2025 07:56] (current version) – [EuroLLM-9B-Instruct-Q4_0] cyrille | ||
|---|---|---|---|
| Line 7: | Line 7: | ||
| | RTX 5060 Ti (16 GB) | ~759 TOPS | ~380 TOPS | Blackwell | | | RTX 5060 Ti (16 GB) | ~759 TOPS | ~380 TOPS | Blackwell | | ||
| - | Test summary: | + | llama.cpp benchmark |
| + | |||
| + | * Text generation: tg128, tg256, tg512: '' | ||
| + | * Prompt processing: b128, b256, b512: '' | ||
| ^ models | ^ models | ||
| ^ ^ ^ i7-1360P ^ RTX 3060 ^ RTX 5060 Ti ^ | ^ ^ ^ i7-1360P ^ RTX 3060 ^ RTX 5060 Ti ^ | ||
| | Qwen2.5-coder-7b-instruct-q5_k_m | tg128 | 5.47 | 57.65 | 73.54 | | | Qwen2.5-coder-7b-instruct-q5_k_m | tg128 | 5.47 | 57.65 | 73.54 | | ||
| - | | | tg256 | ... | 57.61 | 73.32 | | + | | //size: 5.07 GiB// | tg256 | ... | 57.61 | 73.32 | |
| | | tg512 | ... | 56.20 | 71.80 | | | | tg512 | ... | 56.20 | 71.80 | | ||
| + | | | b128 | ... | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| | Qwen2.5-coder-7b-instruct-q8_0 | | Qwen2.5-coder-7b-instruct-q8_0 | ||
| - | | | tg256 | ... | 41.38 | 50.33 | | + | | //size: 7.54 GiB// | tg256 | ... | 41.38 | 50.33 | |
| | | tg512 | ... | 40.70 | 49.62 | | | | tg512 | ... | 40.70 | 49.62 | | ||
| + | | | b128 | 13.98 | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| | EuroLLM-9B-Instruct-Q4_0 | | EuroLLM-9B-Instruct-Q4_0 | ||
| - | | | tg256 | ... | 55.96 | 71.15 | | + | | //size: 4.94 GiB// | tg256 | ... | 55.96 | 71.15 | |
| | | tg512 | ... | 53.87 | 69.45 | | | | tg512 | ... | 53.87 | 69.45 | | ||
| + | | | b128 | ... | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| + | | Qwen3-14B-UD-Q5_K_XL | ||
| + | | //size: 9.82 GiB// | tg256 | ... | 29.97 | 38.17 | | ||
| + | | | tg512 | ... | 29.25 | 37.30 | | ||
| + | | | b128 | ... | 903.97 | CUDA error | | ||
| + | | | b256 | ... | 951.71 | ... | | ||
| + | | | b512 | ... | 963.76 | ... | | ||
| ===== Intel® Core™ i7-1360P 13th Gen ===== | ===== Intel® Core™ i7-1360P 13th Gen ===== | ||
| Line 57: | Line 75: | ||
| </ | </ | ||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | </ | ||
| - | |||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | </ | ||
| Line 107: | Line 96: | ||
| </ | </ | ||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | < | ||
| - | $ ~/ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | $ ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | |||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
| ===== Translation ===== | ===== Translation ===== | ||
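To make the benchmark table easier to read, the text-generation figures can be turned into a relative speedup of the RTX 5060 Ti over the RTX 3060. The sketch below is only an illustration: the tokens/s values are copied from the tg256 rows of the table, while the names `tg256` and `speedup` are hypothetical helpers that do not appear on the original page.

```python
# Tokens/s copied from the tg256 rows of the table above.
# Each tuple is (RTX 3060, RTX 5060 Ti); the dict name is illustrative only.
tg256 = {
    "Qwen2.5-coder-7b-instruct-q5_k_m": (57.61, 73.32),
    "Qwen2.5-coder-7b-instruct-q8_0": (41.38, 50.33),
    "EuroLLM-9B-Instruct-Q4_0": (55.96, 71.15),
    "Qwen3-14B-UD-Q5_K_XL": (29.97, 38.17),
}


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return candidate / baseline


# Relative text-generation speedup of the RTX 5060 Ti over the RTX 3060.
speedups = {
    model: speedup(rtx3060, rtx5060ti)
    for model, (rtx3060, rtx5060ti) in tg256.items()
}

for model, ratio in speedups.items():
    print(f"{model}: x{ratio:.2f}")
```

With these figures, every model lands between 1.2x and 1.3x, so for text generation the gain from the newer card looks fairly uniform across the quantizations tested here.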
informatique/ai_coding/gpu_bench.1764702382.txt.gz · Last modified: by cyrille
