====== GPU bench ======

| RTX 3060 (12 GB) | ~120 TOPS | ~60 TOPS | Ampere |
| RTX 5060 Ti (16 GB) | ~759 TOPS | ~380 TOPS | Blackwell |
| + | |||
| + | Bench llama.cpp : | ||
| + | |||
| + | * Text generation: tg128, tg256, tg512 : '' | ||
| + | * Prompt processing: b128, b256, b512 : '' | ||
| + | |||
| + | ^ models | ||
| + | ^ ^ ^ i7-1360P ^ RTX 3060 ^ RTX 5060 Ti ^ | ||
| + | | Qwen2.5-coder-7b-instruct-q5_k_m | tg128 | 5.47 | 57.65 | 73.54 | | ||
| + | | //size: 5.07 GiB// | tg256 | ... | 57.61 | 73.32 | | ||
| + | | | tg512 | ... | 56.20 | 71.80 | | ||
| + | | | b128 | ... | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| + | | Qwen2.5-coder-7b-instruct-q8_0 | ||
| + | | //size: 7.54 GiB// | tg256 | ... | 41.38 | 50.33 | | ||
| + | | | tg512 | ... | 40.70 | 49.62 | | ||
| + | | | b128 | 13.98 | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| + | | EuroLLM-9B-Instruct-Q4_0 | ||
| + | | //size: 4.94 GiB// | tg256 | ... | 55.96 | 71.15 | | ||
| + | | | tg512 | ... | 53.87 | 69.45 | | ||
| + | | | b128 | ... | | ||
| + | | | b256 | ... | | ||
| + | | | b512 | ... | | ||
| + | | Qwen3-14B-UD-Q5_K_XL | ||
| + | | //size: 9.82 GiB// | tg256 | ... | 29.97 | 38.17 | | ||
| + | | | tg512 | ... | 29.25 | 37.30 | | ||
| + | | | b128 | ... | 903.97 | CUDA error | | ||
| + | | | b256 | ... | 951.71 | ... | | ||
| + | | | b512 | ... | 963.76 | ... | | ||
===== Intel® Core™ i7-1360P 13th Gen =====
<code>
| qwen2 7B Q5_K - Medium | ...
</code>
| - | |||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | </ | ||
| - | |||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | </ | ||
| - | |||
| - | Avec llama-server : | ||
| - | * --ctx-size 74000 Ok | ||
| - | * --ctx-size 80000 Failed | ||
| </ | </ | ||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | < | ||
| - | $ ~/ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | $ ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | |||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
===== Translation =====