informatique:ai_coding:gpu_bench
Differences
Below are the differences between two revisions of the page.
| informatique:ai_coding:gpu_bench [02/12/2025 20:26] – [GPU Bench] cyrille | informatique:ai_coding:gpu_bench [03/12/2025 07:56] (current version) – [EuroLLM-9B-Instruct-Q4_0] cyrille | ||
|---|---|---|---|
| Line 9: | Line 9: | ||
| llama.cpp benchmark: | llama.cpp benchmark: | ||
| - | * tg128, tg256, tg512 : '' | + | * Text generation: |
| - | * b128, b256, b512 : '' | + | * Prompt processing: |
| ^ models | ^ models | ||
| Line 17: | Line 17: | ||
| | //size: 5.07 GiB// | tg256 | ... | 57.61 | 73.32 | | | //size: 5.07 GiB// | tg256 | ... | 57.61 | 73.32 | | ||
| | | tg512 | ... | 56.20 | 71.80 | | | | tg512 | ... | 56.20 | 71.80 | | ||
| - | | | b128 | ... | | | + | | | b128 | ... | 1825.17 |
| - | | | b256 | ... | | | + | | | b256 | ... | 1924.10 |
| - | | | b512 | ... | | | + | | | b512 | ... | 1959.18 |
| | Qwen2.5-coder-7b-instruct-q8_0 | | Qwen2.5-coder-7b-instruct-q8_0 | ||
| | //size: 7.54 GiB// | tg256 | ... | 41.38 | 50.33 | | | //size: 7.54 GiB// | tg256 | ... | 41.38 | 50.33 | | ||
| | | tg512 | ... | 40.70 | 49.62 | | | | tg512 | ... | 40.70 | 49.62 | | ||
| - | | | b128 | ... | | + | | | b128 | 13.98 | 1952.96 | |
| - | | | b256 | ... | | | + | | | b256 | ... | 2054.09 |
| - | | | b512 | ... | | | + | | | b512 | ... | 2093.21 |
| | EuroLLM-9B-Instruct-Q4_0 | | EuroLLM-9B-Instruct-Q4_0 | ||
| | //size: 4.94 GiB// | tg256 | ... | 55.96 | 71.15 | | | //size: 4.94 GiB// | tg256 | ... | 55.96 | 71.15 | | ||
| | | tg512 | ... | 53.87 | 69.45 | | | | tg512 | ... | 53.87 | 69.45 | | ||
| - | | | b128 | ... | | | | + | | | b128 | ... | 1433.95 |
| - | | | b256 | ... | | | | + | | | b256 | ... | 1535.06 |
| - | | | b512 | ... | | | | + | | | b512 | ... | 1559.88 |
| - | | Qwen3-14B-UD-Q5_K_XL | + | | Qwen3-14B-UD-Q5_K_XL |
| - | | //size: 9.82 GiB// | tg256 | ... | | 38.17 | | + | | //size: 9.82 GiB// | tg256 | ... | 29.97 | 38.17 | |
| - | | | tg512 | ... | | 37.30 | | + | | | tg512 | ... | 29.25 | 37.30 | |
| - | | | b128 | ... | | | | + | | | b128 | ... | |
| - | | | b256 | ... | | | | + | | | b256 | ... | |
| - | | | b512 | ... | | | | + | | | b512 | ... | |
| ===== Intel® Core™ i7-1360P 13th Gen ===== | ===== Intel® Core™ i7-1360P 13th Gen ===== | ||
| Line 75: | Line 75: | ||
| </ | </ | ||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | </ | ||
| - | |||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | </ | ||
| Line 125: | Line 96: | ||
| </ | </ | ||
| - | === Qwen2.5-coder-7b-instruct-q8_0 === | ||
| - | < | ||
| - | $ ~/ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | | qwen2 7B Q8_0 | 7.54 GiB | 7.62 B | CUDA | ||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
| - | === EuroLLM-9B-Instruct-Q4_0 === | ||
| - | |||
| - | < | ||
| - | $ ./ | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | ggml_cuda_init: | ||
| - | Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes | ||
| - | | model | size | | ||
| - | | ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | | llama ?B Q4_0 | 4.94 GiB | 9.15 B | CUDA | ||
| - | |||
| - | build: 3f3a4fb9c (7130) | ||
| - | </ | ||
| ===== Translation ===== | ===== Translation ===== | ||
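
To compare the two populated result columns in the table above, the ratio of their tokens-per-second figures can be computed directly. The following is a minimal sketch and not part of the original page: the column headers are truncated in this diff view, so the columns are referred to only by position, and the name `speedup` is a hypothetical helper introduced for illustration. Only the tg256 values visible in the table are used, and the first entry is keyed by its size because its model name is not shown in this view.

```python
# tg256 throughput (tokens per second) copied from the table above.
# Each tuple is (first populated column, second populated column).
# The column headers are truncated in the diff, so no hardware name is assumed.
tg256 = {
    "model of size 5.07 GiB": (57.61, 73.32),
    "Qwen2.5-coder-7b-instruct-q8_0": (41.38, 50.33),
    "EuroLLM-9B-Instruct-Q4_0": (55.96, 71.15),
    "Qwen3-14B-UD-Q5_K_XL": (29.97, 38.17),
}


def speedup(first: float, second: float) -> float:
    """Return how many times faster the second column is than the first."""
    return second / first


# Ratio per model, rounded to two decimals for readability.
ratios = {name: round(speedup(a, b), 2) for name, (a, b) in tg256.items()}

for name, ratio in ratios.items():
    print(f"{name}: x{ratio}")
```

On these figures the second column comes out roughly 1.2 to 1.3 times faster than the first for text generation; the prompt-processing rows cannot be compared the same way because only one column is filled in.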
informatique/ai_coding/gpu_bench.1764703560.txt.gz · Last modified: by cyrille
