informatique:egpu

Differences

Below are the differences between two revisions of the page.

informatique:egpu [08/12/2025 15:25] – [PNY OC 16GB Geforce RTX 5060 Ti] cyrille
informatique:egpu [15/01/2026 14:12] (current version) – [eGpu] cyrille
Line 14: Line 14:
     * purchased
     * ✅ RTX 3060 ok
-    * ✗ RTX 5060 more or less ok ([[/informatique/ai_coding/gpu_bench|crashes with some models]])
 +    * ❌ RTX 5060 more or less ok ([[informatique:ai_lm:gpu_bench|crashes with some models]])
-  * [[https://fr.aliexpress.com/item/1005007990218564.html?|WKG-L19C70]] Wikingoo
-    * The seller says it will work better than the L17 with the RTX 5060 ...
   * [[http://www.cyidpcie.cn/page/HL7.html|TB3-HL7]]
     * purchased
Line 22: Line 20:
     * ✅ RTX 3060 ok
     * ❌ RTX 5060 failed
-  * [[https://fr.aliexpress.com/item/1005008424134383.html|ADT UT4G-BK7]] TB3/TB4 to PCIe x16, PCIe 4.0 x4 GPU Dock
 +  * [[https://fr.aliexpress.com/item/1005007990218564.html?|WKG-L19C70]] Wikingoo
 +    * The seller says it will work better than the L17 with the RTX 5060 ...
 +    * But buyers report disconnections: https://community.frame.work/t/egpu-disconnects-fw-13-amd/73265
 +  * ADT UT4G
 +    * USB4/TB3/TB4 to PCIe x16 adapter for eGPU
 +    * https://www.adtlink.cn/en/product/UT4G.html
 +      * $128 https://www.adt.link/product/UT4G-Shop.html
 +    * [[https://fr.aliexpress.com/item/1005008424134383.html|ADT UT4G-BK7]] TB3/TB4 to PCIe x16, PCIe 4.0 x4 GPU Dock
 +  * AOOSTAR
 +    * AG02 Oculink/USB4, with PSU
 +      * $219 https://aoostar.com/products/aoostar-ag01-egpu-dock-with-oculink-port-built-in-huntkey-400w-power-supply-supports-tgx-interface-hot-swap
 +    * AOOSTAR EG02 TB5+Oculink
 +      * $219 https://aoostar.com/collections/egpu-series/products/aoosatr-eg02-tb5-oculink
 +  * Minisforum DEG2 OCulink Thunderbolt 5 eGPU Dock
 +    * Thunderbolt 5 Port | Up to 80 Gbps, OCuLink (PCIe 4.0 ×4) | Up to 64 Gbps, Built-in M.2 2280 SSD, Compatible with ATX / SFX PSU
 +    * $259 https://www.minisforum.com/fr/products/deg2 (add to cart, "ajouter au panier", to see the price)
 +  * EXP-GDC TH5P4
 +    * ???
  
 In the end, only small models with a small context can be run ...
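
As a rough check of the OCuLink figure quoted for the Minisforum DEG2 above, the sketch below works out PCIe link bandwidth from the per-lane rate and the lane count. It assumes the 128b/130b line encoding used since PCIe 3.0 and ignores protocol overhead, so real throughput will be lower. The function name ''pcie_bandwidth'' is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of any tool mentioned on this page.
# $1 = per-lane rate in GT/s (16 for PCIe 4.0), $2 = number of lanes.
# Assumption: 128b/130b encoding (PCIe 3.0 and later), no protocol overhead.
pcie_bandwidth() {
    awk -v rate="$1" -v lanes="$2" 'BEGIN {
        raw = rate * lanes           # raw signalling rate in Gbit/s
        usable = raw * 128 / 130     # after line encoding
        printf "%.1f Gbit/s raw, %.1f Gbit/s usable, %.2f GB/s\n", raw, usable, usable / 8
    }'
}

pcie_bandwidth 16 4
# prints: 64.0 Gbit/s raw, 63.0 Gbit/s usable, 7.88 GB/s
```

So the advertised 64 Gbps for PCIe 4.0 x4 is the raw signalling rate; a little under 8 GB/s is the ceiling for moving model weights to the GPU over that link.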
Line 62: Line 77:
 ggml_cuda_init: failed to initialize CUDA: CUDA driver version is insufficient for CUDA runtime version
 </code>
 +
 +==== nvidia-uvm ====
 +
 +<code>
 +$ modinfo nvidia-uvm
 +
 +filename:       /lib/modules/6.14.0-37-generic/updates/dkms/nvidia-uvm.ko.zst
 +version:        580.126.09
 +supported:      external
 +license:        Dual MIT/GPL
 +srcversion:     B7E9DECF7BD1D315EBCCCF0
 +depends:        nvidia
 +name:           nvidia_uvm
 +retpoline:      Y
 +vermagic:       6.14.0-37-generic SMP preempt mod_unload modversions 
 +sig_id:         PKCS#7
 +signer:         NS5x-NS7xAU Secure Boot Module Signature key
 +sig_key:        3B:82:8F:E4:B9:99:2E:1F:E5:76:9C:33:AC:26:A9:F0:0A:1A:E3:46
 +sig_hashalgo:   sha512
 +signature:      66:E9:9A:75:7C:2D:5B:1C:56:B9:CD:CE:E4:64:3B:5F:66:BB:F3:B2:
 + 8F:E8:34:44:62:FD:02:32:A3:27:A8:EA:20:BB:BA:87:6F:F7:F8:6E:
 + F5:27:67:07:97:55:39:39:B2:7E:DE:01:F1:E5:64:AF:3A:29:98:90:
 + 8D:A3:7A:0C:D9:D2:60:A8:15:C1:55:6E:F1:53:FE:85:D2:07:54:12:
 + B0:A4:D5:76:96:D4:A9:5F:85:B4:75:18:B4:38:A2:8B:15:3D:8C:8B:
 + F3:0A:AA:1E:F6:81:F1:27:CC:1E:22:EC:E6:72:BC:DC:3A:FD:39:2F:
 + F4:BF:DE:47:38:7E:1D:FE:04:D1:29:24:AD:CB:46:44:7F:4F:62:67:
 + 38:FA:96:10:58:47:02:C8:65:05:67:7A:53:A6:70:76:A1:10:39:56:
 + 0B:B3:5F:98:E2:D3:F1:FC:7E:85:02:E0:37:04:E4:91:E6:7D:92:25:
 + FE:3E:CD:0F:E1:26:B8:78:FA:C6:DB:AD:AA:CB:A9:22:2E:E7:20:DA:
 + 91:46:FC:14:EB:54:54:B4:AF:1D:66:72:9B:C2:99:18:1B:57:77:14:
 + FD:65:14:B0:96:A5:0A:78:A4:AA:E2:F3:49:96:85:53:A3:28:50:C9:
 + E4:74:89:65:C7:24:19:BC:AF:4C:15:5E:55:8C:53:CC
 +parm:           uvm_conf_computing_channel_iv_rotation_limit:ulong
 +parm:           uvm_ats_mode:Set to 0 to disable ATS (Address Translation Services). Any other value is ignored. Has no effect unless the platform supports ATS. (int)
 +parm:           uvm_perf_prefetch_enable:uint
 +parm:           uvm_perf_prefetch_threshold:uint
 +parm:           uvm_perf_prefetch_min_faults:uint
 +parm:           uvm_perf_thrashing_enable:uint
 +parm:           uvm_perf_thrashing_threshold:uint
 +parm:           uvm_perf_thrashing_pin_threshold:uint
 +parm:           uvm_perf_thrashing_lapse_usec:uint
 +parm:           uvm_perf_thrashing_nap:uint
 +parm:           uvm_perf_thrashing_epoch:uint
 +parm:           uvm_perf_thrashing_pin:uint
 +parm:           uvm_perf_thrashing_max_resets:uint
 +parm:           uvm_perf_map_remote_on_native_atomics_fault:uint
 +parm:           uvm_disable_hmm:Force-disable HMM functionality in the UVM driver. Default: false (HMM is enabled if possible). However, even with uvm_disable_hmm=false, HMM will not be enabled if is not supported in this driver build configuration, or if ATS settings conflict with HMM. (bool)
 +parm:           uvm_perf_migrate_cpu_preunmap_enable:int
 +parm:           uvm_perf_migrate_cpu_preunmap_block_order:uint
 +parm:           uvm_global_oversubscription:Enable (1) or disable (0) global oversubscription support. (int)
 +parm:           uvm_perf_pma_batch_nonpinned_order:uint
 +parm:           uvm_cpu_chunk_allocation_sizes:OR'ed value of all CPU chunk allocation sizes. (uint)
 +parm:           uvm_leak_checker:Enable uvm memory leak checking. 0 = disabled, 1 = count total bytes allocated and freed, 2 = per-allocation origin tracking. (int)
 +parm:           uvm_force_prefetch_fault_support:uint
 +parm:           uvm_debug_enable_push_desc:Enable push description tracking (uint)
 +parm:           uvm_debug_enable_push_acquire_info:Enable push acquire information tracking (uint)
 +parm:           uvm_page_table_location:Set the location for UVM-allocated page tables. Choices are: vid, sys. (charp)
 +parm:           uvm_perf_access_counter_migration_enable:Whether access counters will trigger migrations.Valid values: <= -1 (default policy), 0 (off), >= 1 (on) (int)
 +parm:           uvm_perf_access_counter_batch_count:uint
 +parm:           uvm_perf_access_counter_threshold:Number of remote accesses on a region required to trigger a notification.Valid values: [1, 65535] (uint)
 +parm:           uvm_perf_reenable_prefetch_faults_lapse_msec:uint
 +parm:           uvm_perf_fault_batch_count:uint
 +parm:           uvm_perf_fault_replay_policy:uint
 +parm:           uvm_perf_fault_replay_update_put_ratio:uint
 +parm:           uvm_perf_fault_max_batches_per_service:uint
 +parm:           uvm_perf_fault_max_throttle_per_service:uint
 +parm:           uvm_perf_fault_coalesce:uint
 +parm:           uvm_fault_force_sysmem:Force (1) using sysmem storage for pages that faulted. Default: 0. (int)
 +parm:           uvm_perf_map_remote_on_eviction:int
 +parm:           uvm_block_cpu_to_cpu_copy_with_ce:Use GPU CEs for CPU-to-CPU migrations. (int)
 +parm:           uvm_exp_gpu_cache_peermem:Force caching for mappings to peer memory. This is an experimental parameter that may cause correctness issues if used. (uint)
 +parm:           uvm_exp_gpu_cache_sysmem:Force caching for mappings to system memory. This is an experimental parameter that may cause correctness issues if used. (uint)
 +parm:           uvm_downgrade_force_membar_sys:Force all TLB invalidation downgrades to use MEMBAR_SYS (uint)
 +parm:           uvm_channel_num_gpfifo_entries:uint
 +parm:           uvm_channel_gpfifo_loc:charp
 +parm:           uvm_channel_gpput_loc:charp
 +parm:           uvm_channel_pushbuffer_loc:charp
 +parm:           uvm_enable_va_space_mm:Set to 0 to disable UVM from using mmu_notifiers to create an association between a UVM VA space and a process. This will also disable pageable memory access via either ATS or HMM. (int)
 +parm:           uvm_enable_debug_procfs:Enable debug procfs entries in /proc/driver/nvidia-uvm (int)
 +parm:           uvm_peer_copy:Choose the addressing mode for peer copying, options: phys [default] or virt. Valid for Ampere+ GPUs. (charp)
 +parm:           uvm_debug_prints:Enable uvm debug prints. (int)
 +parm:           uvm_enable_builtin_tests:Enable the UVM built-in tests. (This is a security risk) (int)
 +parm:           uvm_release_asserts:Enable uvm asserts included in release builds. (int)
 +parm:           uvm_release_asserts_dump_stack:dump_stack() on failed UVM release asserts. (int)
 +parm:           uvm_release_asserts_set_global_error:Set UVM global fatal error on failed release asserts. (int)
 +
 +$ systool -m nvidia_uvm -v
 +
 +Module = "nvidia_uvm"
 +  Attributes:
 +    coresize            = "2154496"
 +    initsize            = "0"
 +    initstate           = "live"
 +    refcnt              = "4"
 +    srcversion          = "B7E9DECF7BD1D315EBCCCF0"
 +    taint               = "OE"
 +    uevent              = <store method only>
 +    version             = "580.126.09"
 +  Parameters:
 +    uvm_ats_mode        = "1"
 +    uvm_block_cpu_to_cpu_copy_with_ce= "0"
 +    uvm_channel_gpfifo_loc= "auto"
 +    uvm_channel_gpput_loc= "auto"
 +    uvm_channel_num_gpfifo_entries= "1024"
 +    uvm_channel_pushbuffer_loc= "auto"
 +    uvm_conf_computing_channel_iv_rotation_limit= "2147483648"
 +    uvm_cpu_chunk_allocation_sizes= "2166784"
 +    uvm_debug_enable_push_acquire_info= "0"
 +    uvm_debug_enable_push_desc= "0"
 +    uvm_debug_prints    = "0"
 +    uvm_disable_hmm     = "Y"
 +    uvm_downgrade_force_membar_sys= "1"
 +    uvm_enable_builtin_tests= "0"
 +    uvm_enable_debug_procfs= "0"
 +    uvm_enable_va_space_mm= "1"
 +    uvm_exp_gpu_cache_peermem= "0"
 +    uvm_exp_gpu_cache_sysmem= "0"
 +    uvm_fault_force_sysmem= "0"
 +    uvm_force_prefetch_fault_support= "0"
 +    uvm_global_oversubscription= "1"
 +    uvm_leak_checker    = "0"
 +    uvm_page_table_location= "(null)"
 +    uvm_peer_copy       = "phys"
 +    uvm_perf_access_counter_batch_count= "256"
 +    uvm_perf_access_counter_migration_enable= "-1"
 +    uvm_perf_access_counter_threshold= "256"
 +    uvm_perf_fault_batch_count= "256"
 +    uvm_perf_fault_coalesce= "1"
 +    uvm_perf_fault_max_batches_per_service= "20"
 +    uvm_perf_fault_max_throttle_per_service= "5"
 +    uvm_perf_fault_replay_policy= "2"
 +    uvm_perf_fault_replay_update_put_ratio= "50"
 +    uvm_perf_map_remote_on_eviction= "1"
 +    uvm_perf_map_remote_on_native_atomics_fault= "0"
 +    uvm_perf_migrate_cpu_preunmap_block_order= "2"
 +    uvm_perf_migrate_cpu_preunmap_enable= "1"
 +    uvm_perf_pma_batch_nonpinned_order= "6"
 +    uvm_perf_prefetch_enable= "1"
 +    uvm_perf_prefetch_min_faults= "1"
 +    uvm_perf_prefetch_threshold= "51"
 +    uvm_perf_reenable_prefetch_faults_lapse_msec= "1000"
 +    uvm_perf_thrashing_enable= "1"
 +    uvm_perf_thrashing_epoch= "2000"
 +    uvm_perf_thrashing_lapse_usec= "500"
 +    uvm_perf_thrashing_max_resets= "4"
 +    uvm_perf_thrashing_nap= "1"
 +    uvm_perf_thrashing_pin= "300"
 +    uvm_perf_thrashing_pin_threshold= "10"
 +    uvm_perf_thrashing_threshold= "3"
 +    uvm_release_asserts = "1"
 +    uvm_release_asserts_dump_stack= "0"
 +    uvm_release_asserts_set_global_error= "0"
 +</code>
 +
 +The RTX 5060 Ti crash occurs later when ''options nvidia_uvm uvm_disable_hmm=1'' is set.
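
A minimal sketch of how that option could be made persistent and then checked, assuming a standard ''/etc/modprobe.d'' setup. The file name ''nvidia-uvm-hmm.conf'' and both helper names are hypothetical; the ''Y'' value matches the ''systool'' output above.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe.d line that disables HMM in nvidia_uvm.
uvm_hmm_conf_line() {
    printf 'options nvidia_uvm uvm_disable_hmm=1\n'
}

# Hypothetical helper: read a parameter of the loaded nvidia_uvm module from sysfs.
# $1 = parameter name, $2 = sysfs root (defaults to /sys; overridable for testing).
uvm_param() {
    param_file="${2:-/sys}/module/nvidia_uvm/parameters/$1"
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unavailable"
    fi
}

# Usage (as root; the target file name is an assumption):
#   uvm_hmm_conf_line > /etc/modprobe.d/nvidia-uvm-hmm.conf
# then reload the module or reboot, and check:
#   uvm_param uvm_disable_hmm     # "Y" when HMM is disabled
```

If the NVIDIA modules are loaded from the initramfs, the initramfs may also need to be regenerated before the option takes effect.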
  
 ==== RTX series ====
Line 99: Line 269:
 Ticket opened with Nvidia: [[https://github.com/NVIDIA/open-gpu-kernel-modules/issues/974#issuecomment-3627087138|kgspBootstrap_GH100: GSP-FMC reported an error while attempting to boot GSP]]
  
 +I bought a certified Thunderbolt cable (€50) to replace the one supplied with the eGPU. **It seems to work better, but it still crashes easily**: ''kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: GPU lost from the bus [NV_ERR_GPU_IS_LOST] (0x0000000F) returned from pRmApi->Control() ... NVRM: nvGpuOpsReportFatalError: uvm encountered global fatal error 0x60, requiring os reboot to recover ...''
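
To spot these failures quickly after a crash, the kernel log can be filtered for the messages quoted above. This is only a sketch: the function name ''nvrm_fatal_lines'' is hypothetical, and the patterns are taken from the log excerpt on this page, so other driver versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines reporting a lost GPU
# or a UVM global fatal error. Reads log text on stdin.
nvrm_fatal_lines() {
    grep -E 'NV_ERR_GPU_IS_LOST|GPU lost from the bus|uvm encountered global fatal error'
}

# Usage, on the kernel messages of the current boot:
#   journalctl -k -b | nvrm_fatal_lines
```
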
 === nvidia-kkms-565 ===
  
Line 190: Line 360:
 === With the Wikingoo WKGL17-C50 bridge ===
  
-With some models there is a "[[/informatique/ai_coding/gpu_bench|CUDA Error]]", and the logs show:
 +With some models there is a "[[informatique:ai_lm:gpu_bench|CUDA Error]]", and the logs show:
  
 <code>
informatique/egpu.1765203908.txt.gz · Last modified: by cyrille

Unless otherwise noted, the content of this wiki is licensed under the following license: CC0 1.0 Universal