NVIDIA CUDA : Install (2024/02/21)
Install CUDA (Compute Unified Device Architecture), the GPGPU (General-Purpose computing on Graphics Processing Units) platform for GPU computing on NVIDIA graphics cards.
[1] Run PowerShell with administrator privileges for all of the following steps. First, install a C++ compiler.
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

PS C:\Users\Administrator> Invoke-WebRequest -Uri https://aka.ms/vs/17/release/vs_BuildTools.exe -OutFile "vs_BuildTools.exe"

# install in silent mode
PS C:\Users\Administrator> ./vs_BuildTools.exe `
--add Microsoft.Component.MSBuild `
--add Microsoft.VisualStudio.Component.CoreBuildTools `
--add Microsoft.VisualStudio.Component.VC.CoreBuildTools `
--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 `
--add Microsoft.VisualStudio.Component.VC.Redist.14.Latest `
--add Microsoft.VisualStudio.Component.VC.CoreIde `
--add Microsoft.VisualStudio.Component.Windows11SDK.22621 `
--add Microsoft.VisualStudio.ComponentGroup.NativeDesktop.Core `
--add Microsoft.VisualStudio.Workload.MSBuildTools `
--add Microsoft.VisualStudio.Workload.VCTools `
--includeRecommended --quiet --wait

# the installer processes start
PS C:\Users\Administrator> Get-Process -Name "vs_*", "setup*"

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    369      17     3460      15508       0.66   3132   0 vs_BuildTools
    890      64    30168      61180       8.03   4048   0 vs_setup_bootstrapper

# the installation is complete once the processes above have exited (no output)
PS C:\Users\Administrator> Get-Process -Name "vs_*"

# locate the C++ compiler
PS C:\Users\Administrator> Get-ChildItem "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\*\bin\Hostx64\x64\cl.exe"

    Directory: C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.39.33519\bin\Hostx64\x64

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         2/20/2024   5:00 PM         843248 cl.exe

# add the compiler directory to the machine-wide Path
# (use the version directory found above; 14.39.33519 is the one from this example)
PS C:\Users\Administrator> $currentPath = [Environment]::GetEnvironmentVariable("Path", "Machine")
PS C:\Users\Administrator> $currentPath += ";C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.39.33519\bin\Hostx64\x64"
PS C:\Users\Administrator> [Environment]::SetEnvironmentVariable("Path", $currentPath, "Machine")

# reload the environment variables in the current session
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")

PS C:\Users\Administrator> cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33519 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]
[2] Download and install CUDA. Check the site below for the version you want to install, then download it.
⇒ https://developer.nvidia.com/cuda-toolkit-archive
PS C:\Users\Administrator> Invoke-WebRequest -Uri "https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda_12.3.2_546.12_windows.exe" -OutFile "cuda_12.3.2_546.12_windows.exe"

# install in silent mode
PS C:\Users\Administrator> ./cuda_12.3.2_546.12_windows.exe -s

# the installer processes start
PS C:\Users\Administrator> Get-Process -Name "cuda*", "setup*"

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    318      20     2912      15564     149.67   3972   0 cuda_12.3.2_546.12_windows
    471      30    36488      63616     158.52   3524   0 setup

# the installation is complete once the processes above have exited (no output)
PS C:\Users\Administrator> Get-Process -Name "cuda*", "setup*"

# reload the environment variables in the current session
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")

PS C:\Users\Administrator> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:30:42_Pacific_Standard_Time_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0

PS C:\Users\Administrator> nvidia-smi
Tue Feb 20 17:39:49 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.12                 Driver Version: 546.12       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060      WDDM  | 00000000:07:00.0 Off |                  N/A |
|  0%   40C    P8               9W / 170W |     21MiB / 12288MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A       800    C+G   C:\Windows\System32\dwm.exe               N/A      |
|    0   N/A  N/A       964    C+G   C:\Windows\System32\LogonUI.exe           N/A      |
+---------------------------------------------------------------------------------------+
[3] Download the sample programs and verify that CUDA works.
PS C:\Users\Administrator> Invoke-WebRequest -Uri "https://github.com/NVIDIA/cuda-samples/archive/refs/heads/master.zip" -OutFile "master.zip"
PS C:\Users\Administrator> Expand-Archive -Path ./master.zip
PS C:\Users\Administrator> cd ./master/cuda-samples-master/Samples/1_Utilities/deviceQuery
PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> nvcc -I ../../../Common deviceQuery.cpp -o deviceQuery
deviceQuery.cpp
   Creating library deviceQuery.lib and object deviceQuery.exp

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> ./deviceQuery.exe
C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3060"
  CUDA Driver Version / Runtime Version          12.3 / 12.3
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 12288 MBytes (12884377600 bytes)
  (028) Multiprocessors, (128) CUDA Cores/MP:    3584 CUDA Cores
  GPU Max Clock rate:                            1777 MHz (1.78 GHz)
  Memory Clock rate:                             7501 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 2359296 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        102400 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 5 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 7 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.3, CUDA Runtime Version = 12.3, NumDevs = 1
Result = PASS

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> cd ~/master/cuda-samples-master/Samples/1_Utilities/bandwidthTest
PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\bandwidthTest> nvcc -I ../../../Common bandwidthTest.cu -o bandwidthTest
bandwidthTest.cu
tmpxft_000015a0_00000000-10_bandwidthTest.cudafe1.cpp
   Creating library bandwidthTest.lib and object bandwidthTest.exp

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\bandwidthTest> ./bandwidthTest.exe
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: NVIDIA GeForce RTX 3060
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     12.0

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     12.8

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     324.4

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.