NVIDIA CUDA : Install 2024/12/09 |
Install NVIDIA CUDA (Compute Unified Device Architecture). |
|
[1] | Run PowerShell with administrator privileges. First, download and install the C++ compiler (Visual Studio Build Tools). |
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

PS C:\Users\Administrator> Invoke-WebRequest -Uri https://aka.ms/vs/17/release/vs_BuildTools.exe -OutFile "vs_BuildTools.exe"

# install in silent mode
PS C:\Users\Administrator> ./vs_buildtools.exe `
--add Microsoft.Component.MSBuild `
--add Microsoft.VisualStudio.Component.CoreBuildTools `
--add Microsoft.VisualStudio.Component.VC.CoreBuildTools `
--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 `
--add Microsoft.VisualStudio.Component.VC.Redist.14.Latest `
--add Microsoft.VisualStudio.Component.VC.CoreIde `
--add Microsoft.VisualStudio.Component.Windows11SDK.22621 `
--add Microsoft.VisualStudio.ComponentGroup.NativeDesktop.Core `
--add Microsoft.VisualStudio.Workload.MSBuildTools `
--add Microsoft.VisualStudio.Workload.VCTools `
--includeRecommended --quiet --wait

# installation processes are running
PS C:\Users\Administrator> Get-Process -Name "vs_*", "setup*"

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    376      17     3520      16144       0.78   5668   0 vs_BuildTools
    914      66    29176      61820       4.92   6228   0 vs_setup_bootstrapper

# after the installation finishes, the processes above exit
PS C:\Users\Administrator> Get-Process -Name "vs_*"

# C++ compiler is here
PS C:\Users\Administrator> Get-ChildItem "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\*\bin\Hostx64\x64\cl.exe"

    Directory: C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.42.34433\bin\Hostx64\x64

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         12/8/2024   5:51 PM         862792 cl.exe

# add the compiler directory to the machine-wide Path environment variable
PS C:\Users\Administrator> $currentPath = [Environment]::GetEnvironmentVariable("Path", "Machine")
PS C:\Users\Administrator> $currentPath += ";C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.42.34433\bin\Hostx64\x64"
PS C:\Users\Administrator> [Environment]::SetEnvironmentVariable("Path", $currentPath, "Machine")

# reload environment variables
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")
PS C:\Users\Administrator> cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.42.34435 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ] |
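The Path step above hardcodes the toolset version (14.42.34433) and appends the directory every time it is run. The following is a minimal sketch of one way to avoid both, assuming the Build Tools root shown above; the function names Add-PathEntry and Get-LatestMsvcBinDir are hypothetical helpers introduced here for illustration and are not part of any Microsoft tooling.

```powershell
# Hypothetical helper: append an entry to a semicolon-separated Path string
# only if it is not already present (case-insensitive, ignoring a trailing '\').
function Add-PathEntry {
    param([string]$PathValue, [string]$Entry)
    $normalized = $Entry.TrimEnd('\')
    if ([string]::IsNullOrEmpty($PathValue)) { return $normalized }
    $existing = $PathValue -split ';' |
        Where-Object { $_ } |
        ForEach-Object { $_.TrimEnd('\') }
    # -contains compares strings case-insensitively by default
    if ($existing -contains $normalized) { return $PathValue }
    return ($PathValue.TrimEnd(';') + ';' + $normalized)
}

# Hypothetical helper: pick the newest toolset directory under the MSVC root
# (assumed to be the Build Tools location used in this article) and return
# its x64 bin directory. Assumes every directory name parses as a version.
function Get-LatestMsvcBinDir {
    param([string]$MsvcRoot = 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC')
    $latest = Get-ChildItem -Path $MsvcRoot -Directory |
        Sort-Object { [version]$_.Name } |
        Select-Object -Last 1
    if (-not $latest) { throw "No MSVC toolset found under $MsvcRoot" }
    Join-Path $latest.FullName 'bin\Hostx64\x64'
}

# Usage (run in an elevated PowerShell; left commented out on purpose):
# $binDir      = Get-LatestMsvcBinDir
# $machinePath = [Environment]::GetEnvironmentVariable('Path', 'Machine')
# [Environment]::SetEnvironmentVariable('Path', (Add-PathEntry $machinePath $binDir), 'Machine')
```

Keeping the Path manipulation in a function that only transforms strings makes it easy to check the result before writing it back to the machine environment.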
[2] | Download and install CUDA. Check the version of CUDA you'd like to install on the official site below. ⇒ https://developer.nvidia.com/cuda-toolkit-archive |
PS C:\Users\Administrator> Invoke-WebRequest -Uri "https://developer.download.nvidia.com/compute/cuda/12.6.3/local_installers/cuda_12.6.3_561.17_windows.exe" -OutFile "cuda_12.6.3_561.17_windows.exe"

# install in silent mode
PS C:\Users\Administrator> ./cuda_12.6.3_561.17_windows.exe -s

# installation processes are running
PS C:\Users\Administrator> Get-Process -Name "cuda*", "setup*"

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    213      17   757964     760556      23.53   6992   0 cuda_12.6.3_561.17_windows

# after the installation finishes, the processes above exit
PS C:\Users\Administrator> Get-Process -Name "cuda*", "setup*"

# reload environment variables
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")
PS C:\Users\Administrator> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

PS C:\Users\Administrator> nvidia-smi
Sun Dec  8 19:23:04 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 561.17                 Driver Version: 561.17         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060      WDDM  |   00000000:03:00.0 Off |                  N/A |
|  0%   40C    P8              9W /  170W |      18MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1416    C+G   C:\Windows\System32\dwm.exe                 N/A      |
|    0   N/A  N/A      4192    C+G   C:\Windows\explorer.exe                     N/A      |
+-----------------------------------------------------------------------------------------+ |
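The two commands above report different things: nvcc prints the toolkit release, while the "CUDA Version" field of nvidia-smi is the highest CUDA version the installed driver supports. As a rough sketch of how the two could be compared in a script, the helpers below parse both outputs. The names Get-NvccRelease and Get-DriverCudaVersion are hypothetical, and the regular expressions assume the output formats shown above, which may differ in other releases.

```powershell
# Hypothetical helper: extract the toolkit release (e.g. 12.6) from the
# text printed by "nvcc --version".
function Get-NvccRelease {
    param([string[]]$NvccOutput)
    $text = $NvccOutput -join "`n"
    if ($text -match 'release (\d+\.\d+)') { return [version]$Matches[1] }
    throw 'Could not find a "release X.Y" string in the nvcc output'
}

# Hypothetical helper: extract the "CUDA Version" field from the header of
# the nvidia-smi output (the highest version the driver supports).
function Get-DriverCudaVersion {
    param([string[]]$SmiOutput)
    $text = $SmiOutput -join "`n"
    if ($text -match 'CUDA Version:\s*(\d+\.\d+)') { return [version]$Matches[1] }
    throw 'Could not find a "CUDA Version" field in the nvidia-smi output'
}

# Usage (requires nvcc and nvidia-smi on Path; left commented out on purpose):
# $toolkit = Get-NvccRelease (nvcc --version)
# $driver  = Get-DriverCudaVersion (nvidia-smi)
# if ($toolkit -gt $driver) {
#     Write-Warning "Toolkit $toolkit is newer than the driver's supported CUDA version $driver"
# }
```

In the transcript above both values are 12.6, so this check would stay silent.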
[3] | Build and run sample programs to verify the installation. |
PS C:\Users\Administrator> Invoke-WebRequest -Uri "https://github.com/NVIDIA/cuda-samples/archive/refs/heads/master.zip" -OutFile "master.zip"
PS C:\Users\Administrator> Expand-Archive -Path ./master.zip
PS C:\Users\Administrator> cd ./master/cuda-samples-master/Samples/1_Utilities/deviceQuery
PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> nvcc -I ../../../Common deviceQuery.cpp -o deviceQuery
deviceQuery.cpp
   Creating library deviceQuery.lib and object deviceQuery.exp

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> ./deviceQuery.exe
C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3060"
  CUDA Driver Version / Runtime Version          12.6 / 12.6
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 12288 MBytes (12884377600 bytes)
  (028) Multiprocessors, (128) CUDA Cores/MP:    3584 CUDA Cores
  GPU Max Clock rate:                            1777 MHz (1.78 GHz)
  Memory Clock rate:                             7501 Mhz
  Memory Bus Width:                              192-bit
  L2 Cache Size:                                 2359296 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        102400 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 12.6, NumDevs = 1
Result = PASS

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\deviceQuery> cd ~/master/cuda-samples-master/Samples/1_Utilities/bandwidthTest
PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\bandwidthTest> nvcc -I ../../../Common bandwidthTest.cu -o bandwidthTest
bandwidthTest.cu
tmpxft_00001170_00000000-10_bandwidthTest.cudafe1.cpp
   Creating library bandwidthTest.lib and object bandwidthTest.exp

PS C:\Users\Administrator\master\cuda-samples-master\Samples\1_Utilities\bandwidthTest> ./bandwidthTest.exe
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: NVIDIA GeForce RTX 3060
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     12.3

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     13.0

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)        Bandwidth(GB/s)
   32000000                     320.6

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled. |
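Both samples above end with a "Result = PASS" line, so a script can check for that string instead of reading the whole output by eye. The following is a small sketch of that idea; Test-SamplePassed is a hypothetical helper name, and it assumes only that the sample prints the same "Result = PASS" text seen in the transcripts above.

```powershell
# Hypothetical helper: returns $true when the captured output of a CUDA
# sample contains the "Result = PASS" text printed on success.
function Test-SamplePassed {
    param([string[]]$Output)
    [bool]($Output | Where-Object { $_ -match 'Result = PASS' })
}

# Usage (run from a sample directory after building it with nvcc;
# left commented out on purpose):
# $out = & .\deviceQuery.exe
# if (-not (Test-SamplePassed $out)) { throw 'deviceQuery did not report PASS' }
```

The check is deliberately loose: it looks only for the PASS marker and does not interpret the bandwidth figures, which the samples themselves note are not meant for performance measurement.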
|