CUDA
Compile CUDA code
When you compile CUDA code, pass a single ‘-arch’ flag that matches the GPU cards you use most. This gives faster startup at runtime, because the GPU machine code is generated during compilation rather than when the program loads.
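If the binary contains no machine code for the GPU it runs on, for example because your ‘-gencode’ flags embed only PTX (‘code=compute_XX’) or only target other architectures, the GPU code generation is done at load time by the JIT compiler in the CUDA driver.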
To speed up CUDA compilation, remove the ‘-gencode’ flags for architectures you do not need. Conversely, adding more ‘-gencode’ flags lets one binary run on a wider range of GPU generations, at the cost of longer compile times and a larger binary.
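As a rough illustration of the trade-off above, the following Python sketch builds a list of ‘-gencode’ flags from a list of compute capabilities. The helper name `gencode_flags`, its `ptx_fallback` parameter, and the file name `kernel.cu` are hypothetical and only serve the example; the flag syntax itself (`arch=compute_XX,code=sm_XX` for machine code, `code=compute_XX` for PTX) is the one nvcc accepts.

```python
def gencode_flags(capabilities, ptx_fallback=True):
    """Build nvcc -gencode flags for compute capabilities such as ["7.5", "8.6"].

    Hypothetical helper for illustration; it only formats strings and
    does not check that the installed toolkit supports each architecture.
    """
    # Normalize "8.6" -> "86", drop duplicates, and sort so the newest is last.
    archs = sorted({cc.replace(".", "") for cc in capabilities}, key=int)

    flags = []
    for arch in archs:
        # code=sm_XX embeds machine code generated at compile time.
        flags.append(f"-gencode arch=compute_{arch},code=sm_{arch}")

    if ptx_fallback and archs:
        newest = archs[-1]
        # code=compute_XX embeds PTX, which the driver can JIT-compile
        # on GPUs newer than any architecture listed above.
        flags.append(f"-gencode arch=compute_{newest},code=compute_{newest}")
    return flags


# Example: target Turing (7.5) and Ampere (8.6), plus PTX for newer GPUs.
print("nvcc " + " ".join(gencode_flags(["8.6", "7.5"])) + " kernel.cu -o kernel")
```

Each extra capability in the list adds one more compilation pass, so keeping the list short is what shortens the build.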
CUDA Compatibility
CUDA Version | cuDNN Version | NCCL Version | NVIDIA GPU Driver Version | Compute Capability Support |
---|---|---|---|---|
CUDA 12.0 | | | | |
CUDA 11.8 | | | | |
CUDA 11.7 | | | | |
CUDA 11.5 | 8.3.x | 2.10.x | 495.29.05 or later | Compute Capability 3.5 to 8.6 |
CUDA 11.4 | 8.2.x | 2.10.x | 470.42.01 or later | Compute Capability 3.5 to 8.6 |
CUDA 11.3 | 8.2.x | 2.10.x | 465.19.01 or later | Compute Capability 3.5 to 8.6 |
CUDA 11.2 | 8.1.x | 2.9.x | 460.32.03 or later | Compute Capability 3.5 to 8.6 |
CUDA 11.1 | 8.0.x | 2.9.x | 455.23.04 or later | Compute Capability 3.5 to 8.6 |
CUDA 11.0 | 7.6.x | 2.8.x | 450.36.06 or later | Compute Capability 3.5 to 8.0 |
CUDA 10.2 | 7.6.x | 2.7.x | 440.33 or later | Compute Capability 3.0 to 7.5 |
CUDA 10.1 | 7.6.x | 2.4.x | 418.39 or later | Compute Capability 3.0 to 7.5 |
CUDA 10.0 | 7.4.x | 2.2.x | 410.48 or later | Compute Capability 3.0 to 7.5 |
CUDA 9.2 | 7.2.x | 2.1.x | 396.26 or later | Compute Capability 3.0 to 7.2 |
CUDA 9.1 | 7.1.x | 2.0.x | 390.46 or later | Compute Capability 3.0 to 7.2 |
CUDA 9.0 | 7.0.x | 1.3.x | 384.81 or later | Compute Capability 3.0 to 7.0 |
CUDA 8.0 | 6.0.x | 1.3.x | 375.26 or later | Compute Capability 2.0 to 6.2 |
CUDA 7.5 | 5.1.x | 1.3.x | 352.31 or later | Compute Capability 2.0 to 5.2 |
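
One way to use the driver column of the table is to compare an installed driver version against the minimum listed for a CUDA version. The sketch below is only an illustration: the names `MIN_DRIVER`, `parse_version`, and `driver_supports` are hypothetical, and the dictionary copies just three rows from the table rather than the whole thing.

```python
# Minimum driver versions, copied from three rows of the table above.
MIN_DRIVER = {
    "11.4": "470.42.01",
    "11.0": "450.36.06",
    "10.2": "440.33",
}


def parse_version(text):
    """Turn a dotted version such as "470.42.01" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(cuda_version, driver_version):
    """Return True if driver_version meets the table's minimum for cuda_version.

    Raises KeyError for CUDA versions that are not in MIN_DRIVER.
    """
    minimum = MIN_DRIVER[cuda_version]
    # Tuples compare element by element, so 470.57.02 >= 470.42.01 holds.
    return parse_version(driver_version) >= parse_version(minimum)


# Example: a 470.57.02 driver is new enough for CUDA 11.4, a 465.19.01 one is not.
print(driver_supports("11.4", "470.57.02"))
print(driver_supports("11.4", "465.19.01"))
```

Comparing parsed tuples rather than raw strings avoids mistakes such as treating "440.100" as older than "440.33".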