CUDA

From HPCWIKI
Revision as of 15:13, 19 March 2023 by Admin (talk | contribs)

Compile CUDA code

When you compile CUDA code, pass a single ‘-arch’ flag that matches the GPU you use most. Code for that GPU is then generated at compile time, so the program does not pay a JIT compilation cost when it first loads.

If you specify only ‘-gencode’ and omit ‘-arch’, and none of the ‘-gencode’ targets contains machine code for the GPU in use, the CUDA driver generates the GPU code at load time by JIT-compiling the embedded PTX.
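The relationship between the two flags can be illustrated with a small sketch. It assumes the equivalence described in the nvcc documentation, where ‘-arch=sm_XX’ is shorthand for generating both machine code (sm_XX) and PTX (compute_XX) for the same architecture. The helper names `arch_shorthand` and `arch_expanded` and the file name `kernel.cu` are hypothetical and used only for illustration.

```python
from typing import List


def arch_shorthand(cc: str) -> List[str]:
    """Single-flag form, e.g. cc='80' gives ['-arch=sm_80']."""
    return [f"-arch=sm_{cc}"]


def arch_expanded(cc: str) -> List[str]:
    """Long form that nvcc treats as equivalent to -arch=sm_<cc>:
    machine code for sm_<cc> plus PTX for compute_<cc>."""
    return ["-gencode", f"arch=compute_{cc},code=[sm_{cc},compute_{cc}]"]


if __name__ == "__main__":
    # kernel.cu is a placeholder source file name (assumption).
    for flags in (arch_shorthand("80"), arch_expanded("80")):
        print(" ".join(["nvcc", *flags, "kernel.cu", "-o", "kernel"]))
```

When pasting the expanded form into a shell, the bracketed value may need quoting so the shell does not interpret it.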

To speed up CUDA compilation, reduce the number of ‘-gencode’ flags to the architectures you actually target. Conversely, if you need the binary to run on a wider range of GPU generations, add more ‘-gencode’ flags.
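As a rough illustration of this trade-off, the following sketch builds a ‘-gencode’ list from the compute capabilities you care about. The function name `gencode_flags` is hypothetical, and the sketch assumes capabilities are given as plain numeric strings such as "70" or "86". Adding PTX for the newest listed architecture is one common way to let the driver JIT-compile for GPUs newer than any listed target; it is an assumption of this sketch, not something the page prescribes.

```python
def gencode_flags(ccs, ptx_for_newest=True):
    """Build nvcc -gencode flags.

    ccs: iterable of numeric compute-capability strings, e.g. ["70", "80"].
    Emits machine code (SASS) for each capability and, optionally,
    PTX for the newest one so later GPUs can JIT-compile it.
    """
    ordered = sorted(set(ccs), key=int)
    flags = []
    for cc in ordered:
        flags += ["-gencode", f"arch=compute_{cc},code=sm_{cc}"]
    if ptx_for_newest and ordered:
        newest = ordered[-1]
        flags += ["-gencode", f"arch=compute_{newest},code=compute_{newest}"]
    return flags


if __name__ == "__main__":
    # Fewer targets: faster compile, narrower GPU coverage.
    print(" ".join(gencode_flags(["80"], ptx_for_newest=False)))
    # More targets: slower compile, wider GPU coverage.
    print(" ".join(gencode_flags(["70", "75", "80", "86"])))
```

Each extra capability adds another code generation pass, which is why trimming the list shortens build times.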


CUDA Compatibility

CUDA Version  cuDNN Version  NCCL Version  NVIDIA GPU Driver Version  Compute Capability Support
CUDA 12.0
CUDA 11.8
CUDA 11.7
CUDA 11.5     8.3.x          2.10.x        510.39 or later            3.0 to 8.6
CUDA 11.4     8.2.x          2.10.x        470.42.01 or later         3.0 to 8.6
CUDA 11.3     8.2.x          2.10.x        465.19.01 or later         3.0 to 8.6
CUDA 11.2     8.1.x          2.9.x         460.32.03 or later         3.0 to 8.6
CUDA 11.1     8.0.x          2.9.x         455.23.04 or later         3.0 to 8.6
CUDA 11.0     7.6.x          2.8.x         450.36.06 or later         3.0 to 8.6
CUDA 10.2     7.6.x          2.7.x         440.33 or later            3.0 to 7.5
CUDA 10.1     7.6.x          2.4.x         418.39 or later            3.0 to 7.5
CUDA 10.0     7.4.x          2.2.x         410.48 or later            3.0 to 7.5
CUDA 9.2      7.2.x          2.1.x         396.26 or later            3.0 to 7.5
CUDA 9.1      7.1.x          2.0.x         390.46 or later            3.0 to 7.5
CUDA 9.0      7.0.x          1.3.x         384.81 or later            3.0 to 7.5
CUDA 8.0      6.0.x          1.3.x         375.26 or later            2.0 to 6.2
CUDA 7.5      5.1.x          1.3.x         352.31 or later            2.0 to 5.2
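
The driver column above can be used as a simple lookup. The sketch below copies the minimum driver versions exactly as listed in this table and compares them with an installed driver version. The names `MIN_DRIVER`, `parse_version`, and `driver_ok` are hypothetical, and the rows without values (CUDA 12.0, 11.8, 11.7) are deliberately left out rather than guessed.

```python
# Minimum driver versions, copied from the table above as given.
MIN_DRIVER = {
    "11.5": "510.39",
    "11.4": "470.42.01",
    "11.3": "465.19.01",
    "11.2": "460.32.03",
    "11.1": "455.23.04",
    "11.0": "450.36.06",
    "10.2": "440.33",
    "10.1": "418.39",
    "10.0": "410.48",
    "9.2": "396.26",
    "9.1": "390.46",
    "9.0": "384.81",
    "8.0": "375.26",
    "7.5": "352.31",
}


def parse_version(version):
    """Turn '470.42.01' into (470, 42, 1) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def driver_ok(cuda_version, driver_version):
    """Return True/False if the table lists a minimum driver for
    cuda_version, or None if the table has no entry for it."""
    minimum = MIN_DRIVER.get(cuda_version)
    if minimum is None:
        return None
    return parse_version(driver_version) >= parse_version(minimum)


if __name__ == "__main__":
    print(driver_ok("11.4", "470.57.02"))
    print(driver_ok("11.4", "460.32.03"))
```

The installed driver version can be read, for example, with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed in as the second argument.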