CUDA

Compile CUDA code

When you compile CUDA code, pass a single '-arch' flag that matches the GPU you use most. This avoids a delay when the program starts, because the GPU machine code is generated during compilation instead of when the program is first loaded.

If you pass only '-gencode' flags that embed PTX (for example 'code=compute_XX') and the binary contains no machine code for the GPU it runs on, GPU code generation is done at load time by the JIT compiler in the CUDA driver.

To speed up CUDA compilation, reduce the number of irrelevant '-gencode' flags. However, you may sometimes want your build to run on a wider range of GPU generations, including older ones, and that calls for a more comprehensive set of '-gencode' flags.
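The trade-off above can be sketched with a small helper that builds the '-gencode' flags for a chosen list of compute capabilities. This is only an illustration: the function name gencode_flags and the file names kernel.cu and app are assumptions, not part of any CUDA tool. It emits one machine-code target per listed architecture, plus PTX for the newest one so that the driver can JIT-compile for later GPUs.

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode flags for compute capabilities such as ["7.0", "8.6"].

    Hypothetical helper for illustration; it only assembles strings and does
    not check that the installed toolkit supports the requested architectures.
    """
    if not capabilities:
        raise ValueError("at least one compute capability is required")

    # Sort numerically and drop duplicates, so "10.0" would sort after "8.6".
    caps = sorted(set(capabilities),
                  key=lambda c: tuple(int(p) for p in c.split(".")))

    flags = []
    for cap in caps:
        num = cap.replace(".", "")
        # Machine code for this architecture, generated at compile time.
        flags += ["-gencode", f"arch=compute_{num},code=sm_{num}"]

    # PTX for the newest listed architecture, JIT-compiled by the driver
    # on GPUs that have no matching machine code in the binary.
    newest = caps[-1].replace(".", "")
    flags += ["-gencode", f"arch=compute_{newest},code=compute_{newest}"]
    return flags


if __name__ == "__main__":
    # kernel.cu and app are placeholder names.
    print(" ".join(["nvcc", *gencode_flags(["7.0", "8.6"]), "kernel.cu", "-o", "app"]))
```

Each extra capability in the list adds one more code generation pass, which is why a shorter list compiles faster.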

CUDA Compatibility

CUDA Version	cuDNN Version	NCCL Version	NVIDIA GPU Driver Version	Compute Capability Support
CUDA 12.0
CUDA 11.8
CUDA 11.7
CUDA 11.5	8.3.x	2.10.x	510.39 or later	Compute Capability 3.5 to 8.6
CUDA 11.4	8.2.x	2.10.x	470.42.01 or later	Compute Capability 3.5 to 8.6
CUDA 11.3	8.2.x	2.10.x	465.19.01 or later	Compute Capability 3.5 to 8.6
CUDA 11.2	8.1.x	2.9.x	460.32.03 or later	Compute Capability 3.5 to 8.6
CUDA 11.1	8.0.x	2.9.x	455.23.04 or later	Compute Capability 3.5 to 8.6
CUDA 11.0	7.6.x	2.8.x	450.36.06 or later	Compute Capability 3.5 to 8.0
CUDA 10.2	7.6.x	2.7.x	440.33 or later	Compute Capability 3.0 to 7.5
CUDA 10.1	7.6.x	2.4.x	418.39 or later	Compute Capability 3.0 to 7.5
CUDA 10.0	7.4.x	2.2.x	410.48 or later	Compute Capability 3.0 to 7.5
CUDA 9.2	7.2.x	2.1.x	396.26 or later	Compute Capability 3.0 to 7.5
CUDA 9.1	7.1.x	2.0.x	390.46 or later	Compute Capability 3.0 to 7.5
CUDA 9.0	7.0.x	1.3.x	384.81 or later	Compute Capability 3.0 to 7.5
CUDA 8.0	6.0.x	1.3.x	375.26 or later	Compute Capability 2.0 to 6.2
CUDA 7.5	5.1.x	1.3.x	352.31 or later	Compute Capability 2.0 to 5.2
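
The table can be used as a lookup when deciding whether a toolkit fits a given machine. The sketch below copies three rows from it (CUDA 10.2, 8.0 and 7.5) and checks a driver version and a compute capability against them. The names TOOLKITS, as_tuple and is_supported are made up for this example, and the data is only as accurate as the table itself.

```python
# Rows copied from the table above:
# toolkit -> (minimum driver, lowest compute capability, highest compute capability)
TOOLKITS = {
    "10.2": ("440.33", "3.0", "7.5"),
    "8.0": ("375.26", "2.0", "6.2"),
    "7.5": ("352.31", "2.0", "5.2"),
}


def as_tuple(version):
    """Turn a dotted version such as "440.33" into (440, 33) for comparison."""
    return tuple(int(part) for part in version.split("."))


def is_supported(toolkit, driver, capability):
    """Return True if the driver is new enough and the GPU is in range."""
    min_driver, low, high = TOOLKITS[toolkit]
    driver_ok = as_tuple(driver) >= as_tuple(min_driver)
    gpu_ok = as_tuple(low) <= as_tuple(capability) <= as_tuple(high)
    return driver_ok and gpu_ok


if __name__ == "__main__":
    print(is_supported("10.2", "440.33", "7.5"))  # driver and GPU both in range
    print(is_supported("8.0", "375.26", "7.0"))   # GPU newer than the toolkit supports
```

Comparing versions as integer tuples avoids the mistakes of plain string comparison, where "9.2" would sort after "10.0".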
