Compiling for a GPU

A GPU can accelerate a program, but the code must be written and compiled specifically for it. Several options are available for building GPU-enabled programs.

OpenACC

OpenACC is a directive-based standard for offloading computation to accelerators such as GPUs.

Available NVIDIA CUDA Compilers

Module	Version	Module Load Command
cuda	11.4.2	module load cuda/11.4.2
cuda	11.8.0	module load cuda/11.8.0
cuda	12.2.2	module load cuda/12.2.2
cuda	12.4.1	module load cuda/12.4.1
cuda	12.8.0	module load cuda/12.8.0

Module	Version	Module Load Command
nvhpc	24.1	module load nvhpc/24.1
nvhpc	24.5	module load nvhpc/24.5
nvhpc	25.3	module load nvhpc/25.3

GPU architecture

According to the CUDA documentation, “in the CUDA naming scheme, GPUs are named sm_xy, where x denotes the GPU generation number, and y the version in that generation.” The documentation contains details about the architecture and the corresponding xy value. The compute capability is x.y.
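The mapping from compute capability to architecture name can be sketched in a few lines of shell. The function name `sm_name` is a hypothetical helper introduced here for illustration; it is not part of CUDA or of the modules on the system.

```shell
# Hypothetical helper: convert a compute capability "x.y" into its sm_xy name.
sm_name() {
    # Drop the dot, so 8.6 becomes 86, then add the sm_ prefix.
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

sm_name 7.0   # V100
sm_name 8.6   # A40, A6000, RTX3090
```

The helper only rewrites the string; it does not check that the resulting architecture exists or is supported by the loaded CUDA version.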

Please use the following values when compiling CUDA code on the HPC system.

Type	GPU	Architecture	Compute Capability	Minimum CUDA Version
Datacenter	V100	Volta	7.0	9+
Datacenter	A100	Ampere	8.0	11+
Datacenter	A40	Ampere	8.6	11+
Datacenter	H200	Hopper	9.0	11.8+
RTX	A6000	Ampere	8.6	11+
GeForce	RTX2080Ti	Turing	7.5	10+
GeForce	RTX3090	Ampere	8.6	11+

For example, to target only the V100 and A100, pass these flags to nvcc:

-gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80
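When several GPU types are targeted, the flags can be generated from the table above rather than typed by hand. The following is a minimal sketch; the function names `gpu_xy` and `gencode_flags` are hypothetical helpers, and the lookup values are copied from the Compute Capability column.

```shell
# Hypothetical helper: look up the xy digits for a GPU model from the table above.
gpu_xy() {
    case "$1" in
        V100) echo 70 ;;
        RTX2080Ti) echo 75 ;;
        A100) echo 80 ;;
        A40|A6000|RTX3090) echo 86 ;;
        H200) echo 90 ;;
        *) echo "unknown GPU: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build one -gencode flag per requested GPU model.
# Models that share a compute capability are not deduplicated in this sketch.
gencode_flags() {
    flags=""
    for gpu in "$@"; do
        xy=$(gpu_xy "$gpu") || return 1
        flags="$flags -gencode arch=compute_${xy},code=sm_${xy}"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

gencode_flags V100 A100
```

The output could then be passed to the compiler, for instance as nvcc $(gencode_flags V100 A100) -o my_app my_app.cu, where my_app.cu stands in for your own source file.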

Updated April 23, 2019 | compiler, gpu, rivanna, software
