Compiling for a GPU
Running on a GPU can accelerate a program, but it requires GPU-specific programming and compilation. Several options are available for building GPU-enabled programs.
OpenACC
OpenACC is a standard
Available NVIDIA CUDA Compilers
Module | Version | Module Load Command |
cuda | 10.2.89 | module load cuda/10.2.89 |
cuda | 11.4.2 | module load cuda/11.4.2 |
cuda | 11.8.0 | module load cuda/11.8.0 |
cuda | 12.2.2 | module load cuda/12.2.2 |
cuda | 12.4.1 | module load cuda/12.4.1 |

Module | Version | Module Load Command |
nvhpc | 24.1 | module load nvhpc/24.1 |
nvhpc | 24.5 | module load nvhpc/24.5 |
GPU architecture
According to the CUDA documentation, “in the CUDA naming scheme, GPUs are named sm_xy, where x denotes the GPU generation number, and y the version in that generation.” The documentation contains details about the architecture and the corresponding xy value. The compute capability is x.y.
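The relationship between a compute capability x.y and its sm_xy name can be sketched in a few lines of shell. This is only an illustration: cc_to_sm is a hypothetical helper name introduced here and is not part of the CUDA toolkit.

```shell
# Convert a compute capability such as "8.6" into the matching sm_xy name.
# cc_to_sm is a hypothetical helper, not a CUDA or module command.
cc_to_sm() {
    # Remove the dot between generation (x) and version (y): 8.6 -> 86
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

cc_to_sm 7.0   # prints sm_70
cc_to_sm 8.6   # prints sm_86
```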
Please use the following values when compiling CUDA code on the HPC system.
Type | GPU | Architecture | Compute Capability | CUDA Version |
Datacenter | V100 | Volta | 7.0 | 9+ |
Datacenter | A100 | Ampere | 8.0 | 11+ |
Datacenter | A40 | Ampere | 8.6 | 11+ |
RTX | A6000 | Ampere | 8.6 | 11+ |
GeForce | RTX2080Ti | Turing | 7.5 | 10+ |
GeForce | RTX3090 | Ampere | 8.6 | 11+ |
For example, to generate code only for the V100 and A100, pass the following flags to nvcc:
-gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80
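When several GPU types are targeted, the flag list can be generated rather than typed by hand. The sketch below assumes the flag pattern shown above; gencode_flags is a hypothetical helper name, and saxpy.cu in the commented usage line is a placeholder source file.

```shell
# Build one -gencode flag per compute capability given as an argument.
# gencode_flags is a hypothetical helper, not part of nvcc.
gencode_flags() {
    flags=""
    for cc in "$@"; do
        # 7.0 -> 70, 8.0 -> 80
        xy=$(printf '%s' "$cc" | tr -d '.')
        flags="$flags -gencode arch=compute_${xy},code=sm_${xy}"
    done
    # Strip the leading space left by the first concatenation.
    printf '%s\n' "${flags# }"
}

gencode_flags 7.0 8.0
# prints: -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80

# Possible use after loading a cuda module (saxpy.cu is a placeholder):
#   nvcc $(gencode_flags 7.0 8.0) -o saxpy saxpy.cu
```

Adding another GPU from the table above only requires appending its compute capability to the argument list, for example 8.6 for the A40, A6000, or RTX3090.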