
Message Passing Interface (MPI) and UVA HPC

Overview

MPI stands for Message Passing Interface. The MPI standard is defined by the Message Passing Interface Forum. The standard defines the interface for a set of functions that can be used to pass messages between processes on the same computer or on different computers. MPI can be used to program shared memory or distributed memory computers. There is a large number of implementations of MPI from various computer vendors and academic groups. MPI is supported on the HPC clusters.

MPI On the HPC System

MPI is a standard that describes the behavior of a library. It is intended to be used with compiled languages (C/C++/Fortran). Several implementations of this standard exist. UVA HPC supports OpenMPI for all our compilers and IntelMPI for the Intel compiler. MPI can also be used with the interpreted languages R and Python through packages that link to an implementation; on the HPC system these languages use OpenMPI.

Selecting Compiler and Implementation

An MPI implementation must be built with a specific compiler. Consequently, only compilers for which an MPI build has been prepared can be used with it. All versions of the Intel compiler have a corresponding IntelMPI. To list the available versions of OpenMPI, run

module spider openmpi

This will respond with the versions of OpenMPI available. To see which version goes with which compiler, run

module spider openmpi/<version>

For example:

module spider

Example output:

You will need to load all module(s) on any one of the lines below before the
"" module is available to load.
   gcc/11.4.0

This shows that the queried OpenMPI version is available for gcc 11.4.0.

Once a choice of compiler and MPI implementation has been made, the modules must be loaded. First load the compiler, then the MPI. For instance, to use OpenMPI with gcc 11.4.0, run

module load gcc/11.4.0
module load openmpi

To load the Intel compiler version and its IntelMPI version, run

module load intel

However, for Intel 18.0, run:

module load intel/18.0
module load intelmpi/18.0

It is also possible to combine these into one line, as long as the compiler is specified first. Note that this can result in errors if you are not using the default compiler.

module load gcc openmpi

For a detailed description of building and running MPI codes on the HPC system, please see our HowTo.
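After loading a compiler and an MPI module, it can be useful to confirm that the MPI tools are actually on your PATH before building. The following is a minimal sketch, not part of the official instructions: `check_tools` is a hypothetical helper name, and it assumes the loaded OpenMPI module provides the usual `mpicc` and `mpirun` commands.

```shell
#!/bin/bash
# Hypothetical helper: report whether each named command is on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use after, for example:
#   module load gcc/11.4.0
#   module load openmpi
check_tools mpicc mpirun || echo "load a compiler and an MPI module first"
```

If either tool is reported missing, the most likely cause is that the MPI module was loaded without its matching compiler module.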

Available Compiler and MPI Library Modules

Module	Category	Description
aocc	compiler	AMD Optimized C/C++ & Fortran compilers (AOCC) based on LLVM
clang	compiler	C, C++, Objective-C compiler, based on LLVM. Does not include C++ standard library -- use libstdc++ from GCC.
gcc	compiler	The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...).
ghc	compiler	The Glorious/Glasgow Haskell Compiler
go	compiler	Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
impi	mpi	Intel MPI Library, compatible with MPICH ABI
intel-compilers	compiler	Intel C, C++ & Fortran compilers
intelmpi	mpi	IntelMPI from Intel.
llvm	compiler	The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some less common ones!) These libraries are built around a well specified code representation known as the LLVM intermediate representation ("LLVM IR"). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator.
mvapich2	mpi	The MVAPICH software, based on MPI 4.1 standard, delivers the best performance, scalability and fault tolerance for high-end computing systems and servers.
nvhpc	compiler	C, C++ and Fortran compilers included with the NVIDIA HPC SDK (previously: PGI)
ocaml	compiler	OCaml is an industrial-strength programming language supporting functional, imperative and object-oriented styles
openmpi	mpi	The Open MPI Project is an open source MPI-3 implementation.

Example Slurm Scripts

This example is a Slurm job command file to run a parallel (MPI) job using the OpenMPI implementation:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=12:00:00
#SBATCH --output=output_filename
#SBATCH --partition=parallel
#SBATCH -A mygroup

module load gcc
module load openmpi

mpirun ./parallel_executable

In this example, the Slurm job file requests two nodes with sixteen tasks per node, for a total of 32 processes. Both OpenMPI and IntelMPI obtain the number of processes and the host list from Slurm, so neither is specified on the mpirun line. In general, MPI jobs should use all of a node, but some codes cannot be distributed in that manner, so a more general example is shown here.
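Because the process count and host list come from Slurm rather than from the mpirun line, it can help to print the layout Slurm actually assigned before launching. The sketch below assumes the standard Slurm environment variables `SLURM_JOB_NUM_NODES`, `SLURM_NTASKS`, and `SLURM_NTASKS_PER_NODE`; `describe_layout` is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the layout Slurm assigned to this job.
# Outside a job the variables are not set, so each one falls back
# to the placeholder text "unset".
describe_layout() {
    echo "nodes=${SLURM_JOB_NUM_NODES:-unset} tasks=${SLURM_NTASKS:-unset} tasks_per_node=${SLURM_NTASKS_PER_NODE:-unset}"
}

describe_layout
```

Placed just before the mpirun line of a job script, this records the layout in the job's output file, which makes it easier to spot a mismatch between the directives and what was intended.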

Slurm can also place the job freely if the directives specify only the number of tasks. In this case do not specify a node count. This is not generally recommended, however, as it can have a significant negative impact on performance.

#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --time=12:00:00
#SBATCH --output=output_filename
#SBATCH --partition=parallel 
#SBATCH -A mygroup

module load gcc
module load openmpi

mpirun ./parallel_executable

Example: MPI over an odd number of tasks

#!/bin/bash
#SBATCH --ntasks=97
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=20
#SBATCH --time=12:00:00
#SBATCH --output=output_filename
#SBATCH --partition=parallel 
#SBATCH -A mygroup

module load gcc
module load openmpi
mpirun ./parallel_executable
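The directives above request 97 tasks on 5 nodes with 20 tasks per node. As we understand the Slurm documentation, when `--ntasks` is also given, `--ntasks-per-node` is treated as a maximum per node, so the node count must be at least the task count divided by that maximum, rounded up. A small sketch of that arithmetic follows; `min_nodes` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: smallest number of nodes that can hold
# ntasks tasks when each node takes at most per_node tasks.
# Uses integer ceiling division.
min_nodes() {
    local ntasks=$1 per_node=$2
    echo $(( (ntasks + per_node - 1) / per_node ))
}

min_nodes 97 20   # prints 5
```

With 97 tasks and at most 20 per node, four nodes would hold only 80 tasks, so five nodes are needed and at least one of them runs fewer than 20 tasks.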

MPI with OpenMP

The following example runs a total of 32 MPI processes, 8 on each node, with each task using 5 cores for threading. The total number of cores utilized is thus 160.

#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=5
#SBATCH --time=12:00:00
#SBATCH --output=output_filename
#SBATCH --partition=parallel
#SBATCH -A mygroup

module load gcc openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./hybrid_executable
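The core count quoted in the description (4 nodes, 8 MPI processes per node, 5 threads per process) can be checked with a short calculation. The sketch below is illustrative only: `total_cores` is a hypothetical helper name, and the final line shows one defensive way to set the thread count so that it falls back to a single thread if `SLURM_CPUS_PER_TASK` is not set.

```shell
#!/bin/bash
# Hypothetical helper: total cores used by a hybrid MPI/OpenMP job.
total_cores() {
    local nodes=$1 tasks_per_node=$2 cpus_per_task=$3
    echo $(( nodes * tasks_per_node * cpus_per_task ))
}

total_cores 4 8 5   # prints 160

# Use one thread per allocated core; default to 1 thread when
# SLURM_CPUS_PER_TASK is not set (for example, outside a job).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
```

The fallback matters mainly when the script is tested interactively, where an empty `OMP_NUM_THREADS` would otherwise leave the thread count to the OpenMP runtime's default.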

Updated April 23, 2019 | mpi, rivanna, software