How to compile HPL-GPU
Background
There are many combinations for compiling High Performance LINPACK (HPL), depending on which compiler, which basic linear algebra subprograms (BLAS) library, and which message passing interface (MPI) library you choose, for example:
- Which compiler + HPL + which BLAS (OpenBLAS / Intel MKL / cuBLAS) + which MPI (OpenMPI, MPICH, Intel MPI)
Build High Performance LINPACK with CUDA
In this post, we are going to use the GNU compilers, OpenBLAS, and OpenMPI to build HPL-GPU.
Assumptions
- The build system has an Intel Skylake CPU, 128 GB of memory, and NVIDIA GPUs, running Ubuntu 20.04
- The NVIDIA driver is installed, CUDA 12.0 is installed, and its libraries can be found under /usr/local/cuda
- We are using OpenMPI, OpenBLAS, and hpl-2.0_FERMI_v15 as ingredients, built with gcc 10, g++ 10, and gfortran 10 (a quick sanity check is sketched below)
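A minimal sketch for confirming these assumptions on the build host (the paths assume the default CUDA toolkit layout; version strings will differ on your system):
# check compiler versions (gcc/g++/gfortran 10 are assumed in this post)
gcc --version | head -n 1
g++ --version | head -n 1
gfortran --version | head -n 1
# check that the NVIDIA driver and the CUDA 12.0 toolkit are visible
nvidia-smi -L
/usr/local/cuda/bin/nvcc --version
ls /usr/local/cuda/lib64/libcublas.so* /usr/local/cuda/lib64/libcudart.so*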
Step 1, Compile OpenBLAS[1]
_version=v0.3.23
FILE=$_version.tar.gz
wget http://github.com/xianyi/OpenBLAS/archive/$FILE
tar -xzvf $FILE
cd OpenBLAS-0.3.23
# OpenBLAS does not support f77; GCC (gfortran) or Intel compilers are supported
# type make to detect the CPU automatically, or type make TARGET=xxx to set the target CPU, e.g. make TARGET=NEHALEM. The full target list is in the file TargetList.txt.
make
# To install the library, you can run "make PREFIX=/path/to/your/installation install".
mkdir -p /opt/hpcmate/home/lib/OpenBLAS-0.3.23
make PREFIX=/opt/hpcmate/home/lib/OpenBLAS-0.3.23 install
ls /opt/hpcmate/home/lib/OpenBLAS-0.3.23/lib
The compiled OpenBLAS libraries and header files are now available under /opt/hpcmate/home/lib/OpenBLAS-0.3.23.
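A quick sanity check that the install landed where expected (exact file names may vary slightly by OpenBLAS version):
# the shared library and headers should now exist under the install prefix
ls /opt/hpcmate/home/lib/OpenBLAS-0.3.23/lib/libopenblas.so*
ls /opt/hpcmate/home/lib/OpenBLAS-0.3.23/include/cblas.h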
Step 2, Compile OpenMPI[2][3]
FILE=openmpi-2.0.2.tar.gz
wget https://download.open-mpi.org/release/open-mpi/v2.0/$FILE
tar -xzvf $FILE
cd openmpi-2.0.2
./configure --prefix=/opt/hpcmate/home/lib/openmpi-2.0.2
make all install
The compiled OpenMPI libraries and header files are now available under /opt/hpcmate/home/lib/openmpi-2.0.2.
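Before moving on, it is worth smoke-testing the freshly built OpenMPI; the hello.c below is only an illustration and is not part of the HPL build:
# confirm the wrapper compiler comes from our install
/opt/hpcmate/home/lib/openmpi-2.0.2/bin/mpicc --version
# compile and run a tiny MPI program on 2 ranks
export LD_LIBRARY_PATH=/opt/hpcmate/home/lib/openmpi-2.0.2/lib:$LD_LIBRARY_PATH
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("hello from rank %d\n", rank);
    MPI_Finalize();
    return 0;
}
EOF
/opt/hpcmate/home/lib/openmpi-2.0.2/bin/mpicc hello.c -o hello
/opt/hpcmate/home/lib/openmpi-2.0.2/bin/mpirun -np 2 ./hello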
Step 3, Compile hpl-2.0_FERMI_v15
#download hpl-2.0_FERMI_v15.tgz and untar
cd hpl-2.0_FERMI_v15
Now we need to adapt the build configuration to our environment by editing the Make.CUDA file in the hpl-2.0_FERMI_v15 folder.
######Edit the below lines with your settings###########
#hpl-2.0_FERMI_v15 path
TOPdir = /opt/hpcmate/home/try3/hpl-gpu/hpl-2.0_FERMI_v15
#MPI library path
MPdir = /opt/hpcmate/home/lib/openmpi-2.0.2
MPinc = -I$(MPdir)/include
MPlib = $(MPdir)/lib/libmpi.so
#BLAS library path
LAdir = /opt/hpcmate/home/lib/OpenBLAS-0.3.23
LAinc =
LAlib = -L$(TOPdir)/src/cuda -ldgemm -L/usr/local/cuda/lib64 -lcuda -lcublas -lcudart -L$(LAdir)/lib -lopenblas -lpthread
# CC and Linker
CC = /opt/hpcmate/home/lib/openmpi-2.0.2/bin/mpicc
CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops -w -Wall -fopenmp
#################################################
The source file hpl-2.0_FERMI_v15/src/cuda/cuda_dgemm.c loads Intel MKL at runtime via dlopen. To remove this MKL dependency, we need to edit cuda_dgemm.c so that it loads OpenBLAS instead. [4]
# find and edit the following lines in hpl-2.0_FERMI_v15/src/cuda/cuda_dgemm.c as shown
...
// handle2 = dlopen ("libmkl_intel_lp64.so", RTLD_LAZY);
handle2 = dlopen ("libopenblas.so", RTLD_LAZY);
...
//dgemm_mkl = (void(*)())dlsym(handle, "dgemm");
dgemm_mkl = (void(*)())dlsym(handle, "dgemm_");
...
//handle = dlopen ("libmkl_intel_lp64.so", RTLD_LAZY);
handle = dlopen ("libopenblas.so", RTLD_LAZY);
...
//mkl_dtrsm = (void(*)())dlsym(handle2, "dtrsm");
mkl_dtrsm = (void(*)())dlsym(handle2, "dtrsm_");
...
Otherwise, we will face runtime errors such as:
libmkl_intel_lp64.so: cannot open shared object file: No such file or directory
libopenblas.so.0: undefined symbol: dtrsm
libopenblas.so.0: undefined symbol: dgemm
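The trailing underscore matters because OpenBLAS exports the BLAS routines under their Fortran-style symbol names (dgemm_, dtrsm_), whereas MKL also exports them without the underscore. You can confirm this against your own build before editing the source:
# list the DGEMM/DTRSM symbols actually exported by libopenblas.so
nm -D /opt/hpcmate/home/lib/OpenBLAS-0.3.23/lib/libopenblas.so | grep -E ' (dgemm_|dtrsm_)$'
# plain "dgemm" / "dtrsm" should not appear, which is why dlsym() must request
# the underscored names once OpenBLAS is dlopen()ed instead of MKL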
Now we also need to set up the required environment so that the correct compilers and libraries are found when compiling HPL-GPU.
# create a setenv.sh file with the following contents
export CC=gcc
export CXX=g++
export F77=gfortran
export FC=gfortran
export FC90=gfortran
export PATH=/opt/hpcmate/home/lib/openmpi-2.0.2/bin:$PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/hpcmate/home/lib/openmpi-2.0.2/lib:/opt/hpcmate/home/lib/OpenBLAS-0.3.23/lib:/opt/hpcmate/home/try3/hpl-gpu/hpl-2.0_FERMI_v15/src/cuda
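Once setenv.sh has been sourced, a quick check that the environment points at the intended toolchain:
# mpicc should resolve to the OpenMPI install built in Step 2
which mpicc      # expected: /opt/hpcmate/home/lib/openmpi-2.0.2/bin/mpicc
mpicc --showme   # prints the underlying gcc command line and include/lib paths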
Now we are ready to compile HPL-GPU.
#set PATH and LD_LIBRARY_PATH
source setenv.sh #the definition above
#clean up
cd hpl-2.0_FERMI_v15
make arch=CUDA clean
make arch=CUDA
# after a successful compilation, ./bin/CUDA will contain xhpl and HPL.dat
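Running and tuning the benchmark is beyond the scope of this post, but as a quick smoke test the binary can be launched with mpirun from the build output (the rank count below is only an example; it should normally match the P x Q process grid in HPL.dat and the number of available GPUs):
cd bin/CUDA
# 4 MPI ranks as an example; adjust -np to your HPL.dat process grid
mpirun -np 4 ./xhpl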
Reminders
- Use the same compiler toolchain and version to compile all of the required libraries (OpenMPI, OpenBLAS) and HPL-GPU itself
- Set the correct PATH and LD_LIBRARY_PATH so they point to the right binaries and libraries (a quick ldd check is sketched below)
- MPdir, LAdir, and CC in Make.CUDA must match the libraries and compiler you actually intend to link against
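As a check for the last two reminders, you can inspect which libraries the finished binary actually resolves to:
# every listed path should point at the libraries chosen above, not at system copies
ldd ./bin/CUDA/xhpl | grep -E 'libmpi|libcublas|libcudart|libopenblas'
# note: libopenblas.so is also loaded at runtime via dlopen (see Step 3), so its
# directory must stay on LD_LIBRARY_PATH even if ldd does not list it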
Todo
- How to adjust HPL.dat
- Automated compile script - we are going to share an automated compile script for this post through repository.hpcmate.com; please contact support@hpcmate.com if you need it
References
- ↑ https://github.com/xianyi/OpenBLAS/wiki/User-Manual#compile-the-library
- ↑ https://www.open-mpi.org/software/ompi/v2.0/
- ↑ https://sites.google.com/site/rangsiman1993/comp-env/program-install/install-openmpi
- ↑ http://hwengineer.blogspot.com/2018/03/power9-ac922-hpl-cuda-compile.html