Building LAMMPS for Nvidia GraceHopper nodes

Gabriele · October 9, 2023, 10:47pm

I am trying to build LAMMPS with Kokkos for HOPPER90. However, I am running into compilation errors, with nvc++ as well as with g++.

With g++13.1.0 I am getting the error:

  extern _Float32 modff32 (_Float32 __x, _Float32 *__iptr) noexcept (true); extern _Float32 __modff32 (_Float32 __x, _Float32 *__iptr) noexcept (true) __attribute__ ((__nonnull__ (2)));
                                                                                                       
Error limit reached.
100 errors detected in the compilation of "/home/gjost/LAMMPS/lammps/lib/kokkos/core/src/impl/Kokkos_CPUDiscovery.cpp".

With nvc++ 23.7.0 I am getting the error:

"/home/gjost/LAMMPS/lammps/lib/kokkos/core/src/../../tpls/desul/include/desul/atomics/CUDA.hpp", line 522: error: function "desul::atomic_fetch_max(long *, long, desul::MemoryOrderRelaxed, desul::MemoryScopeDevice)" has already been defined
  __device__ inline long atomic_fetch_max(long* const dest,
                         ^

"/home/gjost/LAMMPS/lammps/lib/kokkos/core/src/../../tpls/desul/include/desul/atomics/CUDA.hpp", line 529: error: function "desul::atomic_fetch_min(long *, long, desul::MemoryOrderRelaxed, desul::MemoryScopeDevice)" has already been defined
  __device__ inline long atomic_fetch_min(long* const dest,
                         ^

For MPI I am using the Open MPI that comes with nvhpc/23.7.
Is there a recommended compiler/version that I should be using for the LAMMPS GraceHopper build?
I am using the following LAMMPS commit:
commit ce756540e8615b6f7c1c1242695172b502834b19

Thanks in advance and greetings, Gabriele

akohlmey · October 10, 2023, 3:59am

To the best of my knowledge, both the Nvidia CUDA compiler (nvcc) and the Nvidia HPC compiler (formerly the PGI compiler) need a compatible GCC “underneath”: they masquerade as GCC so that they can use its header files, and they carry compatibility “magic” for the GCC constructs that they cannot handle. However, that compatibility layer needs to be updated with every major GCC release.

So I suspect GCC 13 is too new for either compiler.

stamoor · October 10, 2023, 3:53pm

What CUDA version? It should work with nvcc + gcc. The max GCC version depends on the CUDA version.
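
One way to check this before configuring is to compare the host compiler's major version with the newest GCC that your CUDA release lists as supported. The following is a minimal sketch only: the function name gcc_within_max is made up for illustration, and the maximum is an argument on purpose, because it has to be looked up in the CUDA installation guide for the release you have.

```shell
# Hypothetical helper (not part of LAMMPS, Kokkos or CUDA).
# Succeeds only if the major version of a GCC version string does not
# exceed the given maximum major version.
gcc_within_max() {
    gcc_version="$1"   # e.g. the output of: g++ -dumpfullversion
    max_major="$2"     # assumption: taken from the CUDA documentation
    gcc_major="${gcc_version%%.*}"   # strip everything after the first dot
    [ "$gcc_major" -le "$max_major" ]
}
```

It could be used as `gcc_within_max "$(g++ -dumpfullversion)" 12 || echo "host g++ is too new"`, where 12 is only a placeholder for whatever the CUDA documentation states. `nvcc --version` reports which CUDA release is installed.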

stamoor · October 10, 2023, 4:49pm

Also, it looks like that commit is using Kokkos v3.7 instead of 4.1. You definitely want to update to Kokkos 4 for Hopper.
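
To see which Kokkos version a given LAMMPS checkout bundles, one option is to read the version numbers out of the bundled Kokkos CMakeLists.txt. This is a sketch under an assumption: that the file sets Kokkos_VERSION_MAJOR, Kokkos_VERSION_MINOR and Kokkos_VERSION_PATCH with plain set() lines. The function name kokkos_version is hypothetical.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH from a Kokkos CMakeLists.txt.
# Assumes lines of the form: set(Kokkos_VERSION_MAJOR 4)
kokkos_version() {
    cmake_file="$1"   # e.g. ~/LAMMPS/lammps/lib/kokkos/CMakeLists.txt
    sed -nE 's/^[[:space:]]*set\(Kokkos_VERSION_(MAJOR|MINOR|PATCH)[[:space:]]+([0-9]+)\).*/\2/p' \
        "$cmake_file" | paste -s -d . -   # join the three numbers with dots
}
```

Called as `kokkos_version ~/LAMMPS/lammps/lib/kokkos/CMakeLists.txt`, it prints the bundled version if the file has that layout, and nothing otherwise.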

Gabriele · October 20, 2023, 9:04pm

Stan, I only saw your response just now. Can you recommend a commit I should be using? I am here:
gjost@sj-cg4-01:~/LAMMPS/lammps$ git branch -a

* stable
remotes/origin/HEAD -> origin/develop
remotes/origin/compute-fix-variable-outputs
remotes/origin/coulomb-refactoring
remotes/origin/develop
remotes/origin/fix-rigid-enforce2d
remotes/origin/maintenance
remotes/origin/nwchem
remotes/origin/release
remotes/origin/stable
remotes/origin/subversion
remotes/origin/triclinic-neighbor-bug
gjost@sj-cg4-01:~/LAMMPS/lammps$ git log
commit ce756540e8615b6f7c1c1242695172b502834b19 (HEAD -> stable, tag: stable_2Aug2023_update1, tag: patch_2Aug2023_update1, tag: lammps-gui-v1.2, origin/stable)
Author: Axel Kohlmeyer [email protected]
Date: Fri Sep 22 07:51:58 2023 -0400
How can I switch to a branch that has Kokkos 4?
Many thanks and sorry for the late response.
Gabriele

Gabriele · October 20, 2023, 9:09pm

Also, I tried adding
-DKokkos_ENABLE_DESUL_ATOMICS_EXTERNAL=on
but that runs into configuration errors.
I think I am using CUDA 12.2.

stamoor · October 23, 2023, 3:19pm

Can you try origin/develop? It should be up to date.
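
For an existing clone that is currently on stable, moving to develop could look roughly like the sketch below. It assumes the remote is called origin, as in the `git branch -a` listing earlier in the thread; the function name switch_to_develop is hypothetical.

```shell
# Hypothetical helper: put an existing LAMMPS clone on an up-to-date develop.
# Assumes a remote named "origin" that has a "develop" branch.
switch_to_develop() {
    repo_dir="$1"   # e.g. ~/LAMMPS/lammps
    git -C "$repo_dir" fetch origin &&
    # creates a local develop tracking origin/develop if none exists yet
    git -C "$repo_dir" checkout develop &&
    # fast-forward only, so local changes are never merged silently
    git -C "$repo_dir" merge --ff-only origin/develop
}
```

After `switch_to_develop ~/LAMMPS/lammps`, it is safest to configure in a fresh build directory, since a CMake cache left over from the older Kokkos may not match the new sources.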

rarensu · September 4, 2024, 4:21pm

Howdy guys, is there any update on this? I too am interested in trying LAMMPS on the Grace Hopper. Do we have a build recipe that is known to work well?

stamoor · September 4, 2024, 4:50pm

It should work fine, as long as you use a recent version of LAMMPS.

alphataubio · September 5, 2024, 12:07pm

@rarensu make sure to build your own CUDA-aware openmpi/5.0.x customized for your local cluster environment(s). NVIDIA puts out a lot of great packages; nvhpc is NOT one of them.

this is my latest build script and user module file. your local cluster modules might be different, consult local documentation and support staff.

cd /dev/shm
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.5.tar.gz
tar xzvf openmpi-5.0.5.tar.gz; cd openmpi-5.0.5

module --force purge
module load StdEnv/2023
module unload openmpi imkl flexiblas
module load cuda/12.2 pmix/5.0.2 prrte/3.0.5

./configure --prefix=$HOME/local/openmpi-5.0.5 \
  --with-cuda=$CUDA_HOME \
  --with-cuda-libdir=$CUDA_HOME/lib64/stubs \
  --disable-io-romio \
  --without-knem --with-io-romio-flags=--without-ze

make -j 64 all; make install
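
After the install, it may be worth confirming that the new Open MPI was really built with CUDA support. Open MPI's ompi_info can report this; the small filter below is only a sketch and its name ompi_reports_cuda is hypothetical. It reads the output of `ompi_info --parsable --all` on stdin.

```shell
# Hypothetical helper: read `ompi_info --parsable --all` output on stdin and
# succeed only if it reports that MPI was built with CUDA support.
ompi_reports_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}
```

With the prefix used above, the check would be `$HOME/local/openmpi-5.0.5/bin/ompi_info --parsable --all | ompi_reports_cuda && echo "CUDA-aware"`.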

$HOME/local/modules/openmpi/5.0.5:
(You need to hardcode your own HOME_DIRECTORY; lmod doesn't do tilde or $HOME expansion.)

#%Module
set prefix {<HOME_DIRECTORY>/local/openmpi-5.0.5}
set version {5.0.5}
prepend-path CMAKE_PREFIX_PATH ${prefix}
prepend-path PATH ${prefix}/bin
prepend-path CPATH ${prefix}/include
prepend-path LIBRARY_PATH ${prefix}/lib
prepend-path LD_LIBRARY_PATH ${prefix}/lib
prepend-path MANPATH ${prefix}/share/man
prepend-path PKG_CONFIG_PATH ${prefix}/lib/pkgconfig
setenv MODULE_OPENMPI_PREFIX ${prefix}
prepend-path MODULEPATH ${prefix}
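
Since the prefix has to be an absolute path, one way to avoid editing it by hand is to let the shell write the modulefile. This is a sketch only: the function name write_openmpi_modulefile is hypothetical, and the file contents are copied from the modulefile above with just the prefix line filled in.

```shell
# Hypothetical helper: write the modulefile shown above with the install
# prefix already expanded to an absolute path by the shell.
write_openmpi_modulefile() {
    install_prefix="$1"   # e.g. "$HOME/local/openmpi-5.0.5"
    modulefile="$2"       # e.g. "$HOME/local/modules/openmpi/5.0.5"
    mkdir -p "$(dirname "$modulefile")"
    {
        printf '#%%Module\n'
        printf 'set prefix {%s}\n' "$install_prefix"
        # Quoted heredoc: ${prefix} stays literal for the module system.
        cat <<'EOF'
set version {5.0.5}
prepend-path CMAKE_PREFIX_PATH ${prefix}
prepend-path PATH ${prefix}/bin
prepend-path CPATH ${prefix}/include
prepend-path LIBRARY_PATH ${prefix}/lib
prepend-path LD_LIBRARY_PATH ${prefix}/lib
prepend-path MANPATH ${prefix}/share/man
prepend-path PKG_CONFIG_PATH ${prefix}/lib/pkgconfig
setenv MODULE_OPENMPI_PREFIX ${prefix}
prepend-path MODULEPATH ${prefix}
EOF
    } > "$modulefile"
}
```

For the layout in this post, the call would be `write_openmpi_modulefile "$HOME/local/openmpi-5.0.5" "$HOME/local/modules/openmpi/5.0.5"`.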

then

cd /dev/shm
module --force purge
module use $HOME/local/modules
module load StdEnv/2023
module unload openmpi imkl flexiblas
module load cuda/12.2 pmix/5.0.2 prrte/3.0.5 openmpi/5.0.5

git clone https://github.com/lammps/lammps.git
mkdir lammps/build; cd lammps/build

cmake ../cmake -D BUILD_SHARED_LIBS=yes \
    -D LAMMPS_MACHINE=kk \
    -D CMAKE_INSTALL_PREFIX=~/local \
    -D LAMMPS_FFMPEG=yes \
    -D CMAKE_CXX_COMPILER="${PWD}/../lib/kokkos/bin/nvcc_wrapper" \
    -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=yes \
    -D FFT_KOKKOS=CUFFT \
    -D Kokkos_ARCH_NATIVE=yes -D Kokkos_ARCH_HOPPER90=yes \
    -D PKG_REAXFF=yes
cmake --build . -j 64; make install