Kokkos::Cuda mismatch of architecture

YoussefMabrouk · January 30, 2025, 10:34am

I am trying to simulate an organic liquid with 10k atoms based on OPLS-AA using LAMMPS on an NVIDIA Tesla V100 GPU node with 24 CPUs.

First I ran LAMMPS without the GPU and KOKKOS packages, using 12 OMP threads, and got a performance of 10 ns/day.

Then I compiled LAMMPS with the GPU package using the following command

cmake ../cmake -DPKG_GPU=yes -DGPU_API=cuda \
-DGPU_PREC=mixed -DGPU_ARCH=sm_70 \
-D BUILD_MPI=yes -D BUILD_OMP=yes -D PKG_MOLECULE=ON \
-D PKG_COMPRESS=ON -D PKG_KSPACE=ON

and I had a performance of 30 ns/day for 6 MPI tasks and 4 OMP threads.

Since the GPU package does not support all styles, I then tried to compile LAMMPS with the KOKKOS package using the following command

cmake  -C ../cmake/presets/kokkos-cuda.cmake \
 -C ../cmake/presets/basic.cmake \
 -D BUILD_MPI=yes -D PKG_OPENMP=yes \
 -D Kokkos_ARCH_PASCAL60=no -D Kokkos_ARCH_VOLTA70=yes \
 -D Kokkos_ENABLE_OPENMP=yes -D PKG_MOLECULE=ON \
 -D PKG_COMPRESS=ON -D PKG_KSPACE=ON

and after running using the command

export OMP_PROC_BIND=spread
export OMP_PLACES=threads
export OMP_NUM_THREADS=12
export MPI_NUM=2
mpirun -np $MPI_NUM /home/ji39tip/lammps-kokkos/build/lmp \
-in run.in -k on g 2 -sf kk

I had the following error

[node128:439598] *** End of error message ***
Kokkos::Cuda::initialize ERROR: likely mismatch of architecture

I have no experience with GPU computing and I am confused by this error: the architecture sm_70 that I used to configure the GPU package did not cause it, while setting VOLTA70 for KOKKOS does.

A difference between the compilation with GPU and KOKKOS packages is an error related to this line in CMakeLists.txt

# silence nvcc warnings
if((PKG_KOKKOS) AND (Kokkos_ENABLE_CUDA)
    AND NOT (CMAKE_CXX_COMPILER_ID STREQUAL "Clang"))
  set(CMAKE_TUNE_DEFAULT "${CMAKE_TUNE_DEFAULT}"
      "-Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128")
endif()

This gives an error saying -Xcudafe is not known. Since it seems to only "silence nvcc warnings", I removed these lines from CMakeLists.txt in order to compile.

I would be thankful for any hints.

akohlmey · January 30, 2025, 11:10am

When compiling the GPU package, LAMMPS will compile so-called “fat” binaries that are compatible with all supported GPU architectures. Thus, even if there is a mismatch, it will run as expected.

You can check the actual architecture with the nvc_get_devices executable compiled with the GPU package.
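
If nvc_get_devices is not at hand, a rough alternative is to ask the driver directly. The sketch below assumes an nvidia-smi recent enough to support the compute_cap query field. The helper names cc_to_sm, describe_gpu and list_gpus are made up for illustration and are not part of LAMMPS.

```shell
#!/bin/sh
# Convert a compute capability such as "7.0" into the "sm_70" spelling
# used by the GPU package option -DGPU_ARCH.
cc_to_sm() {
    echo "sm_$(echo "$1" | tr -d '. ')"
}

# Turn one CSV line of the form "name, compute_cap" into "name: sm_XX".
describe_gpu() {
    name=${1%,*}     # everything before the last comma
    cc=${1##*,}      # everything after the last comma
    echo "$name: $(cc_to_sm "$cc")"
}

# List every GPU visible on the current node.
# Assumption: nvidia-smi is installed and knows the compute_cap field.
list_gpus() {
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS= read -r line; do
        describe_gpu "$line"
    done
}
```

Calling list_gpus on the compute node itself (not on the login node) shows which architecture the job will actually see.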

YoussefMabrouk · January 30, 2025, 12:24pm

Thanks for your reply. I had not noticed that the server has nodes with several GPU architectures (NVIDIA Tesla A100, V100 and P100), and the option VOLTA70 corresponds to only one of them. I now submit the job to the correct node and the error is gone.
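
For anyone on a similarly mixed cluster, a guard in the job script could catch this before LAMMPS starts. This is only a sketch: the function names kokkos_arch_for_cc and check_node are hypothetical, the mapping covers just the three GPU generations mentioned here, and it assumes nvidia-smi supports the compute_cap query field.

```shell
#!/bin/sh
# Map a CUDA compute capability to the matching Kokkos architecture option.
# Only P100 (6.0), V100 (7.0) and A100 (8.0) are covered.
kokkos_arch_for_cc() {
    case "$1" in
        6.0) echo "Kokkos_ARCH_PASCAL60" ;;
        7.0) echo "Kokkos_ARCH_VOLTA70" ;;
        8.0) echo "Kokkos_ARCH_AMPERE80" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

# Read one compute capability per line on stdin and succeed only if at
# least one GPU was reported and all of them match the option the binary
# was built with (passed as the first argument).
check_node() {
    built="$1"
    seen=0
    while read -r cc; do
        seen=$((seen + 1))
        [ "$(kokkos_arch_for_cc "$cc")" = "$built" ] || return 1
    done
    [ "$seen" -gt 0 ]
}

# Possible use in a job script, before the mpirun line:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader \
#     | check_node Kokkos_ARCH_VOLTA70 \
#     || { echo "GPU on this node does not match the build" >&2; exit 1; }
```

The guard only compares against a single architecture because, unlike the GPU package, a Kokkos build configured as above targets exactly one.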