Whenever atom_style full is used with the KOKKOS accelerator on a GPU, I get a "core dumped" error with LAMMPS executables that otherwise run ReaxFF simulations and other standard simulations smoothly (and very efficiently!) on GPU with KOKKOS.
A simple way to reproduce the error is to run the in.rhodo bench input file.
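For instance, on one GPU with a KOKKOS-enabled binary:
lmp -in in.rhodo -k on g 1 -sf kk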
I get the same problem with the 21Nov2023 and 17Apr2024 patch releases.
The executables are compiled with both the KOKKOS and MOLECULE packages for NVIDIA V100 GPUs, using the CUDA 12.2.91 toolkit and GNU 12.2 compilers (see attached for more information).
Ultimately I want to run NPT simulations with the OPLS force field (I only show the preamble):
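A minimal, illustrative preamble of this kind (the data file name, cutoffs, and thermostat/barostat parameters are placeholders, not my exact input):
units real
atom_style full
pair_style lj/cut/coul/long 10.0
bond_style harmonic
angle_style harmonic
dihedral_style opls
improper_style harmonic
special_bonds lj/coul 0.0 0.0 0.5
kspace_style pppm 1.0e-4
read_data system.data          # placeholder data file
fix mynpt all npt temp 300.0 300.0 100.0 iso 1.0 1.0 1000.0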
This should not be a problem, as all the fixes and styles involved are documented to work with the KOKKOS package.
As a side note: I noticed that between the 21Nov2023 and 17Apr2024 patch releases it became necessary to set the FFT_KOKKOS CMake variable to "CUFFT" in order to avoid the default KISS FFT library; this is not mentioned in the documentation on building with the KOKKOS package.
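For reference, the relevant part of the configure command now looks like this (a minimal sketch; all other options omitted and the source path is illustrative):
cmake -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=yes -D FFT_KOKKOS=CUFFT ../cmake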
Best,
Amaël
Summary
-- <<< Build configuration >>>
LAMMPS Version: 20231121
Operating System: Linux Red 8.6
CMake Version: 3.25.2
Build type: RELEASE
Generator: Unix Makefiles using /usr/bin/gmake
-- Enabled packages: BROWNIAN;CLASS2;CORESHELL;DIELECTRIC;DIPOLE;EXTRA-COMPUTE;EXTRA-DUMP;EXTRA-FIX;EXTRA-MOLECULE;EXTRA-PAIR;FEP;GPU;INTEL;KOKKOS;KSPACE;MANYBODY;MC;MEAM;MISC;MOLECULE;OPENMP;OPT;QTB;REAXFF;REPLICA;RIGID;TALLY
-- <<< Compilers and Flags: >>>
-- C++ Compiler: /gpfslocalsup/spack_soft/gcc/12.2.0/gcc-8.5.0-ptka3d5gf3nvhwkl6g5bgw7uzksjoywv/bin/g++
Type: GNU
Version: 12.2.0
C++ Flags: -O3 -DNDEBUG -march=cascadelake -mtune=cascadelake
Defines: LAMMPS_SMALLBIG;LAMMPS_MEMALIGN=64;LAMMPS_OMP_COMPAT=4;LAMMPS_JPEG;LAMMPS_PNG;LAMMPS_GZIP;FFT_SINGLE;FFT_MKL;FFT_MKL_THREADS;LMP_OPENMP;$<BUILD_INTERFACE:LMP_KOKKOS>;FFT_CUFFT;LMP_INTEL;LMP_INTEL_USELRT;LMP_USE_MKL_RNG;LMP_GPU
Options: -Xcudafe;--diag_suppress=unrecognized_pragma
-- <<< Linker flags: >>>
-- Executable name: lmp_impis-V100
-- Static library flags:
-- <<< MPI flags >>>
-- MPI_defines: MPICH_SKIP_MPICXX;OMPI_SKIP_MPICXX;_MPICC_H
-- MPI includes: /gpfs7kro/gpfslocalsup/spack_soft/openmpi/4.1.5/gcc-12.2.0-k5b6xwux5o26nktzmvewnoya5metw33z/include;/gpfslocalsup/spack_soft/openmpi/4.1.5/gcc-12.2.0-k5b6xwux5o26nktzmvewnoya5metw33z/include
-- MPI libraries: /gpfslocalsup/spack_soft/openmpi/4.1.5/gcc-12.2.0-k5b6xwux5o26nktzmvewnoya5metw33z/lib/libmpi.so;
-- <<< GPU package settings >>>
-- GPU API: CUDA
-- CUDA Compiler: /gpfslocalsys/cuda/12.2.0/bin/nvcc
-- GPU default architecture: sm_70
-- GPU binning with CUDPP: OFF
-- CUDA MPS support: yes
-- GPU precision: MIXED
-- Kokkos Devices: CUDA;CUDA_LAMBDA;OPENMP;SERIAL
-- Kokkos Architecture: ICX;VOLTA70
-- <<< FFT settings >>>
-- Primary FFT lib: MKL
-- Using single precision FFTs
-- Using threaded FFTs
-- Kokkos FFT: cuFFT
Thank you for having a look at this. I'm using one GPU and, like you, have used the neigh half option.
To clarify: it runs fine without KOKKOS only with an executable that was compiled without the KOKKOS package. The executable compiled with the KOKKOS package enabled also crashes (core dumped) when the GPU package is used, whereas the same run works fine with the other executable…
I spotted the faulty option: Kokkos_ARCH_ICX
My CMake options are attached in case this is due to a bad interaction with other settings.
It runs perfectly fine without it (I only tested the 17Apr2024 version).
Enabling this option causes problems with the KOKKOS or GPU accelerator only when using atom_style full, though I did not investigate the GPU accelerator in much depth.
I tested by launching the following on a V100 GPU with a Cascade Lake 6248 CPU:
lmp -in in.rhodo
lmp -in in.rhodo -sf gpu
lmp -in in.rhodo -k on g 1 t 1 -sf kk -pk kokkos neigh half
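For reference, a sketch of the corrected configure step, simply dropping the faulty host flag (other options omitted, source path illustrative; if an explicit host architecture is wanted, I assume Kokkos_ARCH_SKX would be the matching AVX-512 target for Cascade Lake):
cmake -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=yes -D Kokkos_ARCH_VOLTA70=yes ../cmake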
Hmmm, if your CPU doesn't support the vector flag you are using, it could crash. For example, I've seen a crash on Haswell when accidentally compiling for KNL AVX-512; it gave an "illegal instruction" error or something like that. Glad you got it working.
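A quick way to check which AVX vector extensions the host CPU actually reports (standard Linux tooling):
grep -o 'avx[0-9a-z_]*' /proc/cpuinfo | sort -u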
It is surprising that it worked for all other simulations so far. I had mixed up the optimization flags between the different GPU partitions I'm using…