I’ve been working on this for several weeks, attempting to build and run LAMMPS with GPU and Kokkos.
I need Kokkos and the GPU to run reaction models with large systems.
key words: gpu kokkos lammps build
OS: Linux Mint 21.1, AMD TR 7900, RTX 3090, LAMMPS 8Feb2023, Kokkos 4,
NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.0
CUDA works with VMD, GROMACS, and Maestro.
I have installed Kokkos with cmake using
cmake …/ -D Kokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE86=ON -D Kokkos_ENABLE_CUDA_LAMBDA=ON -D Kokkos_ENABLE_OPENMP=yes …/cmake
With this installation I can compile LAMMPS to run with Kokkos without the GPU using
cmake -C …/cmake/presets/most.cmake -D Kokkos_ENABLE_OPENMP=yes -D PKG_KOKKOS=yes -D Kokkos_ENABLE_THREADS=ON …/cmake
and for example
mpirun -np 1 lmp-k -k on t 16 -sf kk -in in.rhodo
Similarly, without invoking Kokkos, I can use cmake to build for the GPU:
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D GPU_API=cuda -DCMAKE_CUDA_ARCHITECTURES=86 …/cmake
and, for example, mpirun -np 1 lmp -sf gpu -pk gpu 1 -in in.lj works as expected.
However, if I try to compile LAMMPS with Kokkos for the GPU (RTX 3090), using presets most or basic,
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D Kokkos_ENABLE_CUDA=yes -D PKG_KOKKOS=yes …/cmake
or with the cmake command
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D GPU_API=cuda -DCMAKE_CUDA_ARCHITECTURES=86 -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=ON -D Kokkos_ENABLE_THREADS=ON …/cmake
compilation proceeds (with multiple warnings), then after 100% is reached the following error appears:
/tmp/ccDhsuWv.s:4323155: Error: symbol `fatbinData' is already defined
( many lines )
/tmp/ccDhsuWv.s:4327364: Error: symbol `fatbinData' is already defined
lto-wrapper: fatal error: /usr/bin/c++ returned 1 exit status
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [CMakeFiles/lmp.dir/build.make:117: lmp] Error 1
make: *** [CMakeFiles/Makefile2:391: CMakeFiles/lmp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
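Since the failing step is lto-wrapper, it may help to find where a host LTO flag such as -flto enters the build. The sketch below is only a diagnostic under stated assumptions: find_lto_flags and show_env_flags are hypothetical helper names, and the files searched are the ones the CMake Makefile generator normally writes (the build.make paths in the error suggest that generator was used).

```shell
# Sketch: list every place in a CMake build tree where a host LTO flag appears.
# Assumption: the Makefile generator was used, so compile and link flags are
# recorded in CMakeCache.txt, flags.make and link.txt.
find_lto_flags() {
    lto_build_dir="${1:-.}"
    # -r recurses, -n prints line numbers, -e allows patterns starting with '-'
    grep -rn --include=CMakeCache.txt --include=flags.make --include=link.txt \
        -e '-flto' -e '-ffat-lto-objects' "$lto_build_dir" \
        || echo "no LTO flags found in $lto_build_dir"
}

# Flags can also be injected through the environment before CMake runs.
show_env_flags() {
    for lto_var in CFLAGS CXXFLAGS LDFLAGS CUDAFLAGS NVCC_APPEND_FLAGS; do
        printf '%s=%s\n' "$lto_var" "$(printenv "$lto_var")"
    done
}
```

Run from the build directory, find_lto_flags . prints each matching file and line, and show_env_flags shows whether the shell environment sets any of the usual flag variables.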
I can find little information on the "fatbinData is already defined" error.
CUDA was installed via the local deb package from NVIDIA, and I gather that the lto-wrapper is critical.
The full output of cmake for lammps is attached.
In essence, any combination of package inclusion ends in failure if I try to invoke a GPU and Kokkos.
Any tips would be appreciated.
I have the full output for the builds, but as a new user I do not have the privilege of uploading it.
============= prior GitHub data/suggestions ===========================
akohlm… assigned stanmoore1 Apr 24, 2023
added the kokkos_package label Apr 24, 2023
added this to LAMMPS Bug Reports Apr 24, 2023
moved this to High Priority Bugs in LAMMPS Bug Reports Apr 24, 2023
stanmoore1 commented Apr 25, 2023
Hi, I have not encountered this error before, but somehow link-time optimization (LTO) is getting enabled. According to "Link-time optimization with CUDA on Linux (-flto) - #2 by wlangdon" (CUDA Programming and Performance, NVIDIA Developer Forums), can you try using -Xcompiler -fno-lto and see if that fixes the issue?
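One way this suggestion might be tried through CMake, as a rough sketch: no_lto_cmake_args is a hypothetical helper that only prints extra configure arguments. CMAKE_INTERPROCEDURAL_OPTIMIZATION, CMAKE_CXX_FLAGS and CMAKE_CUDA_FLAGS are standard CMake variables, but whether they reach every nvcc call in this particular build is an assumption that would need checking.

```shell
# Sketch: print extra configure arguments that ask the host compiler not to use LTO.
# Assumption: the build honors these standard CMake variables; this is untested
# against the LAMMPS + Kokkos build described in this post.
no_lto_cmake_args() {
    printf '%s\n' \
        '-D CMAKE_INTERPROCEDURAL_OPTIMIZATION=OFF' \
        '-D CMAKE_CXX_FLAGS=-fno-lto' \
        '-D CMAKE_CUDA_FLAGS=-Xcompiler=-fno-lto'
}
```

The output could be spliced into an existing configure line, for example cmake -C …/cmake/presets/most.cmake $(no_lto_cmake_args) -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=yes …/cmake, starting from an empty build directory so that no cached flags survive.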
commented Apr 25, 2023
Alternatively, can you try adding the -dlto flag to enable LTO on the device? See https://developer.nvidia.com/blog/improving-gpu-app-performance-with-cuda-11-2-device-lto/ and "811162 – media-libs/opencv-4.5.2-r1 lto-wrapper: fatal error".
stanmoore1 added the cmake label Apr 25, 2023
Oh also a suggestion from XXweinbe2: try using Kokkos nvcc_wrapper as the compiler, something like this:*****************************
sta commented Apr 25, 2023
If you aren’t using nvcc_wrapper then that is most likely the cause of the issue.
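For reference, a configure line that selects nvcc_wrapper might look like the sketch below. It is not the command from the GitHub comment: kokkos_cuda_configure_cmd is a hypothetical helper that only prints the line, LAMMPS_DIR is an assumed variable pointing at the LAMMPS source tree, and the wrapper location lib/kokkos/bin/nvcc_wrapper assumes the Kokkos copy bundled with LAMMPS is used.

```shell
# Sketch: print (not run) a LAMMPS configure line that uses Kokkos' nvcc_wrapper
# as the C++ compiler. The Kokkos options are the ones already used in this post;
# the source-tree path is an assumption supplied by the caller.
kokkos_cuda_configure_cmd() {
    lammps_dir="${1:-${LAMMPS_DIR:-..}}"
    printf '%s ' cmake \
        -C "$lammps_dir/cmake/presets/most.cmake" \
        -D PKG_KOKKOS=yes \
        -D Kokkos_ENABLE_CUDA=yes \
        -D Kokkos_ENABLE_OPENMP=yes \
        -D Kokkos_ARCH_AMPERE86=ON \
        -D CMAKE_CXX_COMPILER="$lammps_dir/lib/kokkos/bin/nvcc_wrapper" \
        "$lammps_dir/cmake"
    printf '\n'
}
```

Calling kokkos_cuda_configure_cmd with the source directory as its argument prints the line for inspection; it can then be pasted into a fresh, empty build directory.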
stanmoore1 added the invalid label Apr 26, 2023
moved this from High Priority Bugs to Done in LAMMPS Bug Reports Apr 26, 2023
stanmoore1 commented Apr 26, 2023
I’m pretty sure that using nvcc_wrapper will fix your issue. (My note: this did not fix the error.)
In any case, since this is not a bug in LAMMPS but rather a build question I will close this issue on GitHub. Feel free to continue the discussion on MatSci: LAMMPS - Materials Science Community Discourse if you need more help. Thanks
sta closed this as completed Apr 26, 2023
Auth commented Apr 26, 2023
running kokkos with gpu #3751
busce004 opened this issue Apr 24, 2023 · 6 comments
commented Apr 24, 2023
I have not rebuilt CUDA; it functions with other MD programs (VMD, Maestro, GROMACS).
To use device LTO, add the option -dlto to both the compilation and link commands as shown below. Skipping the -dlto option from either of these two steps affects your results.
Compilation of cuda source files with -dlto option:
nvcc -dc -dlto *.cu
Linking of cuda object files with -dlto option:
nvcc -dlto *.o
Using -dlto option at compile time instructs the compiler to store a high-level intermediate representation (NVVM-IR) of the device code being compiled into the fatbinary. The -dlto option at link time will instruct the linker to retrieve the NVVM IR from all the link objects and merge them together into a single IR and perform optimization on the resulting IR for code generation. Device LTO works with any supported SM arch target.