Dear All,
I’ve been working on this for several weeks, attempting to build and run LAMMPS with both GPU and Kokkos support.
I need Kokkos and the GPU to run reaction models with large systems.
key words: gpu kokkos lammps build
SYSTEM
OS: Linux Mint 21.1, AMD TR 7900, RTX 3090, LAMMPS 8Feb23, Kokkos 4
NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.0
Build cuda_11.7.r11.7/compiler.31294372_0
CUDA works with VMD, GROMACS, and Maestro.
BACKGROUND
I have installed Kokkos with cmake using
cmake …/ -D Kokkos_ENABLE_CUDA=ON -DKokkos_ARCH_AMPERE86=ON -D Kokkos_ENABLE_CUDA_LAMBDA=ON -D Kokkos_ENABLE_OPENMP=yes …/cmake
=============================
With this installation I can compile LAMMPS to run with Kokkos, without a GPU, using
cmake -C …/cmake/presets/most.cmake -D Kokkos_ENABLE_OPENMP=yes -D PKG_KOKKOS=yes -D Kokkos_ENABLE_THREADS=ON …/cmake
and for example
mpirun -np 1 lmp-k -k on t 16 -sf kk -in in.rhodo
==============================
Similarly, without invoking Kokkos, I can use cmake to build for a GPU:
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D GPU_API=cuda -DCMAKE_CUDA_ARCHITECTURES=86 …/cmake
With this build, for example, mpirun -np 1 lmp -sf gpu -pk gpu 1 -in in.lj works as expected.
ISSUE:
However, if I try to compile LAMMPS with Kokkos for the GPU (RTX 3090), using presets = most or basic,
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D Kokkos_ENABLE_CUDA=yes -D PKG_KOKKOS=yes …/cmake
or with the cmake command
cmake -C …/cmake/presets/most.cmake -D PKG_GPU=on -D GPU_API=cuda -DCMAKE_CUDA_ARCHITECTURES=86 -D PKG_KOKKOS=yes -D Kokkos_ENABLE_CUDA=ON -D Kokkos_ENABLE_THREADS=ON …/cmake
compilation proceeds (with multiple warnings), then after 100% is reached the following error appears:
…
…
/tmp/ccDhsuWv.s:4323155: Error: symbol `fatbinData' is already defined
(many similar lines)
/tmp/ccDhsuWv.s:4327364: Error: symbol `fatbinData' is already defined
lto-wrapper: fatal error: /usr/bin/c++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/lmp.dir/build.make:117: lmp] Error 1
make[1]: *** [CMakeFiles/Makefile2:391: CMakeFiles/lmp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
I can find little information on the "symbol fatbinData is already defined" error.
CUDA was installed via the local deb package from NVIDIA, and I gather that the LTO wrapper is critical.
The full output of cmake for lammps is attached.
lammps build.txt
In essence, any combination of package inclusion ends in failure if I try to invoke a GPU and Kokkos together.
Any tips would be appreciated.
====
I have the full output for the builds, but as a new user I do not have the privilege of uploading it.
============= prior GitHub data/suggestions ===========================
akohlm… assigned stanmoore1 Apr 24, 2023
added the kokkos_package label Apr 24, 2023
added this to LAMMPS Bug Reports Apr 24, 2023
moved this to High Priority Bugs in LAMMPS Bug Reports Apr 24, 2023
Contributor
stanmoore1 commented Apr 25, 2023
Hi, I have not encountered this error before, but somehow link time optimization (LTO) is getting enabled. Based on the NVIDIA Developer Forums thread "Link-time optimization with CUDA on Linux (-flto)" (reply #2 by wlangdon, CUDA Programming and Performance), can you try using -Xcompiler -fno-lto and see if that fixes the issue?
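One way to see where LTO is being switched on is to search the build directory's CMakeCache.txt for -flto flags or for CMake's standard CMAKE_INTERPROCEDURAL_OPTIMIZATION variable. The following is only a minimal sketch: the function name find_lto_settings is hypothetical, and the cache path depends on the build directory that was used. Flags picked up from CXXFLAGS or LDFLAGS at the first configure normally end up in the cache too, so they should show up here as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAMMPS or Kokkos): list CMake cache
# entries that mention LTO/IPO.
# Usage: find_lto_settings path/to/build/CMakeCache.txt
find_lto_settings() {
    cache_file="$1"
    if [ ! -f "$cache_file" ]; then
        echo "no such file: $cache_file" >&2
        return 2
    fi
    # Drop cache comment lines ('//' and '#'), then match -flto style flags
    # and the standard CMake IPO variable, ignoring case.
    grep -v -E '^(//|#)' "$cache_file" \
        | grep -i -E 'flto|INTERPROCEDURAL_OPTIMIZATION'
}
```

For example, find_lto_settings build/CMakeCache.txt prints every matching cache entry; no output and exit status 1 means nothing in the cache mentions LTO.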
stanmoore1
Contributor
commented Apr 25, 2023
Alternatively, can you try adding the -dlto flag to enable LTO on the device? See https://developer.nvidia.com/blog/improving-gpu-app-performance-with-cuda-11-2-device-lto/ and the bug report "811162 – media-libs/opencv-4.5.2-r1 lto-wrapper: fatal error".
stanmoore1 added the cmake label Apr 25, 2023
Oh, also a suggestion from XXweinbe2: try using the Kokkos nvcc_wrapper as the compiler, something like this:
-D CMAKE_CXX_COMPILER=$(pwd)/…/lib/kokkos/bin/nvcc_wrapper
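Spelled out, that suggestion might look like the sketch below. It only prints the configure command so it can be reviewed before running. LAMMPS_DIR and the function name kokkos_cuda_configure_cmd are placeholders of mine, not names from the thread. The sketch also leaves out PKG_GPU so that Kokkos is the only CUDA-enabled package in the build; that is a simplification on my part, not something the thread recommends.

```shell
#!/bin/sh
# LAMMPS_DIR is a placeholder (assumption) for the top of the LAMMPS source
# tree; override it in the environment to match the actual checkout.
LAMMPS_DIR="${LAMMPS_DIR:-$HOME/lammps}"

# Hypothetical helper: print (not run) a Kokkos+CUDA configure command that
# uses the nvcc_wrapper shipped in the LAMMPS source tree as the C++ compiler.
kokkos_cuda_configure_cmd() {
    printf '%s ' cmake \
        -C "$LAMMPS_DIR/cmake/presets/most.cmake" \
        -D PKG_KOKKOS=yes \
        -D Kokkos_ENABLE_CUDA=yes \
        -D Kokkos_ENABLE_OPENMP=yes \
        -D Kokkos_ARCH_AMPERE86=yes \
        -D CMAKE_CXX_COMPILER="$LAMMPS_DIR/lib/kokkos/bin/nvcc_wrapper" \
        "$LAMMPS_DIR/cmake"
    printf '\n'
}

kokkos_cuda_configure_cmd
```

The printed command is meant to be run from an empty build directory, since settings cached by an earlier configure (including the compiler choice) would otherwise carry over.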
Contributor
sta commented Apr 25, 2023
If you aren’t using nvcc_wrapper then that is most likely the cause of the issue.
stanmoore1 added the invalid label Apr 26, 2023
moved this from High Priority Bugs to Done in LAMMPS Bug Reports Apr 26, 2023
Contributor
stanmoore1 commented Apr 26, 2023
I’m pretty sure that using nvcc_wrapper will fix your issue. (Note from the original poster: this did not fix the error.)
In any case, since this is not a bug in LAMMPS but rather a build question I will close this issue on GitHub. Feel free to continue the discussion on MatSci: LAMMPS - Materials Science Community Discourse if you need more help. Thanks
sta closed this as completed Apr 26, 2023
Auth commented Apr 26, 2023
running kokkos with gpu #3751
Closed
busce004 opened this issue Apr 24, 2023 · 6 comments
I’ve not rebuilt CUDA; it functions with other MD programs (VMD, Maestro, GROMACS).
To use device LTO, add the option -dlto to both the compilation and link commands as shown below. Skipping the -dlto option from either of these two steps affects your results.
Compilation of cuda source files with -dlto option:
nvcc -dc -dlto *.cu
Linking of cuda object files with -dlto option:
nvcc -dlto *.o
Using -dlto option at compile time instructs the compiler to store a high-level intermediate representation (NVVM-IR) of the device code being compiled into the fatbinary. The -dlto option at link time will instruct the linker to retrieve the NVVM IR from all the link objects and merge them together into a single IR and perform optimization on the resulting IR for code generation. Device LTO works with any supported SM arch target.