Hello,
I want to use Kokkos with HIP to accelerate ReaxFF, but the configuration step fails while setting up Kokkos. The only change I made was revising Kokkos_ARCH_ to Kokkos_ARCH_NAVI1030.
ROCm version: 5.7.50700, LAMMPS version: 17Apr2024. GPU: RX 6650 XT.
My configuration setting:
cmake -C ../cmake/presets/basic.cmake -C ../cmake/presets/kokkos-hip.cmake -D GPU_API=HIP -D HIP_ARCH=gfx1030 -D CMAKE_CXX_COMPILER=hipcc -D HIP_PATH=/opt/rocm/hip/bin -D rocthrust_DIR=/opt/rocm/rocthrust -D PKG_REAXFF=on -D PKG_OPENMP=on ../cmake
Here is the full message:
loading initial cache file ../cmake/presets/basic.cmake
loading initial cache file ../cmake/presets/kokkos-hip.cmake
-- The CXX compiler identification is Clang 17.0.0
-- Check for working CXX compiler: /usr/bin/hipcc
-- Check for working CXX compiler: /usr/bin/hipcc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find Git (missing: GIT_EXECUTABLE)
-- Running check for auto-generated files from make-based build system
-- Found MPI_CXX: /usr/local/lib/libmpicxx.so (found version "4.1")
-- Found MPI: TRUE (found version "4.1")
-- Looking for C++ include omp.h
-- Looking for C++ include omp.h - found
-- Found OpenMP_CXX: -fopenmp=libomp
-- Found OpenMP: TRUE found components: CXX
-- Found GZIP: /usr/bin/gzip
-- Could NOT find FFMPEG (missing: FFMPEG_EXECUTABLE)
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'fftw3'
-- No package 'fftw3' found
-- Looking for C++ include cmath
-- Looking for C++ include cmath - found
-- Setting default Kokkos CXX standard to 17
-- Kokkos version: 4.3.0
-- The project name is: Kokkos
-- Using internal gtest for testing
-- Compiler Version: 5.7.31921
-- Using -std=c++17 for C++17 standard as feature
-- Built-in Execution Spaces:
-- Device Parallel: Kokkos::HIP
-- Host Parallel: Kokkos::OpenMP
-- Host Serial: SERIAL
--
-- Architectures:
-- NAVI1030
-- Found TPLLIBDL: /usr/include
CMake Error at /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/Modules/FindTPLROCTHRUST.cmake:11 (FIND_PACKAGE):
By not providing "Findrocthrust.cmake" in CMAKE_MODULE_PATH this project
has asked CMake to find a package configuration file provided by
"rocthrust", but CMake did not find one.
Could not find a package configuration file provided by "rocthrust" with
any of the following names:
rocthrustConfig.cmake
rocthrust-config.cmake
Add the installation prefix of "rocthrust" to CMAKE_PREFIX_PATH or set
"rocthrust_DIR" to a directory containing one of the above files. If
"rocthrust" provides a separate development package or SDK, be sure it has
been installed.
Call Stack (most recent call first):
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_functions.cmake:293 (FIND_PACKAGE)
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tpls.cmake:93 (KOKKOS_IMPORT_TPL)
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tribits.cmake:208 (INCLUDE)
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/CMakeLists.txt:226 (KOKKOS_SETUP_BUILD_ENVIRONMENT)
-- Configuring incomplete, errors occurred!
See also "/home/lch/Software/lammps/lammps-17Apr2024/build/CMakeFiles/CMakeOutput.log".
See also "/home/lch/Software/lammps/lammps-17Apr2024/build/CMakeFiles/CMakeError.log".
I have no idea what causes this error; any suggestions are appreciated. Thank you!
Best regards
CMakeError.log (9.5 KB)
CMakeOutput.log (65.9 KB)
I asked about this on the Kokkos Slack channel and here is the reply:
Most of the time rocThrust is installed with ROCm but not always.
It seems that their installation doesn't have rocThrust.
They need to use -DKokkos_ENABLE_ROCTHRUST=OFF.
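For reference, a minimal sketch of what that looks like when added to the configure command from the first post. The wrapper name `configure_without_rocthrust` is hypothetical; the presets and other options are copied from the original command, and the `rocthrust_DIR` hint is left out on the assumption that it is not needed once rocThrust is disabled.

```shell
# Sketch: run from the build directory; re-runs the original configure step
# with rocThrust support in Kokkos switched off.
# configure_without_rocthrust is a hypothetical helper, not part of LAMMPS or Kokkos.
configure_without_rocthrust() {
  cmake -C ../cmake/presets/basic.cmake \
        -C ../cmake/presets/kokkos-hip.cmake \
        -D GPU_API=HIP \
        -D HIP_ARCH=gfx1030 \
        -D CMAKE_CXX_COMPILER=hipcc \
        -D HIP_PATH=/opt/rocm/hip/bin \
        -D Kokkos_ENABLE_ROCTHRUST=OFF \
        -D PKG_REAXFF=on \
        -D PKG_OPENMP=on \
        "$@" ../cmake
}
```

Calling `configure_without_rocthrust` with no arguments reproduces the original setup plus the new option; any extra `-D` options passed to it are forwarded to cmake.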
Thank you very much! I successfully compiled LAMMPS using -DKokkos_ENABLE_ROCTHRUST=OFF. I had actually tried installing different versions of ROCm, first the latest 6.1.1 and now 5.7.0. Both showed the same error as presented in this topic. When I ran sudo apt install rocthrust, it reported that rocthrust is already the latest version (2.18.0.50700-63~20.04). I can see a rocthrust folder in /opt/rocm, but I am not sure whether it was installed correctly. There are a lot of header files in /opt/rocm/rocthrust/include/thrust.
In addition, I can also compile LAMMPS-stable successfully without the -DKokkos_ENABLE_ROCTHRUST=OFF option. But no matter which LAMMPS version I use, when I try to run with Kokkos, I get this error:
Memory access fault by GPU node-1 (Agent handle: 0xd2330d0) on address 0x7faf612a5000. Reason: Page not present or supervisor privilege.
Command: lmp -in in.CHO -sf kk -k on g 1 -pk kokkos newton on neigh half
The same happens with ROCm 6.1.1.
Can you try running the standard Lennard-Jones benchmark, i.e. lammps/bench/in.lj, and see if the issue persists?
Hi. I tried to run /bench/in.lj with Kokkos and the same error occurred.
For the rocthrust problem, I added -D CMAKE_PREFIX_PATH=/opt/rocm/lib/cmake to the configure command and now it compiles successfully.
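In case it helps others with the same CMake error, here is a small sketch for checking where the rocThrust package config file actually lives. The helper name `find_rocthrust_config_dir` is hypothetical, and `/opt/rocm` as the default prefix is an assumption based on the paths in this thread; the two file names are the ones CMake listed in its error message.

```shell
# Sketch: print every directory under a ROCm prefix that contains a
# rocThrust CMake package config file.
# find_rocthrust_config_dir is a hypothetical helper name.
find_rocthrust_config_dir() {
  prefix="${1:-/opt/rocm}"   # assumed default install prefix
  # -L follows symlinks, since /opt/rocm is often a link to a versioned directory
  find -L "$prefix" \( -name 'rocthrust-config.cmake' -o -name 'rocthrustConfig.cmake' \) \
       -exec dirname {} \; 2>/dev/null | sort -u
}
```

If `find_rocthrust_config_dir /opt/rocm` prints a directory, that directory can be passed as `-D rocthrust_DIR=...`, or its parent prefix added to `CMAKE_PREFIX_PATH`. If it prints nothing, the CMake config files are probably not installed, even if the headers are.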
Unfortunately, a new error occurred when I tried to run the ReaxFF example (other examples showed the same error too). It happens with every lammps-17Apr2024 build. It also occurred after I upgraded the Kokkos version (4.1.0-4.3.1) for lammps-stable. Here is the log:
$ lmp -in in.CHO -sf kk -k on g 1 -pk kokkos newton on neigh half
LAMMPS (2 Aug 2023 - Update 3)
KOKKOS mode with Kokkos version 4.x.x is enabled (src/KOKKOS/kokkos.cpp:108)
will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
For unit testing set OMP_PROC_BIND=false
using 1 OpenMP thread(s) per MPI task
terminate called after throwing an instance of 'std::runtime_error'
what(): hipFuncGetAttributes(&attr, kernel_func) error( hipErrorInvalidKernelFile): invalid kernel file /home/lch/Software/lammps-stable/lammps-2Aug2023/lib/kokkos/core/src/HIP/Kokkos_HIP_KernelLaunch.hpp:189
In summary, using a Kokkos version older than 4.1.0 produces the Memory access fault, while using a version newer than 4.1.0 produces the hipFuncGetAttributes error.
I think the RX 6650 XT should use gfx1032, not gfx1030; see "Accelerator and GPU hardware specifications" in the ROCm Documentation. It seems that Kokkos doesn't officially support gfx1032, but you may be able to work around that by compiling with --offload-arch=gfx1032 manually.
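To confirm which target the card actually reports, something like the following sketch may help. It assumes the `rocminfo` tool from ROCm is installed and that its output mentions the target as a `gfx...` string; the helper name `list_gfx_targets` is hypothetical.

```shell
# Sketch: list the distinct gfx targets found in rocminfo-style text on stdin.
# list_gfx_targets is a hypothetical helper name.
list_gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Example use on a machine with ROCm installed:
#   rocminfo | list_gfx_targets
```

If this reports gfx1032 while the build was configured with Kokkos_ARCH_NAVI1030 (gfx1030), that mismatch may explain why the kernels cannot be loaded at run time.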
Thank you. Configuration fails with the following error if I change NAVI1030 to NAVI1032 or GFX1032:
CMake Error at /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_arch.cmake:1014 (MESSAGE):
HIP enabled but no automatically detected AMD GPU architecture is
supported. Please manually specify one AMD GPU architecture via
-DKokkos_ARCH_{..}=ON'.
Call Stack (most recent call first):
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
/home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/CMakeLists.txt:226 (KOKKOS_SETUP_BUILD_ENVIRONMENT)
Maybe it's because Kokkos only supports the GFX1030 GPU, as you mentioned. I will find another AMD GPU and try again. Thank you for your patient reply!