Could not find "rocthrust" when configuring Kokkos

Hello,

I want to use Kokkos with HIP to accelerate ReaxFF, but configuring Kokkos failed. The only change I made was changing Kokkos_ARCH_ to Kokkos_ARCH_NAVI1030.

ROCm version: 5.7.50700, LAMMPS version: 17Apr2024, GPU: RX 6650 XT.

My configure command:

cmake -C ../cmake/presets/basic.cmake -C ../cmake/presets/kokkos-hip.cmake -D GPU_API=HIP -D HIP_ARCH=gfx1030 -D CMAKE_CXX_COMPILER=hipcc -D HIP_PATH=/opt/rocm/hip/bin -D rocthrust_DIR=/opt/rocm/rocthrust -D PKG_REAXFF=on -D PKG_OPENMP=on ../cmake

Here is the full message:

loading initial cache file ../cmake/presets/basic.cmake
loading initial cache file ../cmake/presets/kokkos-hip.cmake
-- The CXX compiler identification is Clang 17.0.0
-- Check for working CXX compiler: /usr/bin/hipcc
-- Check for working CXX compiler: /usr/bin/hipcc -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find Git (missing: GIT_EXECUTABLE) 
-- Running check for auto-generated files from make-based build system
-- Found MPI_CXX: /usr/local/lib/libmpicxx.so (found version "4.1") 
-- Found MPI: TRUE (found version "4.1")  
-- Looking for C++ include omp.h
-- Looking for C++ include omp.h - found
-- Found OpenMP_CXX: -fopenmp=libomp  
-- Found OpenMP: TRUE  found components: CXX 
-- Found GZIP: /usr/bin/gzip  
-- Could NOT find FFMPEG (missing: FFMPEG_EXECUTABLE) 
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1") 
-- Checking for module 'fftw3'
--   No package 'fftw3' found
-- Looking for C++ include cmath
-- Looking for C++ include cmath - found
-- Setting default Kokkos CXX standard to 17
-- Kokkos version: 4.3.0
-- The project name is: Kokkos
-- Using internal gtest for testing
-- Compiler Version: 5.7.31921
-- Using -std=c++17 for C++17 standard as feature
-- Built-in Execution Spaces:
--     Device Parallel: Kokkos::HIP
--     Host Parallel: Kokkos::OpenMP
--       Host Serial: SERIAL
-- 
-- Architectures:
--  NAVI1030
-- Found TPLLIBDL: /usr/include  
CMake Error at /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/Modules/FindTPLROCTHRUST.cmake:11 (FIND_PACKAGE):
  By not providing "Findrocthrust.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "rocthrust", but CMake did not find one.

  Could not find a package configuration file provided by "rocthrust" with
  any of the following names:

    rocthrustConfig.cmake
    rocthrust-config.cmake

  Add the installation prefix of "rocthrust" to CMAKE_PREFIX_PATH or set
  "rocthrust_DIR" to a directory containing one of the above files.  If
  "rocthrust" provides a separate development package or SDK, be sure it has
  been installed.
Call Stack (most recent call first):
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_functions.cmake:293 (FIND_PACKAGE)
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tpls.cmake:93 (KOKKOS_IMPORT_TPL)
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tribits.cmake:208 (INCLUDE)
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/CMakeLists.txt:226 (KOKKOS_SETUP_BUILD_ENVIRONMENT)


-- Configuring incomplete, errors occurred!
See also "/home/lch/Software/lammps/lammps-17Apr2024/build/CMakeFiles/CMakeOutput.log".
See also "/home/lch/Software/lammps/lammps-17Apr2024/build/CMakeFiles/CMakeError.log".

I have no idea what causes this error; any suggestions are appreciated. Thank you!

Best regards


I asked about this on the Kokkos Slack channel and here is the reply:

Most of the time rocThrust is installed with ROCm but not always.
It seems that their installation doesn't have rocThrust.
They need to use -DKokkos_ENABLE_ROCTHRUST=OFF.

Thank you very much! I successfully compiled LAMMPS using -DKokkos_ENABLE_ROCTHRUST=OFF. I have actually tried two different versions of ROCm, first the latest 6.1.1 and now 5.7.0, and both showed the error described in this topic. When I ran sudo apt install rocthrust, it reported that rocthrust is already at the latest version (2.18.0.50700-63~20.04). I can see a rocthrust folder in /opt/rocm, but I am not sure whether it was installed correctly. There are a lot of header files in /opt/rocm/rocthrust/include/thrust.

In addition, I can also compile LAMMPS-stable successfully without the -DKokkos_ENABLE_ROCTHRUST=OFF option. But no matter which LAMMPS version I use, when I try to run with Kokkos I get this error:
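One way to check whether the CMake package files for rocthrust are actually present is to search the ROCm tree for the two file names listed in the error message. The following is only a minimal sketch: the helper name find_rocthrust_config is made up for illustration, and /opt/rocm is assumed to be the ROCm root, as in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that holds rocthrust's CMake
# package config file under a given ROCm root (default: /opt/rocm).
find_rocthrust_config() {
    rocm_root="${1:-/opt/rocm}"
    # -L follows symlinks, since /opt/rocm is often a link to a versioned directory.
    cfg=$(find -L "$rocm_root" -type f \
        \( -name 'rocthrust-config.cmake' -o -name 'rocthrustConfig.cmake' \) \
        2>/dev/null | head -n 1)
    if [ -n "$cfg" ]; then
        dirname "$cfg"
    else
        return 1
    fi
}

# Usage: report a directory that could be passed as rocthrust_DIR.
if dir=$(find_rocthrust_config /opt/rocm); then
    echo "rocthrust config found; try -D rocthrust_DIR=$dir"
else
    echo "no rocthrust CMake config file found under /opt/rocm"
fi
```

If nothing is found, the headers may be installed without the CMake package files, and -DKokkos_ENABLE_ROCTHRUST=OFF remains the fallback suggested above.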

Memory access fault by GPU node-1 (Agent handle: 0xd2330d0) on address 0x7faf612a5000. Reason: Page not present or supervisor privilege.

Command: lmp -in in.CHO -sf kk -k on g 1 -pk kokkos newton on neigh half
The same happens with ROCm 6.1.1.

Can you try running the standard Lennard-Jones benchmark, i.e. lammps/bench/in.lj and see if the issue persists?

Hi. I tried to run bench/in.lj with Kokkos and the same error occurred.
As for the rocthrust problem, I added -D CMAKE_PREFIX_PATH=/opt/rocm/lib/cmake to the configure command, and now it configures and compiles successfully.
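For reference, the resulting configure command might look like the sketch below. It assumes that nothing else changed relative to the command in the first post and that ROCm lives under /opt/rocm. The script only prints the command (a dry run) so it can be inspected before running it from the build directory.

```shell
#!/bin/sh
# Same options as the original configure command, plus CMAKE_PREFIX_PATH so
# that find_package(rocthrust) can locate the package config file.
# Assumption: ROCm is installed under /opt/rocm, as in this thread.
CONFIGURE_CMD="cmake -C ../cmake/presets/basic.cmake -C ../cmake/presets/kokkos-hip.cmake \
-D GPU_API=HIP -D HIP_ARCH=gfx1030 -D CMAKE_CXX_COMPILER=hipcc \
-D HIP_PATH=/opt/rocm/hip/bin -D rocthrust_DIR=/opt/rocm/rocthrust \
-D CMAKE_PREFIX_PATH=/opt/rocm/lib/cmake \
-D PKG_REAXFF=on -D PKG_OPENMP=on ../cmake"

# Dry run: print the command. To execute it, run: eval "$CONFIGURE_CMD"
echo "$CONFIGURE_CMD"
```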

Unfortunately, a new error occurred when I tried to run the ReaxFF example (other examples show the same error). It happens with every build of lammps-17Apr2024, and it also occurs with lammps-stable after I upgraded its Kokkos version (4.1.0-4.3.1). Here is the log:

$ lmp -in in.CHO -sf kk -k on g 1 -pk kokkos newton on neigh half
LAMMPS (2 Aug 2023 - Update 3)
KOKKOS mode with Kokkos version 4.x.x is enabled (src/KOKKOS/kokkos.cpp:108)
  will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false

  using 1 OpenMP thread(s) per MPI task
terminate called after throwing an instance of 'std::runtime_error'
  what():  hipFuncGetAttributes(&attr, kernel_func) error( hipErrorInvalidKernelFile): invalid kernel file /home/lch/Software/lammps-stable/lammps-2Aug2023/lib/kokkos/core/src/HIP/Kokkos_HIP_KernelLaunch.hpp:189

In summary, Kokkos versions older than 4.1.0 produce the memory access fault, while versions 4.1.0 and newer produce the hipFuncGetAttributes error.

I think the RX 6650 XT should use gfx1032, not gfx1030; see "Accelerator and GPU hardware specifications" in the ROCm documentation. It seems Kokkos doesn’t officially support gfx1032, but you may be able to work around that by compiling with --offload-arch=gfx1032 manually.
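To confirm which gfx target the installed GPU actually reports, the output of rocminfo can be filtered for gfx names. This is a minimal sketch: the helper name gfx_targets is made up for illustration, and it assumes rocminfo from the ROCm installation is on PATH.

```shell
#!/bin/sh
# Hypothetical helper: extract the unique gfx target names from
# rocminfo-style text read on stdin.
gfx_targets() {
    grep -o 'gfx[0-9a-f]\{3,\}' | sort -u
}

# Usage: list the targets reported by the local ROCm runtime.
if command -v rocminfo >/dev/null 2>&1; then
    rocminfo | gfx_targets
else
    echo "rocminfo not found; is ROCm on PATH?" >&2
fi
```

If the reported target differs from the one LAMMPS and Kokkos were built for, the compiled kernels may not match the device.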

Thank you. Configuration fails with the following error if I change NAVI1030 to NAVI1032 or GFX1032:

CMake Error at /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_arch.cmake:1014 (MESSAGE):
  HIP enabled but no automatically detected AMD GPU architecture is
  supported.  Please manually specify one AMD GPU architecture via
  -DKokkos_ARCH_{..}=ON'.
Call Stack (most recent call first):
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/cmake/kokkos_tribits.cmake:204 (INCLUDE)
  /home/lch/Software/lammps/lammps-17Apr2024/lib/kokkos/CMakeLists.txt:226 (KOKKOS_SETUP_BUILD_ENVIRONMENT)

Maybe that’s because Kokkos only supports gfx1030 GPUs here, as you mentioned. I will find another AMD GPU and try again. Thank you for your patient replies!