Granular pair styles and GPUs

lammps_user2 · June 26, 2025, 3:22pm

Dear Community,

I understand that there is no GPU implementation of pair_style granular.

I am interested in implementing a CUDA version of the pair_style granular.

Is there any specific reason why pair_style granular hasn’t been implemented for GPU? Are granular pair styles with tangential forces inefficient when implemented on GPU?

Thanks!

stamoor · June 26, 2025, 3:53pm

The KOKKOS accelerator package has support for pair_style gran/hooke/history, and it can run the chute benchmark; see bench/in.chute in the lammps/lammps repository on GitHub (commit b7be53f3fcfdcc25f3e62a06d1fe624d4b20134c).

However, you are correct that there is no KOKKOS support for pair_style granular. I would suggest adding a Kokkos version instead of a CUDA one.

lammps_user2 · June 26, 2025, 4:21pm

Thanks for the reply.

I am trying to integrate the DEM model with a Lattice Boltzmann simulation which has an implementation in CUDA.

This is the reason why I would like to implement a CUDA version.

I will think about how I can use the KOKKOS version.

akohlmey · June 26, 2025, 5:21pm

You would be on your own with that.

I strongly suggest that you run some benchmarks to determine how much time is spent on the LB implementation and how much on DEM for a typical system. It is very difficult to gain significant speedup from DEM models, since there are very few work units due to the limited number of neighbors and the complex logic with the different models. The granular pair style is particularly complex with all the submodels that it supports and is thus challenging to port to GPU acceleration (regardless of the framework you would be using for that).

I suspect that you can run quite efficiently if you run the LB code on the GPU while running DEM concurrently on the CPU.
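
To illustrate why the benchmark matters, here is a minimal Python sketch of the arithmetic. The timings are hypothetical placeholders, not measurements, and the function names are invented for this example. It assumes the per-step cost splits cleanly into an LB part and a DEM part, and that overlapping the two hides the shorter one completely, apart from an optional exchange cost.

```python
def amdahl_speedup(dem_fraction, dem_speedup):
    """Overall speedup if only the DEM share of the runtime is accelerated.

    dem_fraction: share of the total step time spent in DEM (0..1)
    dem_speedup:  factor by which the DEM part alone gets faster
    """
    return 1.0 / ((1.0 - dem_fraction) + dem_fraction / dem_speedup)


def step_time_serial(t_lb, t_dem):
    """Time per step when LB and DEM run one after the other."""
    return t_lb + t_dem


def step_time_overlapped(t_lb, t_dem, t_exchange=0.0):
    """Time per step when LB (GPU) and DEM (CPU) run concurrently.

    Assumes ideal overlap: the step costs as much as the slower part,
    plus an optional cost for exchanging data between the two codes.
    """
    return max(t_lb, t_dem) + t_exchange


# Hypothetical per-step timings in milliseconds; replace with measured values.
t_lb = 8.0
t_dem = 2.0

dem_fraction = t_dem / step_time_serial(t_lb, t_dem)

# Gain from a (hypothetical) 10x faster DEM on the GPU, run serially with LB.
print("10x faster DEM:", amdahl_speedup(dem_fraction, 10.0))

# Gain from leaving DEM on the CPU and overlapping it with LB on the GPU.
print("CPU/GPU overlap:",
      step_time_serial(t_lb, t_dem) / step_time_overlapped(t_lb, t_dem))
```

With these placeholder numbers DEM takes 20% of each step, so even an infinitely fast DEM port could not give more than a 1.25x overall gain, which is the same gain as ideal overlap with an unmodified CPU DEM. Whether this holds for a real system depends entirely on the measured split.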