GPU package does not really speed up the simulation

Dear all,
I am currently trying to run LAMMPS with the GPU package, but there is something I cannot figure out. Could anyone who has an idea help me?
The LAMMPS version is “28 Mar 2023”; the GPU is an RTX 4090.
The OS is Ubuntu 20.04, with CUDA Toolkit 11.8 (by the way, CUDA 12.0 and 12.1 give errors, and I have no idea why) and NVIDIA GPU driver 530.30.02.
I modified Makefile.linux in three places:
(1) CUDA_HOME = /usr/local/cuda-11.8/bin/nvcc
(2) NVCC = /usr/local/cuda-11.8/bin/nvcc
(3) CUDA_ARCH = -arch=sm_89

Then I ran make -f Makefile.linux,
and I used ./nvc_get_devices, which detected the GPU correctly (attached image).

Then I ran make yes-gpu, make yes-bpm, make yes-granular, and make mpi, and everything went smoothly.

Then I tried to run my input script with the GPU, but got the error
“Cannot use neigh_modify exclude with GPU neighbor builds”. I have not figured out how to solve this yet, but some website said that I may be able to let only the CPU build the neighbor list. At least I can see that the GPU tried to take part in the simulation.

Then I deleted the neigh_modify exclude command in order to check the performance of the GPU. Then something weird happened: with or without the GPU, the run took exactly the same time to finish. It seems that the GPU did not really help! (attached images)

This is really confusing to me. Did I make any mistakes when trying to use the GPU to speed things up?

I would appreciate any help if anyone has an idea about this case.

Best regards


Does anyone have an idea how to figure this out?

Have you checked that the styles you are using have GPU acceleration?

What errors?

What website?
The error message is pretty self-explanatory and it should be straightforward to solve this issue after properly studying the LAMMPS documentation for the individual commands involved and the sections specifically written to explain how to use accelerated styles in LAMMPS.

Neither package has styles that are accelerated by the GPU package. Please study the LAMMPS documentation more carefully. The information you need is all there.

Hi, thanks for your response, it is really helpful. I can see your point.
Thanks for your suggestions; I tried to check the documentation carefully,
but I am still confused about many points.

For instance, for the GRANULAR package:
I can see gran/hooke/history/kk with the “kk” suffix, so maybe I should use the [KOKKOS Package] with the GPU to speed things up, is that right? The GPU package is not really effective there.

Then for the [pair_style granular command], I cannot find any suffix, so I assume that for the [pair_style granular command] there is no way to speed it up.

But then I need to use the “fix deform command” together with this “pair_style granular”, and I can see “Accelerator Variants: deform/kk”. So if I understand this correctly, fix deform can be accelerated by the [KOKKOS Package] with the GPU.

This is the confusing point: if I use the [pair_style granular command], which does not support any accelerator, together with the “fix deform command”, which supports the [KOKKOS Package] with the GPU, can I really improve the speed with the help of the GPU (KOKKOS package)?

Situation (1): pair styles, computes, and fixes are totally separate; even though the pair style has no accelerator variant, fix deform/kk can still be accelerated by the KOKKOS package?

Situation (2): if the pair style in use has no accelerator variant, accelerated variants of any fix or compute command used alongside it are of no use any more?

My apologies for all these long questions; this is just really confusing to me. Would you mind sharing which situation is the right one?

Thanks for your time and help.

For the error “Cannot use neigh_modify exclude with GPU neighbor builds”:
I searched our MatSci community Discourse, and there is someone with the same problem:

He stated: "I could solve it with package gpu 2 neigh no. Now the neighbour list is built on the CPU instead of the GPU." Therefore I assume this could be an option to work around it.
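
As a minimal sketch of that workaround (assuming a single GPU, as with one RTX 4090, and a purely illustrative exclusion between atom types 1 and 2 that is not taken from my actual input), the relevant input lines could look like this:

```lammps
# Hypothetical fragment: not the actual input script from this thread.
# The package command has to appear before the simulation box is defined.
package         gpu 1 neigh no      # 1 GPU per node; build neighbor lists on the CPU
suffix          gpu                 # use gpu variants of styles where they exist

# ... units, atom_style, read_data / create_box, pair_style, etc. ...

neigh_modify    exclude type 1 2    # illustrative exclusion; allowed with CPU neighbor builds
```

This only removes the error. Styles without a gpu variant still run on the CPU, so it does not by itself make the simulation faster.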

The error in my case, when I tried to use CUDA 12.1 and 12.0, is “Cuda driver error 1 in call at file ‘geryon/nvd_kernel.h’ in line 333”.
I assume this error is very similar to this one:

Following the suggestions from that website, I downgraded CUDA to 11.8, and it works. I therefore did not try to find the reason why I get the error with CUDA 12.1.

There are too many questions to answer them individually.

Bottom line: there is no chance for your simulation to benefit from GPU acceleration.

Here is why:

  • The GPU package primarily accelerates the Pair part of the calculation and the neighbor list builds.
    The Pair part is not significant in your simulation and the neighbor list settings are not compatible.
  • The KOKKOS package will work most effectively when all styles have KOKKOS acceleration, so data does not have to be moved between GPU and host several times during a step. You have many styles that are not supported.
  • The KOKKOS package supports only double precision operations. Since you have a consumer grade GPU, your double precision support is significantly crippled and thus your GPU acceleration potential is limited to begin with, even for situations where all styles support KOKKOS.
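
The first point can be made concrete with Amdahl's law. As a rough sketch, assume the Pair part takes a fraction p of the total run time and the GPU speeds up that part alone by a factor s; the numbers below are hypothetical and not measured from this simulation:

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad
p = 0.1,\; s = 10
\;\Rightarrow\;
S = \frac{1}{0.9 + 0.01} \approx 1.10
```

Even a tenfold faster Pair computation would give only about a 10% overall gain in that case. The actual fraction p can be read from the Pair row of the timing breakdown that LAMMPS prints at the end of a run.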

Hi akohlmey

Many thanks for your response, it is really helpful. I can see your point. It seems that I may not be able to benefit from GPU acceleration for my complicated simulations, but I may try to build the KOKKOS package for my other, relatively simple simulations.
Regarding the point that the KOKKOS package supports only double precision operations: I can see this.

I hope we may get the chance to use the KOKKOS package with single/mixed precision one day.
Thank you very much for your response; the situation is clear to me now.
Hope you have a nice day!