LAMMPS loop fails with GPU acceleration

Hello LAMMPS users,

I am using the command combination “variable loop + label + next + jump” to run the same simulation repeatedly in a loop. However, I found that the simulation cannot be finished with GPU acceleration, while it finishes without the GPU. When run with the GPU, the simulation exits without any error message. The input and data files are attached below, and the loop part of the script is sketched after this paragraph. Could you give me some suggestions for fixing this error with GPU acceleration?
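
The loop is structured roughly like this (a simplified sketch; the variable name, label, and run length are placeholders, and the actual commands are in the attached in.test):

    variable i loop 10          # repeat the simulation 10 times
    label loopstart
    # ... per-iteration setup and simulation commands ...
    run 10000
    next i                      # when the loop is exhausted, the variable is deleted and the jump below is skipped
    jump SELF loopstart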

Dongbo Wang.
in.test (6.5 KB)
0.data (1.2 MB)

What version of LAMMPS do you use, what kind of GPU, and how did you compile the GPU package?

Thanks for your reply. I am using LAMMPS version 29 Oct 2020, the GPU is a GeForce RTX 3080 (Lite Hash Rate), and the CUDA version is 11.4. I compiled the GPU library by modifying and building the makefile Makefile.linux, attached below.
Makefile.linux (1.7 KB)
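
Roughly, the build of the GPU library looked like this (a sketch from memory; the exact settings are in the attached Makefile.linux, and the source directory name, CUDA path, and sm_86 architecture value for this card are from my setup):

    cd lammps-29Oct2020/lib/gpu
    # edit Makefile.linux: point CUDA_HOME at the CUDA 11.4 installation
    # and set CUDA_ARCH (e.g. -arch=sm_86 for the RTX 3080)
    make -f Makefile.linux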

Please try with a more recent version of LAMMPS.

I have installed LAMMPS 23 Jun 2022 with GPU acceleration, but a new error is triggered: the potential energy and total energy of the system become inf, and the simulation exits with “Non-numeric atom coords - simulation unstable”. The same simulation finishes successfully without GPU acceleration. I used Makefile.icc_openmpi to build LAMMPS (roughly as sketched below); the icc version is 2021.4.0 from oneAPI and the OpenMPI version is 4.1.4. So which step did I do wrong?
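
The build itself was roughly (a sketch; the directory name is from my setup, and the GPU library in lib/gpu was built first as before):

    cd lammps-23Jun2022/src
    make yes-gpu          # install the GPU package sources
    make icc_openmpi      # build with MAKE/OPTIONS/Makefile.icc_openmpi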

The Intel compilers occasionally miscompile LAMMPS, and the interoperability between icc and nvcc is not great either. I would recommend using GNU compilers instead.
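
To confirm what a given binary was built with (a minimal check, assuming the executable is named lmp_mpi), recent LAMMPS versions report the compiler, MPI library, and installed packages in their help output:

    ./lmp_mpi -h
    # the output includes a "Compiler:" line and an "Installed packages:" section,
    # so you can verify both the GNU compiler and the GPU package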

OK. I have compiled LAMMPS with g++ using the attached makefile, although I am not sure whether it was really built with the GNU compilers or not. With this binary the simulation still exits with the same error, “Non-numeric atom coords - simulation unstable”, when running with the GPU, while it works without GPU acceleration. Could you give me some other suggestions?
Makefile.mpi (3.2 KB)

I cannot reproduce this on my desktop with the 15 September 2022 version, but I am running with only one MPI process (I have only an Intel HD 620 integrated GPU, which I can use via OpenCL, compiled in mixed precision).

One thing to try is to add -pk gpu 0 pair/only yes to the command line (or change your input file to have suffix off at the beginning, suffix on right before the pair_style command, and suffix off right after it), so that only the pair style is GPU-accelerated. This may address the instability, and it can also run faster than trying to accelerate both the pair style and kspace; it does on my machine, even with one MPI process and one GPU. Both variants are sketched below.
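
For example (a sketch; the MPI launcher, process count, and executable name are placeholders, and in.test refers to your attached input):

Command-line variant:

    mpirun -np 4 lmp_mpi -sf gpu -pk gpu 0 pair/only yes -in in.test

Input-script variant (keeping -sf gpu on the command line):

    suffix off            # near the top: run everything on the CPU by default
    # ... setup commands ...
    suffix on             # enable the gpu suffix only for the pair style
    pair_style ...        # your existing pair_style line, now GPU-accelerated
    suffix off            # kspace_style and everything after stay on the CPU
    # ... rest of the input script ...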