I am currently getting the error shown below while running a friction simulation using the KOKKOS package on an NVIDIA RTX 4090 GPU.
"cudaStreamSynchronize(stream) error( cudaErrorIllegalAddress): an illegal memory access was encountered /home/name/lammps-22Jul2025/lib/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:165
Backtrace: [0x64ad65f2d389] [0x64ad65f09bb0] [0x64ad65f33216] [0x64ad65f33bb9] [0x64ad65cf3d91] [0x64ad65cf4138] [0x64ad65d065f2] [0x64ad65d08865] [0x64ad652d62dd] [0x64ad644b0e14] [0x64ad63e9e727] [0x64ad63d6737b] [0x64ad63d67d7f] [0x64ad63ccecb1] [0x76a65ea2a1ca] [0x76a65ea2a28b] __libc_start_main [0x64ad63d5a915]
[DESKTOP-20TF71N:192096] *** Process received signal ***
[DESKTOP-20TF71N:192096] Signal: Aborted (6)
[DESKTOP-20TF71N:192096] Signal code: (-6)
[DESKTOP-20TF71N:192096] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x45330)[0x76a65ea45330]
[DESKTOP-20TF71N:192096] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x11c)[0x76a65ea9eb2c]
[DESKTOP-20TF71N:192096] [ 2] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x1e)[0x76a65ea4527e]
[DESKTOP-20TF71N:192096] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xdf)[0x76a65ea288ff]
[DESKTOP-20TF71N:192096] [ 4] lmp(+0x2469bbd)[0x64ad65f09bbd]
[DESKTOP-20TF71N:192096] [ 5] lmp(+0x2493216)[0x64ad65f33216]
[DESKTOP-20TF71N:192096] [ 6] lmp(+0x2493bb9)[0x64ad65f33bb9]
[DESKTOP-20TF71N:192096] [ 7] lmp(+0x2253d91)[0x64ad65cf3d91]
[DESKTOP-20TF71N:192096] [ 8] lmp(+0x2254138)[0x64ad65cf4138]
[DESKTOP-20TF71N:192096] [ 9] lmp(+0x22665f2)[0x64ad65d065f2]
[DESKTOP-20TF71N:192096] [10] lmp(+0x2268865)[0x64ad65d08865]
[DESKTOP-20TF71N:192096] [11] lmp(+0x18362dd)[0x64ad652d62dd]
[DESKTOP-20TF71N:192096] [12] lmp(+0xa10e14)[0x64ad644b0e14]
[DESKTOP-20TF71N:192096] [13] lmp(+0x3fe727)[0x64ad63e9e727]
[DESKTOP-20TF71N:192096] [14] lmp(+0x2c737b)[0x64ad63d6737b]
[DESKTOP-20TF71N:192096] [15] lmp(+0x2c7d7f)[0x64ad63d67d7f]
[DESKTOP-20TF71N:192096] [16] lmp(+0x22ecb1)[0x64ad63ccecb1]
[DESKTOP-20TF71N:192096] [17] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x76a65ea2a1ca]
[DESKTOP-20TF71N:192096] [18] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x76a65ea2a28b]
[DESKTOP-20TF71N:192096] [19] lmp(+0x2ba915)[0x64ad63d5a915]
[DESKTOP-20TF71N:192096] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 192096 on node DESKTOP-20TF71N exited on signal 6 (Aborted).
--------------------------------------------------------------------------"
The LAMMPS version I am using is the 22 July 2025 release,
running on Ubuntu 24.04.
The CUDA version is 12.6.
These are the CMake commands I used to configure the build and select the packages:
“cmake -C …/cmake/presets/basic.cmake -C …/cmake/presets/kokkos-cuda.cmake …/cmake
cmake -D Kokkos_ENABLE_CUDA=yes -D Kokkos_ENABLE_OPENMP=yes -D PKG_KOKKOS=yes -D PKG_MEAM=on -D PKG_MOLECULE=on -D PKG_OPENMP=yes -D GPU_API=cuda -D GPU_ARCH=sm_89 …/cmake”
The command line I use to start the simulation is (just in case):
“mpirun -np 1 lmp -k on g 1 -sf kk -pk kokkos neigh half newton on -in Simulation.lmp”
I searched for this error on this forum and, based on what I found, tried keeping my GPU cool while running (about 37 to 45 degrees Celsius), but the error still appears every so often.
I still can’t work out why this happens, so I would be grateful if anyone has suggestions or has seen this before.
As far as I can tell from that discussion, the conclusion was that the cause was not the heat, but one defective GPU (out of 4).
The error is a very generic error from a low-level library, so it is very difficult to give any suggestions without being able to reproduce the error or knowing any details about your simulation.
Some questions:
You say you are using the 22 July 2025 version. Is that the original release or the update?
Does the same error happen with other input decks, e.g. the LAMMPS bench inputs or some of the examples, or only with this one input?
Does your simulation run to completion without errors when you are not using KOKKOS?
It was the original release. I did not check whether there was an update. Should I try the updated version?
It happens only with this kind of simulation (indentation/friction). I have previously modeled a DLC using the liquid-quenching method with the same potential files, parameters, and KOKKOS, and it ran without error.
Yes. Although the number of atoms and the size of the simulation were different (past: 4,000 atoms, current: 20,000 atoms), it ran fine using the CPU.
You have to check which bugs are fixed. If there is no mention of bugfixes in the KOKKOS package, then the chance is small that it will address your problem.
This only counts if you run the exact same simulation. The issue could be triggered by your starting configuration.
This error is basically the same as a segmentation fault on the CPU, and is typically due to either an out of bounds memory access or trying to access host memory inside a device kernel. I will try to reproduce on H100 when I get a chance.
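Since the backtrace in the original post only contains raw addresses, one way to narrow down where the illegal access is triggered is to translate the `lmp(+0x...)` offsets into function names with `addr2line`. The following is only a minimal sketch: it assumes the executable is named `lmp`, that it still contains symbol information (otherwise `addr2line` just prints `??`), and the helper name `lmp_offsets` is made up for illustration. It builds the command but does not run it.

```python
import re

# Three frames copied from the signal handler output in the original post.
backtrace = """
[DESKTOP-20TF71N:192096] [ 4] lmp(+0x2469bbd)[0x64ad65f09bbd]
[DESKTOP-20TF71N:192096] [ 7] lmp(+0x2253d91)[0x64ad65cf3d91]
[DESKTOP-20TF71N:192096] [17] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x76a65ea2a1ca]
"""

def lmp_offsets(text, binary="lmp"):
    """Return the offsets of all frames that belong to `binary` (hypothetical helper)."""
    # Match e.g. "lmp(+0x2469bbd)" but not "libc.so.6(+0x2a1ca)".
    pattern = re.compile(
        r"(?<![\w/.])" + re.escape(binary) + r"\(\+(0x[0-9a-fA-F]+)\)"
    )
    return pattern.findall(text)

offsets = lmp_offsets(backtrace)

# -e selects the executable, -f prints function names, -C demangles C++ names.
# Assumption: "lmp" is the same binary that produced the backtrace.
command = ["addr2line", "-e", "lmp", "-f", "-C"] + offsets
print(" ".join(command))
```

Running the printed command in the directory that contains the executable should show which functions the frames correspond to, which would at least indicate the pair style or fix that was active when the kernel failed.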
What I am currently trying to do is a friction (sliding) simulation.
A Si tip sliding on the surface of a Zr doped Carbon substrate.
The atoms used are : C, Zr, Si
I have used the hybrid pair style as follows
C-C, C-Zr, Zr-Zr : MEAM potential
Si-Si : Tersoff potential
C-Si, Zr-Si : LJ potential
Si tip :
A hemisphere that is either held fixed or moved using the move linear keyword.
The normal load/indentation force I am trying to give is 150 nN (approximately 93.59 eV/A)
Indentation and sliding speed : 0.1 A/ps
timestep : 0.25 fs
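As a sanity check on the numbers above, the load conversion and the tip displacement per timestep can be computed as follows. This is only a sketch: the function names are made up for illustration, and it assumes forces are expressed in eV/A, speeds in A/ps, and the timestep is converted from fs to ps.

```python
# Physical constants (SI).
EV_IN_JOULE = 1.602176634e-19   # exact SI value of one electronvolt
ANGSTROM_IN_M = 1.0e-10

def nanonewton_to_ev_per_angstrom(force_nN):
    """Convert a force from nN to eV/A (hypothetical helper)."""
    force_N = force_nN * 1.0e-9
    return force_N * ANGSTROM_IN_M / EV_IN_JOULE

def displacement_per_step(speed_A_per_ps, timestep_fs):
    """Distance in A that the tip moves during one timestep (hypothetical helper)."""
    timestep_ps = timestep_fs * 1.0e-3
    return speed_A_per_ps * timestep_ps

load = nanonewton_to_ev_per_angstrom(150.0)   # normal load from the post
step = displacement_per_step(0.1, 0.25)       # speed and timestep from the post

print(f"normal load: {load:.2f} eV/A")
print(f"tip displacement per step: {step:.1e} A")
```

With these constants the load comes out at roughly 93.6 eV/A, close to the 93.59 eV/A quoted above; the small difference presumably comes from rounding of the conversion factor. The displacement per step is tiny compared with interatomic distances, so the tip speed alone is unlikely to move atoms far within a single step.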
I am trying to speed up the simulation using the KOKKOS package (compiled for a GeForce RTX 4090 GPU).
The problem I encounter is that the above
Cuda: Illegal memory access error appears while the simulation is going through the indentation/sliding step (at a random point, but mostly during indentation).
I have run the same simulation on the CPU and it runs fine without an error.
In addition I have tried
increasing the neighbor list,
using a slower indentation speed (0.05 A/ps),
reducing the timestep (0.1 fs).
But all of them show the same error at the indentation/sliding step when using the KOKKOS package.
If there are any other steps I should take or any information you require, please let me know and I’ll respond as soon as possible. Thank you for your consideration.