Dear LAMMPS developers and users,
I am encountering a fatal KOKKOS GPU error when running pair_allegro. The simulation crashes at the very start of the run, during Verlet setup at step 0, with a CUDA illegal memory access.
My environment:
LAMMPS version: 11 Feb 2026
Kokkos version: 5.0.2
CUDA version: 12.4
PyTorch version: 2.6.0
My log is shown below.
Internal error!
Thu May 14 20:18:24 CST 2026
Writing to /root/.config/pip/pip.conf
LAMMPS (11 Feb 2026)
KOKKOS mode with Kokkos version 5.0.2 is enabled
using double precision
using view layout = legacy
will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
For unit testing set OMP_PROC_BIND=false
using 1 OpenMP thread(s) per MPI task
package kokkos
package kokkos newton on neigh half
Authors: Anders Johansson, Marc Descoteaux
variable L index 3
variable STRUCTURE index BaTiO3-sc333-expt
variable PERIOD index 100000
variable EQUILSTEPS index 5000
variable TEMP index 300
variable MODEL index BaTiO3.nequip
variable SEED index 1
variable ITER index 1
log Logs_${ITER}/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_${L}_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_${L}_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_100000
units metal
atom_style atomic
read_data ${STRUCTURE}.data
read_data BaTiO3-sc333-expt.data
Reading data file …
orthogonal box = (0 0 0) to (11.97285 11.97285 12.1056)
1 by 1 by 1 MPI processor grid
reading atoms …
135 atoms
read_data CPU = 0.013 seconds
replicate $L $L $L
replicate 3 $L $L
replicate 3 3 $L
replicate 3 3 3
Replication is creating a 3x3x3 = 27 times larger system…
orthogonal box = (0 0 0) to (35.91855 35.91855 36.3168)
1 by 1 by 1 MPI processor grid
3645 atoms
replicate CPU = 0.006 seconds
pair_style allegro
NequIP/Allegro is using input precision d and output precision d
pair_coeff * * ${MODEL}.pth Ba O Ti
pair_coeff * * BaTiO3.nequip.pth Ba O Ti
NequIP/Allegro: Loading model from BaTiO3.nequip.pth
NequIP/Allegro: Freezing TorchScript model…
Type mapping:
NequIP/Allegro type | NequIP/Allegro name | LAMMPS type | LAMMPS name
0 | Ba | 1 | Ba
1 | Ti | 3 | Ti
2 | O | 2 | O
ti=0 tj=0 cut=5.00
ti=0 tj=1 cut=5.00
ti=0 tj=2 cut=5.00
ti=1 tj=0 cut=5.00
ti=1 tj=1 cut=5.00
ti=1 tj=2 cut=5.00
ti=2 tj=0 cut=5.00
ti=2 tj=1 cut=5.00
ti=2 tj=2 cut=5.00
mass 1 137.3
mass 2 15.9994
mass 3 47.9
timestep 0.002
compute polarization all allegro polarization 3
compute allegro will evaluate the quantity polarization of length 3
compute polarizability all allegro polarizability 9
compute allegro will evaluate the quantity polarizability of length 9
compute borncharges all allegro/atom born_effective_charges 9 1
compute allegro/atom will evaluate the quantity born_effective_charges of length 9 with newton 1
thermo_style custom pe fmax fnorm spcpu cpuremain
variable efield equal 1e-2*1.5
fix born all addbornforce 0.0 0.0 ${efield}
fix born all addbornforce 0.0 0.0 0.015
restart $(v_EQUILSTEPS) ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_${L}_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_100000.*
thermo 10
velocity all create ${TEMP} ${SEED} dist gaussian rot yes mom yes
velocity all create 300 ${SEED} dist gaussian rot yes mom yes
velocity all create 300 1 dist gaussian rot yes mom yes
fix nvt all nvt temp ${TEMP} ${TEMP} $(100*dt)
fix nvt all nvt temp 300 ${TEMP} $(100*dt)
fix nvt all nvt temp 300 300 $(100*dt)
fix nvt all nvt temp 300 300 0.2000000000000000111
run $(v_EQUILSTEPS)
run 5000
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
Your simulation uses code contributions which should be cited:
- KOKKOS package: https://doi.org/10.1145/3731599.3767498
The log file lists these citations in BibTeX format.
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
Neighbor list info …
update: every = 1 steps, delay = 0 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 7
ghost atom cutoff = 7
binsize = 7, bins = 6 6 6
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair allegro/kk, perpetual
attributes: full, newton on, kokkos_device
pair build: full/bin/kk/device
stencil: full/bin/3d
bin: kk/device
Setting up Verlet run …
Unit style : metal
Current step : 0
Time step : 0.002
cudaStreamSynchronize(stream) error( cudaErrorIllegalAddress): an illegal memory access was encountered /opt/lammps/lib/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:157
Backtrace:
[0x7f4c764e7e40] __libc_start_main
/input_lbg-428977-22644307/lbg-428977-22644307.sh: line 5: 62082 Aborted ./lmp_gpu -sf kk -k on g 1 t 1 -pk kokkos newton on neigh half -in in.equilibrate -echo screen
I would appreciate any solution or workaround.
My CMake command is:
TORCH_CMAKE=$(python -c 'import torch; print(torch.utils.cmake_prefix_path)') && cmake ../cmake -DCMAKE_PREFIX_PATH="${TORCH_CMAKE}" -DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_OPENMP=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.4 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.4/bin/nvcc -DMKL_INCLUDE_DIR=/tmp -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_EXTENSIONS=OFF -DCMAKE_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DCMAKE_CUDA_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DKokkos_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"
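The CMake command above hard-codes -D_GLIBCXX_USE_CXX11_ABI=0 in three places. It has not been confirmed that this is related to the crash, but one thing that can be checked is whether that value matches the ABI the installed PyTorch was built with. The following is a minimal sketch, assuming it is run in the same Python environment that provided TORCH_CMAKE. The helper names `torch_cxx11_abi` and `abi_define` are hypothetical; `torch.compiled_with_cxx11_abi()` is the PyTorch call that reports the setting.

```python
import importlib.util


def torch_cxx11_abi():
    """Return PyTorch's C++11 ABI setting, or None if torch is not installed here."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without torch
    return bool(torch.compiled_with_cxx11_abi())


def abi_define(uses_cxx11_abi):
    """Format the compiler define that corresponds to the given ABI setting."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(bool(uses_cxx11_abi))}"


if __name__ == "__main__":
    abi = torch_cxx11_abi()
    if abi is None:
        print("torch not found in this environment")
    else:
        # Compare this against the value passed in CMAKE_CXX_FLAGS,
        # CMAKE_CUDA_FLAGS and Kokkos_CXX_FLAGS in the build command.
        print("PyTorch was built with:", abi_define(abi))
```

If the printed define differs from the one in the build command, the flags and the PyTorch libraries disagree about the ABI, which would be worth ruling out before looking further.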