KOKKOS GPU cudaErrorIllegalAddress with pair_allegro (LAMMPS 11Feb2026 + CUDA 12.4 + PyTorch 2.6)

Dear LAMMPS developers and users,

I am encountering a fatal KOKKOS GPU error when running pair_allegro. The simulation crashes immediately at the very beginning of the run.

My environment:
LAMMPS version: 11 Feb 2026
Kokkos version: 5.0.2
CUDA version: 12.4
PyTorch version: 2.6.0

My log is shown below.

Internal error!
Thu May 14 20:18:24 CST 2026
Writing to /root/.config/pip/pip.conf
LAMMPS (11 Feb 2026)
KOKKOS mode with Kokkos version 5.0.2 is enabled
using double precision
using view layout = legacy
will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
For unit testing set OMP_PROC_BIND=false

using 1 OpenMP thread(s) per MPI task
package kokkos
package kokkos newton on neigh half

# Authors: Anders Johansson, Marc Descoteaux

variable L index 3
variable STRUCTURE index BaTiO3-sc333-expt
variable PERIOD index 100000
variable EQUILSTEPS index 5000
variable TEMP index 300
variable MODEL index BaTiO3.nequip
variable SEED index 1
variable ITER index 1
log Logs_${ITER}/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_100000

units metal
atom_style atomic

read_data ${STRUCTURE}.data
read_data BaTiO3-sc333-expt.data
Reading data file …
orthogonal box = (0 0 0) to (11.97285 11.97285 12.1056)
1 by 1 by 1 MPI processor grid
reading atoms …
135 atoms
read_data CPU = 0.013 seconds
replicate $L $L $L
replicate 3 $L $L
replicate 3 3 $L
replicate 3 3 3
Replication is creating a 3x3x3 = 27 times larger system…
orthogonal box = (0 0 0) to (35.91855 35.91855 36.3168)
1 by 1 by 1 MPI processor grid
3645 atoms
replicate CPU = 0.006 seconds

pair_style allegro
NequIP/Allegro is using input precision d and output precision d
pair_coeff * * ${MODEL}.pth Ba O Ti
pair_coeff * * BaTiO3.nequip.pth Ba O Ti
NequIP/Allegro: Loading model from BaTiO3.nequip.pth
NequIP/Allegro: Freezing TorchScript model…
Type mapping:
NequIP/Allegro type | NequIP/Allegro name | LAMMPS type | LAMMPS name
0 | Ba | 1 | Ba
1 | Ti | 3 | Ti
2 | O | 2 | O
ti=0 tj=0 cut=5.00
ti=0 tj=1 cut=5.00
ti=0 tj=2 cut=5.00
ti=1 tj=0 cut=5.00
ti=1 tj=1 cut=5.00
ti=1 tj=2 cut=5.00
ti=2 tj=0 cut=5.00
ti=2 tj=1 cut=5.00
ti=2 tj=2 cut=5.00
mass 1 137.3
mass 2 15.9994
mass 3 47.9

timestep 0.002
compute polarization all allegro polarization 3
compute allegro will evaluate the quantity polarization of length 3
compute polarizability all allegro polarizability 9
compute allegro will evaluate the quantity polarizability of length 9
compute borncharges all allegro/atom born_effective_charges 9 1
compute allegro/atom will evaluate the quantity born_effective_charges of length 9 with newton 1

thermo_style custom pe fmax fnorm spcpu cpuremain

variable efield equal 1e-2*1.5
fix born all addbornforce 0.0 0.0 ${efield}
fix born all addbornforce 0.0 0.0 0.015

restart $(v_EQUILSTEPS) ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_3_1_100000.*

thermo 10
velocity all create ${TEMP} ${SEED} dist gaussian rot yes mom yes
velocity all create 300 ${SEED} dist gaussian rot yes mom yes
velocity all create 300 1 dist gaussian rot yes mom yes
fix nvt all nvt temp ${TEMP} ${TEMP} $(100*dt)
fix nvt all nvt temp 300 ${TEMP} $(100*dt)
fix nvt all nvt temp 300 300 $(100*dt)
fix nvt all nvt temp 300 300 0.2000000000000000111
run $(v_EQUILSTEPS)
run 5000

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Neighbor list info …
update: every = 1 steps, delay = 0 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 7
ghost atom cutoff = 7
binsize = 7, bins = 6 6 6
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair allegro/kk, perpetual
attributes: full, newton on, kokkos_device
pair build: full/bin/kk/device
stencil: full/bin/3d
bin: kk/device
Setting up Verlet run …
Unit style : metal
Current step : 0
Time step : 0.002
cudaStreamSynchronize(stream) error( cudaErrorIllegalAddress): an illegal memory access was encountered /opt/lammps/lib/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:157
Backtrace:

0x5639126b6303
0x563912670c49
0x5639126bb9f5
0x563910ca15f9
0x5639126bb6b9
0x5639126be1cf
0x5639126bb70b
0x5639126bc1b1
0x5639126bdbea
0x5639122dbd59
0x5639122d990d
0x5639122d5ddd
0x5639122d0c23
0x5639122c645a
0x5639114e9002
0x563910c735be
0x563910a6735a
0x563910a63a9e
0x563910a5f783
0x7f4c764e7d90

[0x7f4c764e7e40] __libc_start_main

0x563910a5f615

/input_lbg-428977-22644307/lbg-428977-22644307.sh: line 5: 62082 Aborted ./lmp_gpu -sf kk -k on g 1 t 1 -pk kokkos newton on neigh half -in in.equilibrate -echo screen

I would appreciate any solution or workaround.

My CMake command is: TORCH_CMAKE=$(python -c 'import torch; print(torch.utils.cmake_prefix_path)') && cmake ../cmake -DCMAKE_PREFIX_PATH="${TORCH_CMAKE}" -DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_OPENMP=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.4 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.4/bin/nvcc -DMKL_INCLUDE_DIR=/tmp -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_EXTENSIONS=OFF -DCMAKE_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DCMAKE_CUDA_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DKokkos_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"

Pair style allegro is not part of LAMMPS but developed externally. You have to contact its developers directly about any issues with it.

Thank you for your reply. However, the crash does not happen in the pair_allegro code. It occurs during the Kokkos GPU neighbor list construction step, specifically in `Kokkos_Cuda_Instance.cpp`. This is a core LAMMPS/Kokkos component, not the external pair style.
Could you please reconsider this?

You have to provide an input deck that only uses components that are part of LAMMPS so we can reproduce the crash.

That a crash happens outside the pair style does not mean it is not caused by it. This is apparently a memory management issue and there can be issues caused by not reserving space correctly or allocating insufficient or incorrect amounts.

@2297019091 can you please reformat your post with triple ticks ```, otherwise it is totally unreadable. @anjohan any ideas on this?

My log is shown below:

Internal error!
Thu May 14 20:42:17 CST 2026
Writing to /root/.config/pip/pip.conf
LAMMPS (11 Feb 2026)
KOKKOS mode with Kokkos version 5.0.2 is enabled
  using double precision
  using view layout = legacy
  will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false

  using 1 OpenMP thread(s) per MPI task
package kokkos
package kokkos newton on neigh half
# Authors: Anders Johansson, Marc Descoteaux

variable L index 2
variable STRUCTURE index BaTiO3-sc333-expt
variable PERIOD index 100000
variable EQUILSTEPS index 5000
variable TEMP index 300
variable MODEL index BaTiO3.nequip
variable SEED index 1
variable ITER index 1
log Logs_${ITER}/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_1_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_1_100000

units		metal
atom_style	atomic

read_data ${STRUCTURE}.data
read_data BaTiO3-sc333-expt.data
Reading data file ...
  orthogonal box = (0 0 0) to (11.97285 11.97285 12.1056)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  135 atoms
  read_data CPU = 0.014 seconds
replicate $L $L $L
replicate 2 $L $L
replicate 2 2 $L
replicate 2 2 2
Replication is creating a 2x2x2 = 8 times larger system...
  orthogonal box = (0 0 0) to (23.9457 23.9457 24.2112)
  1 by 1 by 1 MPI processor grid
  1080 atoms
  replicate CPU = 0.006 seconds

pair_style allegro
NequIP/Allegro is using input precision d and output precision d
pair_coeff	* * ${MODEL}.pth Ba O Ti
pair_coeff	* * BaTiO3.nequip.pth Ba O Ti
NequIP/Allegro: Loading model from BaTiO3.nequip.pth
NequIP/Allegro: Freezing TorchScript model...
Type mapping:
NequIP/Allegro type | NequIP/Allegro name | LAMMPS type | LAMMPS name
0 | Ba | 1 | Ba
1 | Ti | 3 | Ti
2 | O | 2 | O
ti=0 tj=0 cut=5.00
ti=0 tj=1 cut=5.00
ti=0 tj=2 cut=5.00
ti=1 tj=0 cut=5.00
ti=1 tj=1 cut=5.00
ti=1 tj=2 cut=5.00
ti=2 tj=0 cut=5.00
ti=2 tj=1 cut=5.00
ti=2 tj=2 cut=5.00
mass 1 137.3
mass 2 15.9994
mass 3 47.9

timestep 0.002
compute polarization all allegro polarization 3
compute allegro will evaluate the quantity polarization of length 3
compute polarizability all allegro polarizability 9
compute allegro will evaluate the quantity polarizability of length 9
compute borncharges all allegro/atom born_effective_charges 9 1
compute allegro/atom will evaluate the quantity born_effective_charges of length 9 with newton 1

thermo_style custom pe fmax fnorm spcpu cpuremain

variable efield equal 1e-2*1.5
fix born all addbornforce 0.0 0.0 ${efield}
fix born all addbornforce 0.0 0.0 0.015

restart $(v_EQUILSTEPS) ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_1_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_2_1_100000.*

thermo 10
velocity all create ${TEMP} ${SEED} dist gaussian rot yes mom yes
velocity all create 300 ${SEED} dist gaussian rot yes mom yes
velocity all create 300 1 dist gaussian rot yes mom yes
fix nvt all nvt temp ${TEMP} ${TEMP} $(100*dt)
fix nvt all nvt temp 300 ${TEMP} $(100*dt)
fix nvt all nvt temp 300 300 $(100*dt)
fix nvt all nvt temp 300 300 0.2000000000000000111
run $(v_EQUILSTEPS)
run 5000

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:
- KOKKOS package: https://doi.org/10.1145/3731599.3767498
The log file lists these citations in BibTeX format.

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Neighbor list info ...
  update: every = 1 steps, delay = 0 steps, check = yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 7
  ghost atom cutoff = 7
  binsize = 7, bins = 4 4 4
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair allegro/kk, perpetual
      attributes: full, newton on, kokkos_device
      pair build: full/bin/kk/device
      stencil: full/bin/3d
      bin: kk/device
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.002
cudaStreamSynchronize(stream) error( cudaErrorIllegalAddress): an illegal memory access was encountered /opt/lammps/lib/kokkos/core/src/Cuda/Kokkos_Cuda_Instance.cpp:157
Backtrace:
[0x559b7e7e0303] 
[0x559b7e79ac49] 
[0x559b7e7e59f5] 
[0x559b7cdcb5f9] 
[0x559b7e7e56b9] 
[0x559b7e7e81cf] 
[0x559b7e7e570b] 
[0x559b7e7e61b1] 
[0x559b7e7e7bea] 
[0x559b7e405d59] 
[0x559b7e40390d] 
[0x559b7e3ffddd] 
[0x559b7e3fac23] 
[0x559b7e3f045a] 
[0x559b7d613002] 
[0x559b7cd9d5be] 
[0x559b7cb9135a] 
[0x559b7cb8da9e] 
[0x559b7cb89783] 
[0x7fa92099ed90] 
[0x7fa92099ee40] __libc_start_main
[0x559b7cb89615] 
/input_lbg-428977-22644592/lbg-428977-22644592.sh: line 5: 30706 Aborted                 ./lmp_gpu -sf kk -k on g 1 t 1 -pk kokkos newton on neigh half -in in.equilibrate -echo screen

My CMake command is: TORCH_CMAKE=$(python -c 'import torch; print(torch.utils.cmake_prefix_path)') && cmake ../cmake -DCMAKE_PREFIX_PATH="${TORCH_CMAKE}" -DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ARCH_VOLTA70=ON -DKokkos_ENABLE_SERIAL=ON -DKokkos_ENABLE_OPENMP=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.4 -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.4/bin/nvcc -DMKL_INCLUDE_DIR=/tmp -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_EXTENSIONS=OFF -DCMAKE_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DCMAKE_CUDA_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0" -DKokkos_CXX_FLAGS="-D_GLIBCXX_USE_CXX11_ABI=0"

@2297019091 Can you try running the same simulation but replace pair style allegro with pair style lj/cut? If the bug is in LAMMPS and not in allegro, then the issue should persist and we can debug it. One guess is that your simulation is exploding on timestep 0. This can lead to cudaErrorIllegalAddress when binning atoms for the neighbor list, because the atom coordinates become too large to map to a valid bin. For example, see LAMMPS Kokkos GPU: cudaErrorIllegalAddress during neighbor build (NBinKokkos::bin_atoms) - #3 by Qixuan.
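
To illustrate the guess about binning, here is a minimal Python sketch of how a coordinate maps to a bin on a uniform grid. It is a simplified stand-in and does not reproduce the actual LAMMPS NBinKokkos code; the helper names `raw_bin` and `safe_bin` are hypothetical, and the numbers (bin size 7, 6 bins per dimension) are only borrowed from the neighbor list info in the log, ignoring ghost bins. The point is that a huge or non-finite coordinate yields an index far outside the bin array, which on a GPU would be an out-of-bounds memory access unless it is checked.

```python
import math


def raw_bin(x, lo, binsize):
    """Unchecked 1D bin index for coordinate x (finite values only)."""
    return math.floor((x - lo) / binsize)


def safe_bin(x, lo, binsize, nbins):
    """Return a valid bin index in [0, nbins), or None if x cannot be binned.

    Hypothetical helper for illustration; not part of LAMMPS.
    """
    if not math.isfinite(x):
        return None  # NaN or inf coordinate, e.g. from a bad force
    idx = raw_bin(x, lo, binsize)
    if idx < 0 or idx >= nbins:
        return None  # index would fall outside the bin array
    return idx


if __name__ == "__main__":
    lo, binsize, nbins = 0.0, 7.0, 6  # illustrative values taken from the log
    for x in (10.0, 35.0, 1.0e12, float("nan")):
        print(x, "->", safe_bin(x, lo, binsize, nbins))
```

A well-behaved coordinate such as 10.0 lands in a valid bin, while 1.0e12 or NaN does not. An unchecked index of that kind is one way a blown-up system could surface as an illegal address in the binning kernel.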

Thank you for your suggestion. I have re-run the simulation with pair style lj/cut instead of the Allegro potential. The simulation completed successfully without any explosion or CUDA error. The full log is attached below.
It seems the issue is specific to Kokkos/GPU execution, not the pair potential itself. With LJ/cut, the simulation runs fine on CPU, but when using Kokkos GPU acceleration (even with LJ/cut), I still encounter the cudaErrorIllegalAddress error during neighbor list construction, which aligns with the guess you mentioned.

Internal error!
Mon May 18 20:11:44 CST 2026
Writing to /root/.config/pip/pip.conf
LAMMPS (11 Feb 2026)
KOKKOS mode with Kokkos version 5.0.2 is enabled
  using double precision
  using view layout = legacy
  will use up to 1 GPU(s) per node
Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false

  using 1 OpenMP thread(s) per MPI task
package kokkos
package kokkos newton on neigh half
# Authors: Anders Johansson, Marc Descoteaux

variable L index 1
variable STRUCTURE index BaTiO3-sc333-expt
variable PERIOD index 100000
variable EQUILSTEPS index 5000
variable TEMP index 300
variable MODEL index BaTiO3.nequip
variable SEED index 1
variable ITER index 1
log Logs_${ITER}/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_${ITER}_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_1_${PERIOD}
log Logs_1/log.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_1_100000

units		metal
atom_style	atomic

read_data ${STRUCTURE}.data
read_data BaTiO3-sc333-expt.data
Reading data file ...
  orthogonal box = (0 0 0) to (11.97285 11.97285 12.1056)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  135 atoms
  read_data CPU = 0.013 seconds
#replicate $L $L $L

# ========== From here on I changed it to a pure LJ potential for you ==========
pair_style lj/cut 7.0
pair_coeff  * * 1.0 1.0
# ========== The above is the officially requested test potential ==========

mass 1 137.3
mass 2 15.9994
mass 3 47.9

timestep 0.002

# ========== Everything allegro-related is commented out (must be removed) ==========
# compute polarization all allegro polarization 3
# compute polarizability all allegro polarizability 9
# compute borncharges all allegro/atom born_effective_charges 9 1

thermo_style custom pe fmax fnorm spcpu cpuremain

# ========== Electric-field commands are also commented out (not needed for the test) ==========
# variable efield equal 1e-2*1.5
# fix born all addbornforce 0.0 0.0 ${efield}

restart $(v_EQUILSTEPS) ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_${ITER}/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_${MODEL}_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_${STRUCTURE}_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_${TEMP}_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_$L_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_${ITER}_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_1_${PERIOD}.*
restart 5000 ./Restarts_1/restart.equilibrate_BaTiO3.nequip_BaTiO3-sc333-expt_300_1_1_100000.*

thermo 10
velocity all create ${TEMP} ${SEED} dist gaussian rot yes mom yes
velocity all create 300 ${SEED} dist gaussian rot yes mom yes
velocity all create 300 1 dist gaussian rot yes mom yes
fix nvt all nvt temp ${TEMP} ${TEMP} $(100*dt)
fix nvt all nvt temp 300 ${TEMP} $(100*dt)
fix nvt all nvt temp 300 300 $(100*dt)
fix nvt all nvt temp 300 300 0.2000000000000000111
run $(v_EQUILSTEPS)
run 5000

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:
- KOKKOS package: https://doi.org/10.1145/3731599.3767498
The log file lists these citations in BibTeX format.

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Generated 0 of 3 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
  update: every = 1 steps, delay = 0 steps, check = yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 9
  ghost atom cutoff = 9
  binsize = 9, bins = 2 2 2
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair lj/cut/kk, perpetual
      attributes: half, newton on, kokkos_device
      pair build: half/bin/newton/kk/device
      stencil: half/bin/3d
      bin: kk/device
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.002
Per MPI rank memory allocation (min/avg/max) = 3.364 | 3.364 | 3.364 Mbytes
    PotEng          Fmax          Fnorm          S/CPU         CPULeft    
-17.639411      0.21244164     2.0641778      0              0            
-18.36947       0.76607054     3.0519263      1087.9444      4.5875096    
-24.253907      5.0165855      10.755619      1268.7127      4.2517151    
-37.607772      24.544069      70.745811      1102.7437      4.3311029    
-44.646165      18.963171      52.868207      1863.1852      3.9072617    
.......
-682.36878      13.343836      50.745303      1975.3875      0.018191926  
-685.42901      11.28736       43.75057       1583.968       0.012128952  
-688.01914      13.523786      51.361799      1814.4052      0.0060633684 
-691.02656      11.431005      44.31348       871.75408      0            
Loop time of 3.04154 on 1 procs for 5000 steps with 135 atoms

Performance: 284.067 ns/day, 0.084 hours/ns, 1643.905 timesteps/s, 221.927 katom-step/s
90.6% CPU use with 1 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.74822    | 0.74822    | 0.74822    |   0.0 | 24.60
Neigh   | 0.024114   | 0.024114   | 0.024114   |   0.0 |  0.79
Comm    | 0.82683    | 0.82683    | 0.82683    |   0.0 | 27.18
Output  | 0.028471   | 0.028471   | 0.028471   |   0.0 |  0.94
Modify  | 0.74695    | 0.74695    | 0.74695    |   0.0 | 24.56
Other   |            | 0.667      |            |       | 21.93

Nlocal:            135 ave         135 max         135 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost:           2487 ave        2487 max        2487 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs:          12677 ave       12677 max       12677 min
Histogram: 1 0 0 0 0 0 0 0 0 0

Total # of neighbors = 12677
Ave neighs/atom = 93.903704
Neighbor list builds = 49
Dangerous builds = 0
Total wall time: 0:00:03

Due to the character limit, I have omitted the intermediate calculation steps from the log. So the bug is likely in the Kokkos GPU implementation rather than the Allegro potential.

Can you please post a *complete* reproducer for the crash with LJ/cut and KOKKOS? This includes the full input file, and any data files or initial configurations. Thank you.

I’d like to point out that your log starts with an `Internal error!` line. This is not normal.

Your quoted log does NOT show any KOKKOS neighbor list issue with lj/cut, and that is despite using very atypical LJ parameters for metal units. Typically, sigma would be 3-5 angstrom instead of 1.0, and epsilon would be much smaller than 1.0 eV (around 0.01 eV). Thus your system is likely collapsing and experiencing very strong forces due to the depth and steepness of the atom-atom potential function. See the log files in the examples/UNITS folder of the LAMMPS distribution, which demonstrate how the same Argon model looks with different choices of units.
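
To put rough numbers on this, the following Python sketch compares the 12-6 Lennard-Jones potential for the parameters used in the test (epsilon = 1.0 eV, sigma = 1.0 angstrom) with an argon-like set (epsilon = 0.0104 eV, sigma = 3.4 angstrom). The argon-like values are approximate textbook numbers assumed here for illustration, not taken from the examples/UNITS inputs, and the function names are made up for this sketch.

```python
KB_EV = 8.617333e-5  # Boltzmann constant in eV/K


def lj_energy(r, eps, sigma):
    """12-6 Lennard-Jones pair energy U(r) = 4*eps*((s/r)^12 - (s/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def lj_force(r, eps, sigma):
    """Radial force F(r) = -dU/dr; positive is repulsive, negative attractive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * sr6 * sr6 - sr6) / r


def lj_minimum(sigma):
    """Separation at which U(r) has its minimum of depth -eps."""
    return 2.0 ** (1.0 / 6.0) * sigma


if __name__ == "__main__":
    kt = KB_EV * 300.0  # thermal energy at the 300 K used in the input
    for label, eps, sigma in [
        ("test input", 1.0, 1.0),
        ("argon-like (assumed)", 0.0104, 3.4),
    ]:
        r_min = lj_minimum(sigma)
        print(f"{label}: r_min = {r_min:.3f} A, "
              f"well depth / kT(300 K) = {eps / kt:.1f}, "
              f"force at 2.0 A = {lj_force(2.0, eps, sigma):.3g} eV/A")
```

With sigma = 1.0 the potential minimum sits near 1.12 angstrom, so atoms about 2 angstrom apart are on the attractive side of a well roughly 1 eV deep and get pulled together. That is consistent with the potential energy in the quoted log dropping steadily from about -17.6 to about -691.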

So far, I see no convincing evidence of a failure of the KOKKOS neighbor list code. This would be rather unusual, since we are running a lot of tests with it regularly. So the most likely causes are:

  • some inconsistency in your geometry and parameterization/training that leads to bad forces
  • some incorrect or inconsistent invocation of the pair style you are using
  • some issue with your compilation or the libraries you are loading
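
For the last point, one thing that may be worth checking is whether the `-D_GLIBCXX_USE_CXX11_ABI=0` define in the posted CMake command matches how the installed PyTorch was built. The Python sketch below is only a suggestion under that assumption: `abi_define` and `flags_match` are hypothetical helpers, while `torch.compiled_with_cxx11_abi()` and `torch.version.cuda` are existing PyTorch attributes. A mismatch would not prove anything about this crash, but it is cheap to rule out.

```python
def abi_define(cxx11_abi: bool) -> str:
    """Compiler define that corresponds to a given libstdc++ ABI setting."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(cxx11_abi)}"


def flags_match(build_flags: str, cxx11_abi: bool) -> bool:
    """True if the whitespace-separated build flags contain the matching define."""
    return abi_define(cxx11_abi) in build_flags.split()


if __name__ == "__main__":
    # Flag string copied from the CMake command posted in this thread.
    posted_flags = "-D_GLIBCXX_USE_CXX11_ABI=0"
    try:
        import torch
    except ImportError:
        print("PyTorch is not importable in this environment")
    else:
        uses_cxx11 = torch.compiled_with_cxx11_abi()
        print("PyTorch version:", torch.__version__)
        print("CUDA version PyTorch was built with:", torch.version.cuda)
        print("define expected by this PyTorch:", abi_define(uses_cxx11))
        print("posted flags consistent:", flags_match(posted_flags, uses_cxx11))
```

Run it with the same Python interpreter that provided `torch.utils.cmake_prefix_path` for the build, so that it inspects the same PyTorch installation LAMMPS was linked against.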

So to have this investigated further, you need to get busy and provide convincing evidence that LAMMPS itself is at fault. That requires a meaningful input deck that allows us to reproduce the failure without the non-canonical pair style.