Kokkos cudaErrorIllegalAddress and DualView crash with fix wall/gran/kk

Hello LAMMPS community,

I was setting up an initial settling phase and used create_atoms random to pack 6 million particles into a 2 mm slice right up against the xhi granular wall.

At step 0, the simulation crashed immediately with two errors:

cudaErrorIllegalAddress: an illegal memory access was encountered

Kokkos::DualView::modify_device ERROR: Concurrent modification of host and device views in DualView "wall/gran/kk:history_one_mirror_mirror"

Interestingly, changing the boundary to fix wall/reflect/kk allowed the simulation to pass Step 0 without crashing.

The input is included below. Can anybody reproduce the same error or spot a potential bug in the wall/gran/kk source?

# ---------------------------------------------------------
# SCRIPT 1: SETTLING PHASE (ULTRA-DENSE RANDOM FILL)
# ---------------------------------------------------------

units si
dimension 3
atom_style sphere
boundary f f f
newton off

# ---- 1. KOKKOS SETUP ----

package kokkos neigh half comm device

comm_modify vel yes

# ---- 2. DOMAIN ----

variable xlo equal 0
variable xhi equal 0.140
variable ylo equal -0.40
variable yhi equal 0.40
variable zlo equal -0.40
variable zhi equal 0.40
region domain block ${xlo} ${xhi} ${ylo} ${yhi} ${zlo} ${zhi}
create_box 1 domain

# ---- 3. PARTICLES & DENSE FILLING ----

variable dp equal 0.0005
variable rho_p equal 2500
variable bed_start equal 0.137
variable thickness equal 0.002
variable bed_end equal ${bed_start}+${thickness}

region bed block ${bed_start} ${bed_end} -0.40 0.40 -0.40 0.40

create_atoms 1 random 6000000 482748 bed

set type 1 diameter ${dp}
set type 1 density ${rho_p}

# ---- 4. INTERACTION SETUP ----

pair_style gran/hooke/history 25000.0 7142.9 179411.2 89705.6 0.70 1
pair_coeff * *

# ---- WALLS ----

#fix wall_reflect all wall/reflect/kk xhi 0.140 units box

# Correct Syntax: [ID] [group] [style] [fstyle] [params...] [wallstyle] [lo] [hi]

fix wall_at_hi all wall/gran/kk hooke/history 25000.0 7142.9 122297.0 61148.5 0.70 1 xplane NULL 0.140

# ---- 5. CLOSEST POSSIBLE OVERLAP CHECK ----

# Deleting at 0.999*dp allows particles to be nearly touching

#variable overlap_dist equal ${dp}*1
#delete_atoms overlap ${overlap_dist} all all

fix int all nve/sphere/kk

fix g all gravity/kk 10 vector 1 0 0

# High damping to absorb energy from the dense initial state

fix damp all viscous 0.5

# ---- 7. SETTINGS ----

neighbor 0.0001 bin
neigh_modify delay 0 every 1 check yes one 1000

compute fmax all reduce max fx fy fz
thermo 1000
thermo_style custom step atoms dt time c_fmax[*] cpu
thermo_modify lost ignore flush yes

dump vtk1 all vtk 5000 dump/particles_settle*.vtp id type vx vy vz fx fy fz diameter

# ---- 9. FINAL STABILIZATION ----

timestep 2.0e-7
run 5000

# ---- 10. SAVE RESTART ----

print "Writing Restart File..."

write_restart restart/restart.settled


There are multiple problems with your post that make it next to impossible to debug this:

  • Your quoted input file is unreadable since you didn’t quote it correctly using triple backticks (```). See the post with guidelines and suggestions for posting here for more information on this.
  • You don’t provide the exact command line you are using, the LAMMPS version you are running, or the platform you are running on.
  • You don’t seem to be using the -sf suffix flag, so it appears that your pair style is not KOKKOS accelerated, since it lacks the /kk suffix.
  • Nobody can easily debug a system with 6 million atoms. Please try to find the minimum system size that reproduces this issue.
  • You should check whether the same input runs without KOKKOS acceleration.
  • You are using thermo_modify lost ignore that can hide a lot of bad things. Are you really expecting particles to leave the simulation?
  • Since you are creating atoms at random locations, you should remove any close contacts. The create_atoms command has an option for that, or you can use the delete_atoms command.
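
Several of the points above can be combined into a smaller test input. The fragment below is only a sketch: the 10x lateral shrink, the particle count, the region name bed_small, and the file name in.settle are illustrative assumptions, not values from the original post, and the launch line assumes a single GPU.

```
# Sketch only: region size, particle count, and names are illustrative.
# Assumed launch line (one GPU; "in.settle" is a placeholder file name):
#   lmp -k on g 1 -sf kk -in in.settle

variable dp equal 0.0005

# Same 2 mm slab as the original bed, but 0.08 m x 0.08 m laterally
# instead of 0.8 m x 0.8 m
region bed_small block 0.137 0.139 -0.04 0.04 -0.04 0.04

# 100x smaller area with 100x fewer particles keeps the number density unchanged
create_atoms 1 random 60000 482748 bed_small
set type 1 diameter ${dp}
set type 1 density 2500

# ... pair_style and pair_coeff as in the original script go here;
# delete_atoms overlap needs a pair style to build its neighbor list ...

# Remove close contacts left by the random placement
# (these are the lines that are commented out in the original script)
delete_atoms overlap ${dp} all all

# Stop with an error on lost atoms (the default) instead of ignoring them
thermo_modify lost error flush yes
```

If the crash survives in a system of this size, it becomes far easier for others to reproduce; if it disappears once the close contacts are removed, that points at the initial overlaps rather than at the wall fix.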

Unless you address all of these points, it is unlikely that somebody will have a closer look. It would just be too much work, and this is not a commercial support facility: people here are volunteering their time. So you should make it as easy as possible to help you.

As a simple starting step, you may want to run your particle creation, save the system into a data file, and start your run from the data file to see if there is a bug in atom creation or in the actual pair style itself.
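
A two-stage split along these lines might look like the sketch below. The file names in.create, in.run, and data.bed are placeholders, the numeric values are copied from the posted script, and running the first stage without Kokkos is an assumption made here to isolate atom creation from the accelerated styles.

```
# ---- Stage 1 (placeholder name: in.create), run without Kokkos ----
# Only builds the particle bed and saves it.
units si
dimension 3
atom_style sphere
boundary f f f
region domain block 0 0.140 -0.40 0.40 -0.40 0.40
create_box 1 domain
region bed block 0.137 0.139 -0.40 0.40 -0.40 0.40
create_atoms 1 random 6000000 482748 bed
set type 1 diameter 0.0005
set type 1 density 2500
write_data data.bed

# ---- Stage 2 (placeholder name: in.run), separate input file ----
# Starts from the saved data file, so atom creation is out of the picture.
units si
dimension 3
atom_style sphere
boundary f f f
newton off
package kokkos neigh half comm device
comm_modify vel yes
read_data data.bed
pair_style gran/hooke/history 25000.0 7142.9 179411.2 89705.6 0.70 1
pair_coeff * *
fix wall_at_hi all wall/gran/kk hooke/history 25000.0 7142.9 122297.0 61148.5 0.70 1 xplane NULL 0.140
fix int all nve/sphere/kk
neighbor 0.0001 bin
neigh_modify delay 0 every 1 check yes one 1000
timestep 2.0e-7
run 0
```

The run 0 at the end is enough to exercise the step-0 setup where the crash was reported. The same data file can then be reused for every variant of stage 2.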

Additionally, if the non-Kokkos version of the fix works, that is also useful information.
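
For that comparison, the accelerated styles in the posted script would be swapped for their plain counterparts, roughly as sketched here. This assumes the run is launched without -k on and -sf kk and that the package kokkos line is removed; the parameters are copied unchanged from the original post.

```
# Non-Kokkos counterparts of the /kk styles used in the original script
fix wall_at_hi all wall/gran hooke/history 25000.0 7142.9 122297.0 61148.5 0.70 1 xplane NULL 0.140
fix int all nve/sphere
fix g all gravity 10 vector 1 0 0
```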

In general, doing a few more similar steps by yourself (decomposing the overall error into little testable pieces so you can start ruling out various parts of the input script as sources of error) will not only help us, the forum volunteers. It will also help you, as (1) you will not have to wait for us to make progress; (2) you will learn and understand LAMMPS better at a fundamental level.
