voronoi occupation calculation, segmentation fault exit (signal 11)

Hi,

I am using the latest version of LAMMPS (16 Feb 2016) to compute the occupation in irradiation simulations of tungsten, using the command

compute vacint cascade voronoi/atom occupation

While writing the output, after 36400 steps, the program crashes with the error:

APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
/home/maya/maya_lammps/lammps_debug_segfault

If I remove the occupation calculation, it works fine.

Could you suggest what went wrong?

The segfault happens when dump 5 is invoked, i.e. when the occupation of the final cascade is written after 36400 steps.

The input script is:
# LAMMPS Input File for BCC unit cell lattices

# ---------- Setup Variables ---------------------
variable latparam equal 3.1652

# ---------- Initialize Simulation ---------------------
clear
units metal
dimension 3
boundary p p p
atom_style atomic

# ---------- Create Atomistic Structure ---------------------
lattice bcc ${latparam}
region box block 0.000000 221.494 0.00 221.494 0.000000 221.494 units box
create_box 2 box
lattice bcc ${latparam} orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
create_atoms 2 box
region reg_PKA sphere 110.5 110.5 110.5 1.45 units box
#mass * 183.84

# ---------- Define Interatomic Potential ---------------------
pair_style eam/fs
pair_coeff * * W_Dudarev.Bjorkas.eam.fs W W
neighbor 2.0 bin
neigh_modify delay 10 check yes

#------------Define regions------------------
region reg_all block INF INF INF INF INF INF units box
region reg_interior block 4 217 4 217 4 INF units box
region reg_exterior block 4 217 4 217 4 INF side out units box
#-----------Define groups-------------------
group cascade region reg_interior
group thermostat region reg_exterior
group PKA region reg_PKA
# ---------- Define Settings ---------------------
compute eng all pe/atom
compute eatoms all reduce sum c_eng
compute alltemp all temp
compute 2 cascade coord/atom 2.2
compute kecascade cascade ke/atom
compute vacint cascade voronoi/atom occupation
compute volu cascade voronoi/atom

#---------------------Equilibration settings------------------
#velocity all create 300.0 34986 # initialize atom velocities to temperature, seed random num generator
fix 1 all nve
fix 2 thermostat temp/rescale 1 300.0 300.0 0.5 1.0 region reg_exterior
fix 3 cascade temp/rescale 1 300.0 300.0 0.5 1.0 region reg_interior
#fix 240 all recenter INIT INIT INIT units box

thermo 100
thermo_style custom step temp pe etotal press
run 1000 # run the equilibration phase

unfix 3 # remove the rescaling fix from the interior

dump 90 all custom 500 initial_coordinates.dat id type x y z vx vy vz
#dump 60 cascade custom 500 initial_energy.dat id type c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
run 1000 # continue to equilibrate
undump 90
#undump 60
#----------------------Define PKA settings-----------------------
# set PKA velocity to correspond to ~ 20 keV -1474.4Ang/ps
velocity PKA set 619.891696 57.999341 808.557871 units box
#set group PKA vx 10.0 units box
#set group PKA vy 13.0 units box
#set group PKA vz -329.19 units box
#---------------------Defining initial time step----------------------
# set timestep to smaller value for initial phase of collisions (.01 fs for .2 ps)
timestep 0.00001
thermo 100
#dump 1 cascade custom 100 test_init_col.dump x y z c_2 type c_3
#dump 2 PKA custom 100 PKA_traj_init_col.dump x y z c_2 type c_4
#dump 1 cascade custom 100 initial_cascade.dump id x y z c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
dump 2 PKA custom 100 initial_pka.dump id x y z vx vy vz
#dump 10 cascade xyz 100 initial_cascade.xyz

run 20000 # run the collisional phase for .2 ps

# run intermediate phase with intermediate timestep (.1 fs for 1 ps)
timestep 0.0001
#undump 1
undump 2
#dump 3 cascade custom 100 test_inter_evolve.dump x y z c_3 type c_2
#dump 4 PKA custom 100 PKA_traj_inter_evolve.dump x y z c_3 type c_2
#dump 3 cascade custom 100 inter_cascade.dump id x y z c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
dump 4 PKA custom 100 inter_pka.dump id x y z vx vy vz
#dump 30 cascade xyz 100 inter_cascade.xyz
run 10000
# after initial phase, let evolve for remainder of time using .001 ps timestep (1 fs for 10 ps)
#undump 3 # close the previous dumps
undump 4

timestep 0.001
#dump 5 cascade custom 200 test_final_evolve.dump x y z c_3 type c_2
#dump 6 PKA custom 200 PKA_traj_final_evolve.dump x y z c_3 type c_2
dump 5 cascade custom 100 final_cascade.dump id x y z c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
dump 6 PKA custom 100 final_pka.dump id x y z vx vy vz
#dump 50 cascade xyz 100 final_cascade.xyz

run 15000

Thank you

If you want an answer, you are going to have to work a little harder. 624308 atoms 36400 steps is clearly not the minimal problem size.

yeah, when sending examples with problems like this, keep in mind that
when running such an input with debugging enabled, the simulation has to be
run on a desktop machine, often with executables that run slower. nobody can
afford to run a job for a week on their desktop in order to try to
reproduce a segfault.

however, looking at your input the line:

neigh_modify delay 10 check yes

is highly suspicious. it may be acceptable for equilibration and when using a
short time step, but as soon as you set the velocity of the PKA, this can
cause problems, especially when running in parallel. ...and even more so,
as you are ramping up the time step by two orders of magnitude during the
various chunks of the run.

so pretty much the only recommendation that can be given at this point is
to try again with delay set to 0.

axel.
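
The concern about the delay setting can be made concrete with a rough estimate. With "check yes", LAMMPS only rebuilds a neighbor list once some atom has moved more than half the skin distance, and "delay 10" prevents any rebuild for 10 steps after the previous one. The following is only a sketch, assuming the PKA keeps its initial velocity from the script (an upper bound, since it slows down quickly); the function names are made up for illustration and are not part of LAMMPS.

```cpp
#include <cassert>
#include <cmath>

// Distance (in Angstrom) covered by an atom moving at constant velocity
// (components in Angstrom/ps) during nsteps timesteps of length dt (ps).
double distance_travelled(double vx, double vy, double vz,
                          double dt, int nsteps) {
    assert(dt > 0.0 && nsteps >= 0);
    const double speed = std::sqrt(vx * vx + vy * vy + vz * vz);
    return speed * dt * nsteps;
}

// True if the atom could move farther than half the neighbor skin
// during the `delay` steps in which no rebuild is allowed.
bool may_outrun_skin(double vx, double vy, double vz,
                     double dt, int delay, double skin) {
    return distance_travelled(vx, vy, vz, dt, delay) > 0.5 * skin;
}
```

For the PKA velocity (619.891696, 57.999341, 808.557871) and the 2.0 Angstrom skin used here, 10 steps at timestep 0.00001 give roughly 0.1 Angstrom, which is harmless, while 10 steps at timestep 0.0001 would already give about 1.02 Angstrom, just over half the skin. With delay 0 the check is made every step, so this window disappears.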

Dear Axel,

Thanks for the input and really sorry for sending you the input file with a large simulation cell.

However, we started to observe this problem only after 50 unit cells with voronoi_occupation call invoked.

We have tried with neigh_modify delay 0. But this also ended up in segfault (signal 11)

While debugging we found a few points, as follows:

  1. We don't find any segmentation fault if we run the input script on a single CPU.

  2. When we run the parallel version, the values start to deviate from the single CPU version after a number of steps (from about step 6400), and the run always ends in a segfault.

  3. When we do the debugging using gdb, we get the following error.

(gdb)
Core was generated by `./lmp_mpi'.
Program terminated with signal 11, Segmentation fault.
#0 LAMMPS_NS::ComputeVoronoi::checkOccupation (this=0xf6b200) at ../compute_voronoi_atom.cpp:436
436 voro[j][1] = c;

(gdb) bt
#0 LAMMPS_NS::ComputeVoronoi::checkOccupation (this=0xf6b200) at ../compute_voronoi_atom.cpp:436
#1 0x000000000051c3e2 in LAMMPS_NS::DumpCustom::count (this=0x2f21250) at ../dump_custom.cpp:417
#2 0x0000000000513898 in LAMMPS_NS::Dump::write (this=0x2f21250) at ../dump.cpp:301
#3 0x0000000000743c9f in LAMMPS_NS::Output::write (this=0xf2b4b0, ntimestep=36500) at ../output.cpp:303
#4 0x0000000000a4154c in LAMMPS_NS::Verlet::run (this=0xf2ce10, n=15000) at ../verlet.cpp:333
#5 0x0000000000a0a8c7 in LAMMPS_NS::Run::command (this=0x7fff388411e0, narg=<optimized out>, arg=0xf38ea0) at ../run.cpp:175
#6 0x00000000006b9cb5 in LAMMPS_NS::Input::command_creator<LAMMPS_NS::Run> (lmp=<optimized out>, narg=1, arg=0xf38ea0) at ../input.cpp:723
#7 0x00000000006b4493 in LAMMPS_NS::Input::execute_command (this=0xf1ee20) at ../input.cpp:706
#8 0x00000000006b5335 in LAMMPS_NS::Input::file (this=0xf1ee20) at ../input.cpp:243
#9 0x00000000006cc48a in main (argc=1, argv=0x7fff38841458) at ../main.cpp:31

However, the core dump happens only after a few thousand steps.

Is this information helpful for suggesting what went wrong?

Thank you

Maya

Sorry to post a big input script again.

# LAMMPS Input File for BCC unit cell lattices

# ---------- Setup Variables ---------------------
variable latparam equal 3.1652

# ---------- Initialize Simulation ---------------------
clear
units metal
dimension 3
boundary p p p
atom_style atomic

# ---------- Create Atomistic Structure ---------------------
lattice bcc ${latparam}
region box block 0.000000 221.494 0.00 221.494 0.000000 221.494 units box
create_box 2 box
lattice bcc ${latparam} orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
create_atoms 2 box
region reg_PKA sphere 110.5 110.5 110.5 1.45 units box
#mass * 183.84

# ---------- Define Interatomic Potential ---------------------
pair_style eam/fs
pair_coeff * * W_Dudarev.Bjorkas.eam.fs W W
neighbor 2.0 bin
neigh_modify delay 0 check yes

#------------Define regions------------------
region reg_all block INF INF INF INF INF INF units box
region reg_interior block 4 217 4 217 4 INF units box
region reg_exterior block 4 217 4 217 4 INF side out units box
#-----------Define groups-------------------
group cascade region reg_interior
group thermostat region reg_exterior
group PKA region reg_PKA
# ---------- Define Settings ---------------------
compute eng all pe/atom
compute eatoms all reduce sum c_eng
compute alltemp all temp
compute 2 cascade coord/atom 2.2
compute kecascade cascade ke/atom
compute vacint cascade voronoi/atom occupation
compute volu cascade voronoi/atom
#compute kepka PKA ke/atom
#---------------------Equilibration settings------------------
#velocity all create 300.0 34986 # initialize atom velocities to temperature, seed random num generator
fix 1 all nve
fix 2 thermostat temp/rescale 1 300.0 300.0 0.5 1.0 region reg_exterior
fix 3 cascade temp/rescale 1 300.0 300.0 0.5 1.0 region reg_interior
#fix 240 all recenter INIT INIT INIT units box

thermo 100
thermo_style custom step temp pe etotal press
run 1000 # run the equilibration phase

unfix 3 # remove the rescaling fix from the interior
dump 60 cascade custom 500 initial_energy.dat id type c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
run 1000
# continue to equilibrate
undump 60
#----------------------Define PKA settings-----------------------
velocity PKA set 619.891696 57.999341 808.557871 units box
#---------------------Defining initial time step----------------------
# set timestep to smaller value for initial phase of collisions (.01 fs for .2 ps)
timestep 0.00001
thermo 100
dump 2 PKA custom 100 initial_pka.dump id x y z vx vy vz
#dump 10 cascade xyz 100 initial_cascade.xyz

run 20000 # run the collisional phase for .2 ps

# run intermediate phase with intermediate timestep (.1 fs for 1 ps)
timestep 0.0001
undump 2
dump 4 PKA custom 100 inter_pka.dump id x y z vx vy vz
run 10000

# after initial phase, let evolve for remainder of time using .001 ps timestep (1 fs for 10 ps)
undump 4
timestep 0.001
dump 5 cascade custom 100 final_cascade.dump id x y z c_kecascade c_vacint[1] c_vacint[2] c_volu[1]
dump 6 PKA custom 100 final_pka.dump id x y z vx vy vz

run 15000

Dear Axel,
Thanks for the input and really sorry for sending you the input file with
a large simulation cell.
However, we started to observe this problem only after 50 unit cells with
voronoi_occupation call invoked.

We have tried with neigh_modify delay 0. But this also ended up in
segfault (signal 11)

While debugging we found a few points, as follows:

1. We don't find any segmentation fault if we run the input script on a
single CPU.

ahh! this is a bit of a hint.

2. When we run the parallel version, the values start to deviate from the
single CPU version after a number of steps (from about step 6400),
and the run always ends in a segfault.

this is the expected behavior.
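
One general reason why serial and parallel trajectories drift apart is that floating-point addition is not associative, so summing the same contributions in a different order (for example per MPI rank, then across ranks) can change the last bits of the result, and MD dynamics amplify such differences over time. The following is a minimal sketch of that effect only; the function names are invented for illustration and nothing here is taken from the LAMMPS source.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Sum all values left to right, as a single process would.
double sum_in_order(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// Sum two chunks separately and then add the partial sums,
// mimicking two ranks that each reduce their own share first.
// `split` must not exceed v.size().
double sum_in_two_chunks(const std::vector<double>& v, std::size_t split) {
    const auto mid = v.begin() + static_cast<std::ptrdiff_t>(split);
    const double first = std::accumulate(v.begin(), mid, 0.0);
    const double second = std::accumulate(mid, v.end(), 0.0);
    return first + second;
}
```

For the values 0.1, 0.2 and 0.3, summing in order and summing as 0.1 + (0.2 + 0.3) give results that differ in the last bit with IEEE double precision. Such differences are harmless by themselves, which is why the deviation is expected while the segfault is not.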

3. When we do the debugging using gdb, we get the following error.

(gdb)
Core was generated by `./lmp_mpi'.
Program terminated with signal 11, Segmentation fault.
#0 LAMMPS_NS::ComputeVoronoi::checkOccupation (this=0xf6b200) at ../compute_voronoi_atom.cpp:436
436 voro[j][1] = c;

(gdb) bt
#0 LAMMPS_NS::ComputeVoronoi::checkOccupation (this=0xf6b200) at ../compute_voronoi_atom.cpp:436
#1 0x000000000051c3e2 in LAMMPS_NS::DumpCustom::count (this=0x2f21250) at ../dump_custom.cpp:417
#2 0x0000000000513898 in LAMMPS_NS::Dump::write (this=0x2f21250) at ../dump.cpp:301
#3 0x0000000000743c9f in LAMMPS_NS::Output::write (this=0xf2b4b0, ntimestep=36500) at ../output.cpp:303
#4 0x0000000000a4154c in LAMMPS_NS::Verlet::run (this=0xf2ce10, n=15000) at ../verlet.cpp:333
#5 0x0000000000a0a8c7 in LAMMPS_NS::Run::command (this=0x7fff388411e0, narg=<optimized out>, arg=0xf38ea0) at ../run.cpp:175
#6 0x00000000006b9cb5 in LAMMPS_NS::Input::command_creator<LAMMPS_NS::Run> (lmp=<optimized out>, narg=1, arg=0xf38ea0) at ../input.cpp:723
#7 0x00000000006b4493 in LAMMPS_NS::Input::execute_command (this=0xf1ee20) at ../input.cpp:706
#8 0x00000000006b5335 in LAMMPS_NS::Input::file (this=0xf1ee20) at ../input.cpp:243
#9 0x00000000006cc48a in main (argc=1, argv=0x7fff38841458) at ../main.cpp:31

However, the core dump happens only after a few thousand steps.

Is this information helpful for suggesting what went wrong?

not much. it confirms that the problem is with the voronoi compute, as you
already mentioned.

i have a hunch (not so much from your report, but from having seen several
segfault problems with per atom computes before).
please try changing the code for the voronoi compute as follows and let us
know whether that resolves the issue.

diff --git a/src/VORONOI/compute_voronoi_atom.cpp b/src/VORONOI/compute_voronoi_atom.cpp
index ffdf5b7..a2d4927 100644
--- a/src/VORONOI/compute_voronoi_atom.cpp
+++ b/src/VORONOI/compute_voronoi_atom.cpp
@@ -188,8 +188,7 @@ void ComputeVoronoi::compute_peratom()
   invoked_peratom = update->ntimestep;

   // grow per atom array if necessary
-  int nlocal = atom->nlocal;
-  if (nlocal > nmax) {
+  if (atom->nmax > nmax) {
     memory->destroy(voro);
     nmax = atom->nmax;
     memory->create(voro,nmax,size_peratom_cols,"voronoi/atom:voro");
@@ -199,7 +198,7 @@ void ComputeVoronoi::compute_peratom()
   // decide between occupation or per-frame tesselation modes
   if (occupation) {
     // build cells only once
-    int i, nall = nlocal + atom->nghost;
+    int i, nall = atom->nlocal + atom->nghost;
     if (con_mono==NULL && con_poly==NULL) {
       // generate the voronoi cell network for the initial structure
       buildCells();

axel.
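
The patch changes the test that decides when the per-atom array is regrown. The toy model below is one possible reading of why that matters, not a statement about the actual LAMMPS internals: it assumes the occupation code may index entries up to nlocal + nghost (as the nall variable in the diff suggests), while the old test only compared the number of owned atoms against the buffer size. ToyAtom and ToyCompute are hypothetical stand-ins invented for this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-rank atom counters.
struct ToyAtom {
    int nlocal;  // atoms owned by this rank
    int nghost;  // ghost atoms copied from neighboring ranks
    int nmax;    // allocated length of per-atom arrays (>= nlocal + nghost)
};

// Hypothetical per-atom compute that owns a buffer of length nmax.
struct ToyCompute {
    int nmax = 0;
    std::vector<double> voro;

    // Old test: regrow only if the owned atoms no longer fit.
    void grow_old(const ToyAtom& atom) {
        assert(atom.nmax >= atom.nlocal + atom.nghost);
        if (atom.nlocal > nmax) {
            nmax = atom.nmax;
            voro.assign(nmax, 0.0);
        }
    }

    // Patched test: regrow whenever the atom arrays themselves have grown.
    void grow_new(const ToyAtom& atom) {
        assert(atom.nmax >= atom.nlocal + atom.nghost);
        if (atom.nmax > nmax) {
            nmax = atom.nmax;
            voro.assign(nmax, 0.0);
        }
    }

    // True if every index in 0 .. nlocal+nghost-1 is inside the buffer.
    bool fits_all(const ToyAtom& atom) const {
        return atom.nlocal + atom.nghost <= static_cast<int>(voro.size());
    }
};
```

In this model, a rank that starts with 900 owned and 50 ghost atoms (nmax 1000) and later holds 950 owned and 400 ghost atoms (nmax 1500) never regrows its buffer under the old test, because 950 is not greater than 1000, yet indices up to 1349 would be touched. Under the patched test the buffer follows atom.nmax and always has room.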

Thank you Axel.

It worked with the change.

Maya