Shake atoms missing

Dear LAMMPS users,

I am getting the following error from my system including 500 water molecules and 20 ions.
I checked the comments for similar issues, so I tried to follow those comments like:

1- I first tried with “kspace_style pppm 1.0e-4” but I got that error then I tried with “kspace_style ewald 1.0e-4” I again i got the same error
2- I decreased the time steps to 0.5 and 0.2 fs which with 0.2 fs I did not get that error but I want to run with 1 or 2 fs.

I also ran it without shake but i was getting other errors

I am Using Nvidia container for lammps and tried running using 1 or 4 GPUs.
I got the same error from lammps on my desktop (LAMMPS (3 Mar 2020)) with both single and multiple cpu cores

I appreciate your comments.
Sadra

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SHAKE stats (type/ave/delta/count) on step 710
2 0.957199 1.60258e-05 1500
3 0.969997 1.12358e-05 10
2 104.520 0.00149062 500
ERROR on proc 0: Shake atoms missing on proc 0 at step 719 (src/KOKKOS/fix_shake_kokkos.cpp:288)

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here is my code

========================================================================

units real
dimension 3
boundary p p f
atom_style full

##Set Potential Parameters_______

    ##==============Force Field====================

pair_style lj/cut/coul/long 10
bond_style harmonic
angle_style harmonic
#kspace_style pppm 1.0e-4 # 1.0e-6
kspace_style ewald 1.0e-4

kspace_modify slab 3.0

read_data lammps-ion-wat-500.dat
#read_data 10_ions-140k_solv.dat
##============== Set Charges =======================

set type 1 charge 0.33
set type 2 charge 0.415
set type 3 charge 0.32
set type 4 charge -0.32
set type 5 charge -0.830
set type 6 charge -1.32

    ##============== Bond Coefficients ======================

bond_coeff 1 403.0 1.040
bond_coeff 2 450.0 0.9572
bond_coeff 3 545.0 0.9700

     ##============= Angle Coefficients ==================

angle_coeff 1 44.0 109.50
angle_coeff 2 55.0 104.52

      ##============ Pair Coefficients ====================

pair_coeff 1 1 0.04600 0.40001
pair_coeff 2 2 0.00000 0.0
pair_coeff 3 3 0.04600 0.40001
pair_coeff 4 4 0.20000 3.29633
pair_coeff 5 5 0.10200 3.188
pair_coeff 6 6 0.12000 3.02910
pair_coeff 2 5 0.00000 0.0

group wat type 2 5
group hyox type 3 6
group amon type 1 4

neighbor 2.0 bin
#neigh_modify delay 0 every 1 check yes
neigh_modify every 1 delay 5 check yes

thermo_style multi
thermo 1000
minimize 1.0e-4 1.0e-6 1000 1000

dump myatom all custom 1000 atom.dat id
dump mydmp all atom 1000 dump.lammpstrj

dump dcd_1 all dcd 1000 wat-ion.dcd

###run
run_style verlet
timestep 1

velocity all create 300.0 4928454 rot yes dist gaussian

fix 0 all nvt temp 300.0 300.0 100

fix 1 wat shake 1.0e-4 20 10 b 2 a 2

run 1000

========================================================================

Please visualize your system and check how far atoms move toward the box boundaries in z-direction.

Also it would be helpful to see the thermodynamic output and particularly any warnings.

What “other errors” did you get when running without shake (and a short timestep)?

Thank you for your response,

I included the thermodynamic data of different runs.
As you mentioned, water molecules are shifting toward the z-direction. I extended the water box 5 A in x-y and 10 A in the z-direction, it worked for 1000 steps with timestep=1 fs. But when I extended the simulation to 10000 steps again this error happened. Visualization of the data shows that when the water molecules reach the boundaries of the z-direction the simulation crashes. Even with the 0.2 fs for longer simulation, it crashes due to the mentioned reason

change
-30.138500 29.861500 xlo xhi
-30.038500 29.961500 ylo yhi
-75.166000 74.834000 zlo zhi
To:
-32.638500 32.361500 xlo xhi
-32.538500 32.461500 ylo yhi
-80.166000 79.834000 zlo zhi

===================================================================

1 fs with shake, minimization_____

WARNING: Fixes cannot yet send exchange data in Kokkos communication, switching to classic exchange/border communication (src/KOKKOS/comm_kokkos.cpp:580)
Per MPI rank memory allocation (min/avg/max) = 28.30 | 46.15 | 98.98 Mbytes
---------------- Step 0 ----- CPU = 0.0000 (sec) ----------------
TotEng = 705.5311 KinEng = 0.0000 Temp = 0.0000
PotEng = 705.5311 E_bond = 884.9525 E_angle = 24.9092
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 40.3744
E_coul = 29615.8912 E_long = -29860.5962 Press = -112.1410
---------------- Step 857 ----- CPU = 7.9366 (sec) ----------------
TotEng = -4042.8881 KinEng = 0.0000 Temp = 0.0000
PotEng = -4042.8881 E_bond = 202.3410 E_angle = 83.6056
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1031.4593
E_coul = 25267.4316 E_long = -30627.7255 Press = -24.4109
Loop time of 7.93665 on 4 procs for 857 steps with 1570 atoms

99.9% CPU use with 4 MPI tasks x 1 OpenMP threads

Minimization stats:
Stopping criterion = energy tolerance
Energy initial, next-to-last, final =
705.531082951009 -4042.52821855711 -4042.88809876678
Force two-norm initial, final = 1898.9752 23.968992
Force max component initial, final = 82.666300 5.3232472
Final line search alpha, max atom move = 0.0084247062 0.044846794
Iterations, force evaluations = 857 1614

========================================================================

========================================================================
error with no SHAKE, 1fs____

Setting up Verlet run …
Unit style : real
Current step : 1000
Time step : 1
WARNING: Inconsistent image flags (src/domain.cpp:812)
Per MPI rank memory allocation (min/avg/max) = 30.35 | 48.19 | 101.0 Mbytes
---------------- Step 1000 ----- CPU = 0.0000 (sec) ----------------
TotEng = -2848.6930 KinEng = 1403.0673 Temp = 300.0000
PotEng = -4251.7603 E_bond = 209.8851 E_angle = 86.6630
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1067.2554
E_coul = 25052.9284 E_long = -30668.4922 Press = 83.4153
ERROR on proc 3: Bond atoms missing on proc 3 at step 1499 (src/KOKKOS/neigh_bond_kokkos.cpp:294)

MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

===================================================================

NO SHAKE timestep 0.2 it complete the simulation with this warning____

Setting up Verlet run …
Unit style : real
Current step : 819
Time step : 0.2
WARNING: Inconsistent image flags (src/domain.cpp:812)
Per MPI rank memory allocation (min/avg/max) = 28.28 | 46.12 | 98.97 Mbytes
---------------- Step 819 ----- CPU = 0.0000 (sec) ----------------
TotEng = -2581.0057 KinEng = 1403.0673 Temp = 300.0000
PotEng = -3984.0730 E_bond = 200.7534 E_angle = 83.2573
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1010.9618
E_coul = 25340.4312 E_long = -30619.4767 Press = 89.4223
---------------- Step 1000 ----- CPU = 0.8658 (sec) ----------------
TotEng = -2549.9063 KinEng = 948.8284 Temp = 202.8759
PotEng = -3498.7347 E_bond = 342.8834 E_angle = 118.1140
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 931.3792
E_coul = 25704.1452 E_long = -30595.2566 Press = 44.7718
---------------- Step 1819 ----- CPU = 4.7551 (sec) ----------------
TotEng = -2262.8271 KinEng = 1152.4224 Temp = 246.4078
PotEng = -3415.2495 E_bond = 425.8192 E_angle = 161.6104
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 874.2687
E_coul = 25706.4139 E_long = -30583.3617 Press = -22.8654
Loop time of 4.75518 on 4 procs for 1000 steps with 1570 atoms

Performance: 3.634 ns/day, 6.604 hours/ns, 210.297 timesteps/s
99.8% CPU use with 4 MPI tasks x 1 OpenMP threads

SHAKE, and timestep 0.2______

Setting up cg/kk style minimization …
Unit style : real
Current step : 0
WARNING: Fixes cannot yet send exchange data in Kokkos communication, switching to classic exchange/border communication (src/KOKKOS/comm_kokkos.cpp:580)
Per MPI rank memory allocation (min/avg/max) = 28.30 | 46.15 | 98.98 Mbytes
---------------- Step 0 ----- CPU = 0.0000 (sec) ----------------
TotEng = 705.5311 KinEng = 0.0000 Temp = 0.0000
PotEng = 705.5311 E_bond = 884.9525 E_angle = 24.9092
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 40.3744
E_coul = 29615.8912 E_long = -29860.5962 Press = -112.1410
---------------- Step 858 ----- CPU = 7.9359 (sec) ----------------
TotEng = -4067.9198 KinEng = 0.0000 Temp = 0.0000
PotEng = -4067.9198 E_bond = 202.6510 E_angle = 86.2966
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1033.4122
E_coul = 25241.6939 E_long = -30631.9734 Press = -27.8429
Loop time of 7.93596 on 4 procs for 858 steps with 1570 atoms

99.8% CPU use with 4 MPI tasks x 1 OpenMP threads

.
.
.
.
.
.

SHAKE stats (type/ave/delta/count) on step 1850
2 0.957200 1.22435e-08 1500
2 104.520 1.43463e-06 500
---------------- Step 1858 ----- CPU = 5.3603 (sec) ----------------
TotEng = -2650.9654 KinEng = 903.3774 Temp = 283.5027
PotEng = -3554.3427 E_bond = 25.1198 E_angle = 12.6082
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 738.5605
E_coul = 26269.0451 E_long = -30599.6764 Press = -1.1541
Loop time of 5.36041 on 4 procs for 1000 steps with 1570 atoms

Performance: 3.224 ns/day, 7.445 hours/ns, 186.553 timesteps/s
99.7% CPU use with 4 MPI tasks x 1 OpenMP threads

=====================================================================

1fs, extended box, running for 2600 steps and then crash___

Current step : 0
WARNING: Fixes cannot yet send exchange data in Kokkos communication, switching to classic exchange/border communication (src/KOKKOS/comm_kokkos.cpp:580)
Per MPI rank memory allocation (min/avg/max) = 34.02 | 55.35 | 118.9 Mbytes
---------------- Step 0 ----- CPU = 0.0000 (sec) ----------------
TotEng = 729.3044 KinEng = 0.0000 Temp = 0.0000
PotEng = 729.3044 E_bond = 884.9525 E_angle = 24.9092
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 40.8265
E_coul = 29542.6724 E_long = -29764.0562 Press = -88.7078
---------------- Step 1000 ----- CPU = 9.9260 (sec) ----------------
TotEng = -4076.4575 KinEng = 0.0000 Temp = 0.0000
PotEng = -4076.4575 E_bond = 203.7613 E_angle = 80.4202
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1036.7877
E_coul = 25159.6063 E_long = -30557.0330 Press = -21.3162
Loop time of 9.92609 on 4 procs for 1000 steps with 1570 atoms

99.7% CPU use with 4 MPI tasks x 1 OpenMP threads

Minimization stats:
Stopping criterion = max iterations
Energy initial, next-to-last, final =
729.304387657936 -4075.48899065154 -4076.45753164012
Force two-norm initial, final = 1898.4694 44.319482
Force max component initial, final = 82.853830 13.469429
Final line search alpha, max atom move = 0.0066938231 0.090161972
Iterations, force evaluations = 1000 1899

.
.
.
SHAKE stats (type/ave/delta/count) on step 1000
2 0.968125 0.123735 1500
2 102.266 11.8059 500
Per MPI rank memory allocation (min/avg/max) = 36.12 | 57.45 | 121.0 Mbytes
---------------- Step 1000 ----- CPU = 0.0000 (sec) ----------------
TotEng = -2934.1453 KinEng = 1403.0673 Temp = 440.3181
PotEng = -4337.2126 E_bond = 16.0881 E_angle = 7.3384
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 1036.7877
E_coul = 25159.6063 E_long = -30557.0330 Press = 80.8368

.
.
.

---------------- Step 2000 ----- CPU = 5.7279 (sec) ----------------
TotEng = -2757.8284 KinEng = 965.5110 Temp = 303.0018
PotEng = -3723.3393 E_bond = 32.1849 E_angle = 15.7230
E_dihed = 0.0000 E_impro = 0.0000 E_vdwl = 761.2240
E_coul = 26085.0541 E_long = -30617.5253 Press = 10.9277
SHAKE stats (type/ave/delta/count) on step 2010

Please note the following that you need to keep in mind when using a long-range coulomb solver (pppm or ewald) with the slab option:

  • The calculation is still done for a periodic system, only the box is enlarged by the given factor. So if you initially have a box length of 150 angstrom in z-direction it will be expanded to 450 angstrom.
  • The KSpace calculation will then be performed for a regular periodic system but followed by a “correction” term that is based on solving the Poisson equation for the residual dipole moment in z-direction.
  • For the slab correction to be sufficiently accurate, NO charged atoms must leave the original 150 angstrom sized region of the simulation box. From your description that is happening and thus a problem
  • Another potential problem may result from the fact that you may have bad initial coordinates (or rather incorrect image flags). For the z-direction coordinates all image flags must be zero or else the scaling of the box dimensions will stretch the molecules in an unphysical manner and also cause unexpectedly large forces or other problems (e.g. shake not converging).

Now there are three steps that you could do:

  1. You should “sanitize” your initial coordinates. I.e. write them out as unwrapped coordinates, wrap them back into the principal simulation cell on a per-molecule basis (not per-atom), so that you you have no non-zero image flags overall or only a value of +/-1 for a few coordinates that are part of molecules with bonds across periodic boundaries in x-, and y-direction only.
  2. Then you may consider starting your calculation with a (short) minimization to avoid any significant pockets of potential energy. That is particularly important, if you have a system that was assembled from pieces that were individually equilibrated as periodic systems
  3. Add soft repulsive wall potentials at suitable locations in z-direction. You may need to expand the box in z-directions for that so that the walls are not in direct contact with any surfaces.
1 Like

thank you,
I will follow your instructions

thank you Axel,
the problem is solved by adding this line to my code.

fix zwall all wall/reflect zlo EDGE zhi EDGE

You have to be careful with using EDGE for this.

This will probably only work correctly when you specify the fix command to add the wall before the kspace_modify command. Otherwise you will still have atoms go into the “no-go zone” for the Poisson solver.

Please note that LAMMPS cannot check if you are running an unphysical model. It can only detect syntax error (and the syntax is most certainly correct). For as long as the numbers do not diverge, LAMMPS will keep continuing, but it may produce bogus results.

Thank you for your detailed explanation.