Problems with binary restart files and PPPM

Hello lammps-users,

In my recent simulations of liquid alkanes (modified OPLS-AA force field, PPPM kspace solver, rRESPA, and a Nose-Hoover thermostat), I think I stumbled upon a possible bug in LAMMPS. I first became suspicious when 3 out of 20 simulations (4 different molecules, each at 5 different pressures) terminated with Error: Out of range atoms - cannot compute PPPM (…/pppm.cpp:1905). All of these simulations were run with the latest stable release of LAMMPS (1 Feb 2014). Each system was equilibrated for more than 3.2 ns and checked for energy conservation with 1 ns NVE runs without any trouble (the neighbor lists were built with zero dangerous builds).

When I started my production runs from binary restart files of these equilibrated systems (written with write_restart), the PPPM error above occurred after a few days, at seemingly random timesteps, in 3 systems with different molecules and different pressures. Since I had also written periodic restart files, both with “restart N file-*.out” and with “restart N a.out b.out”, I tried to restart each system at a timestep close to where the error occurred. This made “read_restart” fail with Error: Did not assign all atoms correctly (…/read_restart.cpp:473). The number of atoms printed after “read_restart” is less than in the original run. This error occurred with all of the restart files created with the “restart” command.

I managed to replicate this error with the attached input script (in.create-restarts) and data file (i-hd.data) using the current LAMMPS version (6 Mar 2014). I am aware that this system is small and far from equilibrated; it is only meant to replicate the error. One of the two generated restart files can then be read by the input script (in.run-restart) to show the error.
A further interesting detail is that a dump or write_data command issued at the same timestep as the “restart” command finds all atoms strictly within the simulation boundaries, and a restart from these files works without any problems.
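
For illustration, the comparison between the two output paths can be set up roughly as in the following sketch. The file names, the dump ID, and the 1000-step interval are placeholders chosen for this example and are not copied from the attached scripts.

```
# sketch only: file names, dump ID and the 1000-step interval are placeholders
restart 1000 check-*.res                           # periodic binary restart files
dump 1 all custom 1000 check.dump id type x y z    # coordinates at the same timesteps
run 1000
write_data check-1000.data                         # data file written at the same timestep
```

Reading check-1000.data back with read_data, instead of reading the corresponding restart file with read_restart, is the path that works for me.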

I am now wondering whether the incorrect restart files and the PPPM error could have a common cause: “hiding” atoms. So far I have not been able to reproduce the PPPM error in a short run, but I am still working on it.
If necessary, I can provide restart files and input scripts for trying to reproduce the PPPM error. I cannot currently provide log files for this error, since they contain a lot of thermodynamic output and would be too large.

Thank you for your help!

Best regards,
Thomas

Simulation details:

LAMMPS was built with standard packages only.

Example input script header:

units metal
atom_style full
boundary p p p

pair_style lj/cut/coul/long 13.0
bond_style harmonic
angle_style harmonic
dihedral_style opls
kspace_style pppm 1.0e-5

#control variables

variable T equal 473.15 # temperature
variable dt equal 0.004 # timestep

read_restart damaged-1000.res
#read_restart working.res

reset_timestep 0

special_bonds lj/coul 0.0 0.0 0.5
pair_modify shift no mix geometric tail yes

neighbor 2.5 bin
neigh_modify delay 3

# time step and rRESPA

timestep ${dt}
run_style respa 4 2 2 2 inner 2 4.5 6.0 middle 3 8.0 10. outer 4

fix 1 all nvt temp $T $T 0.1 tchain 4

run 800000

in.create-restarts (1.8 KB)

i-hd.data (7.2 KB)

create-restarts.out (13 KB)

in.run-restart (1.82 KB)

run-restart.out (2.36 KB)