Strange Behavior: Non-numeric pressure error that depends on length of input script?

asgross · December 20, 2021, 11:09pm

All,

Relatively new Lammps user. I’ve been running some test cases on my system to check that my input parameters give me a well-behaved system. I’ve run into a very odd issue that I can’t understand.

Background
System is a set of 433 benzene molecules with the Amber gaff2 forcefield. I’ve used this same initial configuration & gaff2 benzene force field previously without any simulation errors. I am now running the following test with it to check pppm and rattle parameters (2 fs time step), all from within the same input script:

Minimize
Heat to 293K using fix nve and velocity scaling
3 sequential 100 ps NVE runs (fix NVE)
3 sequential 100 ps NVT runs (fix NVT)
3 sequential 100 ps NPT runs (fix NPT)

Whenever I switch to a different time integration scheme (from heating to NVE, from NVE to NVT, or from NVT to NPT) I remove all active fixes (via unfix), and then start the new fixes I need for the next simulation type. However, I do not remove-then-start fixes if the time step integration is unchanged between blocks (e.g. end of 1st NVE block to beginning of second NVE block)

The above runs fine. The problem is as follows

Strange Behavior
If, instead of running 3 sequential 100 ps NPT runs, I run 4, the simulation crashes upon starting the first NPT block, with an error of “ERROR: Non-numeric pressure - simulation unstable (src/fix_nh.cpp:1069)”. Again, this occurs at the onset of the first NPT block out of four. However, if I only have 3 NPT blocks in the input script, Lammps runs without issue and the simulation energies are completely normal.

So what I don’t understand is why Lammps fails at the onset of the first NPT block in the 4-NPT block simulation, when it has no issue with it (or any of the NPT blocks) for the 3-NPT block simulation.

Also, my understanding is that lammps reads and processes/executes input scripts on a line-by-line basis, as opposed to reading and processing an entire input script before executing), so why would an additional run statement line cause an earlier run statement line to behave differently (and fail, in this case)? Is this some strange floating point math issue?

Other Notes
This issue only occurs for NPT runs when Rattle is enabled. Without rattle, or for constant-volume simulations, I don’t see the issue. Regarding rattle, its occurred for two different rattle tolerances that I’ve tried, 1E-8 and 1E-4. For the simulation that runs successfully (the 3-NPT block), system side length increases by ~0.5% when going from the NVT blocks to the NPT blocks, from 40 angstrom to 40.2 Angstrom (with 0.1 angstrom length fluctuations during the NPT runs)

Lammps Input Script that Fails:

units real
atom_style full
boundary p p p

#pair_style lj/charmm/coul/long 8.0 12.0
pair_style lj/cut/coul/long 10.0
pair_modify mix arithmetic
dielectric 1.0
special_bonds amber
bond_style harmonic
angle_style harmonic
dihedral_style charmm
improper_style cvff

read_data benzene.box.data

kspace_style pppm 1E-6
#kspace_modify gewald 0.2935 mesh 40 40 40
group mobile union all
thermo_style custom elapsed elaplong etotal ke pe emol ebond eangle edihed eimp evdwl ecoul elong temp press vol lx ly lz etail ecouple econserve
thermo_modify line multi
thermo 250
thermo_modify flush yes

#Minimization

min_style sd
minimize 0.0001 0.000001 3000 30000
min_style cg
minimize 0.00001 0.000001 7000 70000
write_restart min.pppm.rst

#Heat

reset_timestep 0
timestep 2

fix 2 mobile nve
run 0

velocity mobile create 10 1625586901 mom yes rot yes dist gaussian
fix 1 mobile temp/rescale 500 10 293 1 1
fix rattle all rattle 1E-8 200 50 t 2
#dump 2 all dcd 250 heat.dcd
run 14500
write_restart heat.rst
#undump 2

#NVE 1

unfix 2
unfix 1
unfix rattle
#reset_timestep 0

fix 1 mobile nve
fix rattle all rattle 1E-8 200 50 t 2
#dump 2 all dcd 500 nve.1.dcd
run 50000
write_restart nve.1.rst
#undump 2

#NVE 2
#reset_timestep 0

#dump 2 all dcd 500 nve.2.dcd
run 50000
write_restart nve.2.rst
#undump 2

#NVE 3
#reset_timestep 0

#dump 2 all dcd 500 nve.3.dcd
run 50000
write_restart nve.3.rst
#undump 2

#NVT 1
unfix 1
unfix rattle
#reset_timestep 0

fix 1 mobile nvt temp 293 293 $(100.0*dt)
fix rattle all rattle 1E-8 200 50 t 2
#dump 2 all dcd 500 nvt.1.dcd
run 50000
write_restart nvt.1.rst
#undump 2

#NVT 2
#reset_timestep 0

#dump 2 all dcd 500 nvt.2.dcd
run 50000
write_restart nvt.2.rst
#undump 2

#NVT 3
#reset_timestep 0

#dump 2 all dcd 500 nvt.3.dcd
run 50000
write_restart nvt.3.rst
#undump 2

#NPT 1
unfix 1
unfix rattle
#reset_timestep 0

fix 1 mobile npt temp 293 293 $(100.0*dt) iso 1.000000 1.000000 $(1000.0*dt)
fix rattle all rattle 1E-8 200 50 t 2
#dump 2 all dcd 500 npt.1.dcd
run 50000 <— As is, with 4 NPT blocks, this fails
write_restart npt.1.rst
#undump 2

#NPT 2
#reset_timestep 0

#dump 2 all dcd 500 npt.2.dcd
run 50000
write_restart npt.2.rst
#undump 2

#NPT 3
#reset_timestep 0

#dump 2 all dcd 500 npt.3.dcd
run 50000
write_restart npt.3.rst
#undump 2

#NPT 4 <—Removing this line and everything after causes the script to run without issue
#reset_timestep 0

#dump 2 all dcd 500 npt.4.dcd
run 50000
write_restart npt.4.rst
#undump 2

akohlmey · December 21, 2021, 12:55pm

It is difficult to come up with any possible explanation or bugfix on this kind of report.
There are multiple possible explanations: it can be that your system is marginal or there could be a bug somewhere. However, removing a commented out line cannot cause it, there must be something else happening. LAMMPS reads and processes input files line by line, so any change in a later part does not affect it.

There are multiple things that you can do to help tracking this issue down.

try changing the random seed for the initial velocity
try changing the number of timesteps of individual runs
try using fix shake instead of fix rattle
try running the individual sections (minimization, heating/nve, npt) as separate inputs with restarting

Ultimately, you need to figure out the smallest possible system and input file that runs the fastest on a few as possible processors that can reliably reproduce this issue and then this can be analyzed with the usual debug toolchains.

P.S.: you heating with fix temp/rescale is a really bad approach. Temperature rescaling like this does not promote equipartitioning but works against it and thus is counterproductive for equilibration. You are better off using a dissipative thermostat like one of the many langevin like thermostat variants in LAMMPS (langevin, temp/csld, gld, gle).

asgross · December 21, 2021, 2:01pm

Thanks Axel

Running every section from its own input script was my next step actually. As well as using the restart file from the last NVT block to initiate the NPT dynamics.

I realized I forgot to mention that this was occurred using the 29 Sep 2020 version of Lammps if it matters. I also have 29 Oct 2020 build readily available, so I may try running it on that to see if it makes a difference.

As another test, I reran the script, but with 8-100 ps NPT blocks instead of 3 or 4, and that run without issue. But I also attempted 20-100 ps NPT blocks, and that crashed at the first NPT block as well with the same error.

At this point, solving this issue is somewhat of a side issue for me, but it would be nice to resolve. I’ll add updates as I get them.

And thanks for the tip on heating. I agree, velocity rescaling is a very brute force way to heat the system. For my real simulations, I only using it to get the system to approximately the right temperature, after which I equilibrate using NVT (and/or NPT) for an appropriate time based on examining the system properties. I’ll look into those alternative methods.