[lammps-users] NaNs in position and velocity, exit with segmentation fault

Hello,

I am running a very simple simulation where I'm compressing a cube of metal via the fix npt and fix strain commands (see attached input script). I vary the strain rate down to 10^6 s^-1 and the temperature between 50 K and 1200 K. Most simulations run fine, but when I lower the strain rate and/or temperature, they crash and exit with a segmentation fault. I found out the problem comes from NaN values in the position of atoms and their velocities. I checked my restart file (which comes from energy minimization and is the same used for the simulations that are running), and it is not corrupted. I also found out that reducing the timestep will usually fix the problem, although I am trying to run with a reasonable timestep (1 to 5 fs).

Is there a known bug in LAMMPS (I suspect from the fix npt algorithm) that would cause this problem?

Thank you for your help.

Virginie Dupont

compression.in (884 Bytes)

Fix npt recently changed, so are you using the most current version
of LAMMPS? 1 fmsec is a fine timestep for EAM, but 5 fmsec is
too big, so I wouldn't use that. In general, if you get NaNs, it
means you are blowing atoms out of the box, so I would monitor
the thermodynamics (T,Press,Vol) on a fine timescale and also
viz the system near when it blows up and try to see what is going
wrong. If you continue to have problems, please post more details.

Aidan has done a variety of these kinds of simulations, so he may
have additional comments.

Steve

The old NPT should handle this just fine, if you do it right. Nose-Hoover
barostats in general are prone to blowing up if you abuse them. Typical
causes are:

1. too large a timestep
2. too small a damping parameter
3. unequilibrated initial state
[best to start each NPT MD simulation with particles that are already
equilibrated (postions and velocities) to the initial target T and P]
4. very small sample sizes (large fluctuations)
5. doing other stuff at the same time, like straining the box very fast

So, the bottom line is, find a simulation that works by choosing very
conservative values of the simulation parameters that limit the
above-mentioned problems, and then see how far you can push the parameters
before things fail.

Increasing the drag factor may also help.

Finally, if the simulation is working properly, the results should not be
sensitive to the choice of simulation parameters (except strain rate).

Aidan

Steve, Aidan,

Let me clarify some points.

- I tried both with an older version (21Nov09) and a recent one (30Mar10), I got the same result.
- The NaNs arrive at the very first timestep, not after running for a while, which means I can't monitor much what happens. The only thing I can visualize is my initial system (from the restart), and it's fine (and working with other simulations at different strain rates/temperature).
- Even with a timestep of 1fs, some of my simulations crash (I'm not that surprised for the 5fs, and thought it was the actual problem). 2fs works on some simulations but not all.
- My sample is 50x50x50 lattices^3 (325x325x325A^3 for Cu)
- My simulations at higher strain rates (up to 10^12s^-1) work fine, the problem appears when I lower my strain rate (below 10^8 or less depending on the temperature studied).
- I haven't tried to play with the drag factor or the damping parameter yet.
- I start from a restart file that was created by energy minimization, but I actually have more problems at lower temperature (50K).
- For a given temperature, I only change the strain rate and the timestep to have more precision at high strain rates. But even for a fixed timestep (and all the other parameters fixed as well), lowering the strain rate can cause a simulation to crash.

I can try to equilibrate my system at the given temperature before and see if it improves the behavior.

Thank you for your help.

Virginie

I can try to equilibrate my system at the given temperature before and
see if it improves the behavior.

I think this will fix the problem. Make sure to match the pressure as well.

The NaNs arrive at the very first timestep, not after running for a while, which means I can't monitor much what happens. >The only thing I can visualize is my initial system (from the restart), and it's fine (and working with other simulations at >different strain rates/temperature).

Do you mean at timestep 0 or timestep 1? If the former, then your
initial configuration is bad. If the latter, then you must be generating
huge forces at timestep 0 (is the pressure huge), in which case
you will get huge bad displacements on the first timestep. Either
way, you are doing something wrong, and the thermo output
should give you some indication of what it is.

Steve

The NaNs were appearing at timestep #1. With equilibration and a timestep of 1fs, it seems to be working. But if I equilibrate and then strain from the equilibrated restart, my temperature and pressure still oscillate significantly for a while at the beginning. That doesn't happen when I equilibrate and strain without stopping the simulation.

Virginie

If you run NVE from your starting state, are things stable?
If you run NPT from your starting state, are things stable?
Both should be the case if you have a reasonably equilibrated
system. If NVE is not stable, then NPT may have problems.
If NVE is stable, then NPT should work OK as well.
If both are OK, then you run fix deform (straining) form your
starting state, you should be OK, unless you strain very
fast.

Steve

With the help of Davis Herring, we finally found the bug that caused the NaNs. It turns out that p_target in fix_nh.cpp was never initialized, which means that 3 of its components (depending on whether we were using the option triclinic) were never assigned a value. It looks like in my case, a NaN value would be assigned for these. p_period is initialized twice though, so maybe it's just a typo and one of the p_period was supposed to be a p_target.

Virginie

I don't see that in the latest version of LAMMPS. Can you send me your copy
of fix_nh.cpp?

Aidan

I have a version including fixes up to June 19th (see attached). Lines 77 and 79, p_period is initialized twice to 0, but we never have an initialization of p_target.

Virginie

fix_nh.cpp (54.9 KB)