angle atoms missing crash -- comm or pair-cutoff issue?

Hi everybody! Hope you’re all doing well :slight_smile:

I am trying to equilibrate semiflexible Kremer-Grest-like melts using the DBH method [Auhl et. al, J. Chem. Phys., 119, 12718 (2003)] and am experiencing reproducible “Angle atoms 74952 74953 74965 missing” crashes.

I’m using the 16Mar18 version of LAMMPS. My input script is below. It’s rather long, but I think the problem is probably quite simple.

DBH equilibration for Nch = 522 N = 200 kb = 9 Kremer-Grest chains

combine the gradual hardening of the pair and bond potentials with DBH

units lj

atom_style angle

special_bonds lj 1 1 1

read_restart pushoff.res.1250

reset_timestep 0

neighbor 0.3 bin

neigh_modify every 1 delay 0 check yes

comm_modify mode single cutoff 5

pair_style nm/split 1.12246

pair_coeff * * 1.0 1.1246 12 6

pair_modify shift yes

bond_style fene

bond_coeff 1 30 1.5 0 1

angle_style cosine

angle_coeff 1 8

fix 1 all nve/limit .009375

fix 2 all langevin 2.0 2.0 1 4528641 zero yes

compute msd1 all msd com yes

fix 3 all ave/time 1 1 125 c_msd1[4] file diffusion

DBH bond swapping

fix 4 all bond/swap 25 0.5 1.3 598298

dump 1 all custom 1250000 configs.DBH id mol type x y z ix iy iz

rem need dump bondlist so can reoorder into correct chains

compute 2 all property/local btype batom1 batom2

dump 2 all local 1250000 bonds.DBH index c_2[1] c_2[2] c_2[3]

output double-bridging statistics

variable 1 equal f_1

variable accept equal f_4[1]

variable attempt equal f_4[2]

timestep .001

thermo 125

thermo_style custom step temp etotal epair ebond eangle press vol v_accept v_attempt

restart 1250000 equilib.res

run 25000

increase timestep, go to softer pair and bond interactions

unfix 1

fix 1 all nve/limit .01875

pair_coeff * * 1 1.12246 7 6

bond_coeff 1 11.4582 1.5 0.0 1.0

timestep .002

run 25000

increase timestep

unfix 1

fix 1 all nve/limit .028125

timestep .003

run 25000

increase timestep

unfix 1

fix 1 all nve/limit .0375

timestep .004

run 25000

increase timestep

unfix 1

fix 1 all nve/limit .046875

timestep .005

run 3650000

In the above, the “nm/split” pair potential is a simple modification of pair_style nm/cut. “pair_coeff 1.0 1.12246 12 6” gives the standard Lennard-Jones potential, while “pair_coeff 1.0 1.12246 7 6” gives a 7-6 potential for r < 1.12246 and standard LJ for r > 1.12246. “bond_coeff 1 11.4582 1.5 0.0 1.0” gives FENE bonds that have the same equilibrium length as the usual K-G model, i.e. l_eq = .961. I’m doing this to combine DBH with a DPD-like softening of the pair potential. It would equilibrate systems very quickly if it didn’t crash. I’ve run a lot of similar simulations without DBH in the past, and had only occasional crashes.

Unfortunately, the above job always crashes after a few hundred thousand steps when run in parallel. The thermodynamics don’t blow up and I see no indication that there are huge forces causing atoms to get blown out of the cell and cause the crash.

The initial version of the above script lacked the

neigh_modify every 1 delay 0 check yes

comm_modify mode single cutoff 5

commands. I added these after looking up the “Angle atoms %d %d %d missing on proc %d at step %ld” section of the doc pages, which states “One or more of 3 atoms needed to compute a particular angle are missing on this processor. Typically this is because the pairwise cutoff is set too short or the angle has blown apart and an atom is too far away.”

Doing more frequent reneighborings does seem to help some, and the “comm_modify” command also seems to help some, but “Angle atoms missing” crashes still occur every time IF I run in parallel. Single-processor jobs don’t seem to crash, so it appears the problem is with the MPI communication rather than the physics.

The other possibility mentioned above is that “the pairwise cutoff is set too short”. Doubling the cutoff radius to 2.24492 allows parallel jobs to run without crashing! However, I don’t want to use this larger cutoff. Besides greatly slowing down the simulations, it adds attractive LJ interactions I don’t need or want.

So, what should I do here? Would increasing the skin depth help? Or is my “comm_modify” command not the right one to use here? Or is there some other strategy you’d suggest to stop these crashes?

Thanks,
Rob

LAMMPS will show the cutoffs in use at the beginning of a run as part of the “Neighbor list info” section.

increasing the communication cutoff should normally suffice. it will increase the number of ghost atoms, but not the neighbor list. by how much it needs to be increased depends on the length of your bonds.

these kinds of crashes will never happen for a serial run, since atom->map() will never fail. it may just return the position of an atom from “far, far away”.

axel.

Hi Axel (and everybody)! LAMMPS indeed lists the various cutoffs at the beginning of runs; they are what I’d expect them to be given my input commands.

I’ve played around with this some more, without success. Thinking that this issue might be caused by overstretched bonds, I modified bond_fene.cpp so that it triggers a crash when a bond length exceeds R_0 (rather than 2R_0 as is the case in the stock version). With R_0 = 1.5σ, this should prevent atoms 1 and 3 (of the angle 1-2-3) from being more than 3σ apart. If I’m understanding this correctly, using the command

comm_modify mode single cutoff 4

should mean that all angles with a 1-3 distance less than 4σ should be properly accounted for, i.e. “Angle atoms missing” errors should occur only if an angle has a 1-3 distance > 4σ. Any 1-3 distance > 3σ necessarily means a bond is overstretched and should trigger a crash, so “Angle atoms missing” errors should be impossible. Nevertheless they’re continuing to occur! And they aren’t even proceeded by overstretched-FENE-bond warnings. So I think the root of this problem is understanding how I’m getting angles with a 1-3 distance less than 3σ running afoul of the MPI angle communication even with the above comm_modify command. Any thoughts?

Potentially relevant: I’m now using the current (3 Mar 2020) version of LAMMPS; switching to the newer version didn’t seem to change anything. I found the old thread “crashes during DBH equilibration” from March 2018 I hadn’t been able to find before. I’d encountered similar problems back then, and was able to solve them by using “comm_modify cutoff 2.95”. The main differences between the current runs and the ones from 2018 are that the temperature is higher and the angles are stiffer. Back then you (Axel) suggested I might need to use a smaller timestep. I am indeed using a smaller timestep now (.008), and am not getting thermodynamic blowup or lost atoms or the other typical problems one gets with a too-large timestep.

Thanks,
Rob

Hi again! PS – I found that I’d forgotten to assign atoms’ molecule IDs in my initial state in the way one has to to properly set up DBH equilibration of monodisperse chains. Rather than setting the molecule ID of atom i on chain j to min(i,N+1-i) [where N is the chain length], I set it to j. Thus fix bond/swap was only performing INTRAchain swaps. I corrected the initial state’s molecule IDs and now the system has successfully run for a couple million steps without crashing. Hopefully this fixed the problem!

However, it still doesn’t explain why I was getting the crashes in the first place. The only thing I can think is somehow the swaps involved a chain’s interaction with its own periodic image. Obviously one doesn’t want such interactions, but it might be useful to add a check to fix bond/swap to prevent this from happening.

Best,
Rob