crashes during DBH equilibration

Hi everybody. I am getting repeated “Angle atoms missing” errors when trying to equilibrate Kremer-Grest polymers via the “DBH” (Auhl 2003) method. The command I’m using to do the bond/angle swapping is standard, i.e.

fix 4 all bond/swap 10 0.5 1.3 598101

After a short number of steps (typically a few thousand), the run crashes with errors like

ERROR on proc 18: Angle atoms 98824 98822 98823 missing on proc 18 at step 4360 (…/ntopo_angle_all.cpp:68)

These errors are clearly associated with the fix; if I reduce the swap attempt fraction (the ‘0.5’ in the above line) to zero, no errors occur.

They tend to occur faster the more processors I run the job on. For this system, results are:

Nproc   Crashes on timestep #
27      4360
24      3450
20      8110
16      3210
12      5110
8       6030
6       11630
4       13350
2       24420
1       none? no crashes after 150k steps.

This is for a system of 107 flexible chains, each containing 2800 monomers, at standard K-G melt conditions (rho = .85, T = 1) with all the usual parameters. No thermodynamic quantities are blowing up before these crashes. The above crash-timestep numbers reproduce if I repeat the runs. The atoms involved don’t repeat, they’re not always on the same chain or anything like that.

Any idea what could be causing this? I’ve been doing DBH runs for years and set up this run the same way I always have. The trend with Nproc makes me suspect some sort of interprocessor-communication glitch, presumably one that can be resolved by editing my LAMMPS script, since there’s nothing wrong with the machine I’m running this on. For reference, my input script is

# high-T const-V DBH equilibration for Nch = 107 N = 2800 kb = 0 Kremer-Grest chains

units lj

atom_style angle

special_bonds lj 0 1 1

read_data data.kb0.Nch.107.N.2800.endoffastpushoff

neighbor 0.3 bin

pair_style lj/cut 1.12246

pair_coeff * * 1.0 1.0 1.12246

pair_modify shift yes

bond_style fene

# rem start off with weaker FENE bonds

bond_coeff 1 10 1.5 1.0 1.0

angle_style cosine

angle_coeff 1 0

# set Tinit to 1

velocity all create 1.0 2358365 dist gaussian mom yes rot yes

fix 1 all nve/limit 0.05

fix 2 all langevin 1 1 1 4528641 zero yes

compute msd all msd com yes

fix 3 all ave/time 1 1 1000 c_msd[4] file diffusion

# DBH bond swapping

fix 4 all bond/swap 10 0.5 1.3 598101

dump 1 all custom 2000000 configs.eq.constP.DBH id mol type x y z ix iy iz

# rem need to dump the bond list so atoms can be reordered into the correct chains

compute 2 all property/local btype batom1 batom2

dump 2 all local 2000000 bonds.eq.constP.DBH index c_2[1] c_2[2] c_2[3]

# output double-bridging statistics

variable 1 equal f_1

variable accept equal f_4[1]

variable attempt equal f_4[2]

timestep .01

thermo 100

thermo_style custom step temp etotal epair ebond eangle press vol v_accept v_attempt

restart 4000000 equilib.res

run 4000000

# increase FENE k

bond_coeff 1 15 1.5 1.0 1.0

run 4000000

# increase FENE k

bond_coeff 1 22.5 1.5 1.0 1.0

run 4000000

# back to normal FENE k

bond_coeff 1 30 1.5 1.0 1.0

run 4000000

Any help would be appreciated :)

Thanks,
Rob

This usually happens when your system is changing too fast for either
the neighbor list update or the exchange of coordinates between
neighboring subdomains.
The parameters you want to check are the neighbor list update
frequency (neigh_modify: every, delay, and check) and the
communication cutoff (comm_modify: cutoff).
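
As a rough sketch of what those two settings could look like in the script above: the numbers here are illustrative assumptions, not values tested on this system. The cutoff of 3.0 is simply two FENE bonds at their maximum extension (R0 = 1.5 in the bond_coeff lines), so that all three atoms of an angle stay within ghost-atom range.

```lammps
# check every step whether the neighbor lists need rebuilding, with no delay
neigh_modify every 1 delay 0 check yes

# extend the ghost-atom communication cutoff beyond the default
# (default here is pair cutoff + skin = 1.12246 + 0.3)
# 3.0 is an assumed value: 2 * R0 for the FENE bonds used in this script
comm_modify cutoff 3.0
```

These lines would go after the neighbor command and before the first run. A larger cutoff means more ghost atoms and more communication per step, so it is worth raising it only as far as needed to make the errors go away.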

axel.

Ah, thanks Axel! Seems this issue occurs more for stiffer chains and increasing the communication cutoff with comm_modify helps…

Best,
Rob

If you say "more for stiffer chains", then this rings another alarm bell
for me. Perhaps you need to reduce the time step in those cases, or look
into using one of the soft-core potentials to soften the repulsion for
significant overlaps a bit.
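
A minimal sketch of both options, assuming the script posted above. The time step value is just half of the original 0.01 and is not a tested recommendation. The soft-core lines are left commented out: they assume LAMMPS was built with the package that provides lj/cut/soft, and the n, alpha, and lambda values are placeholders chosen for illustration only.

```lammps
# option 1: smaller time step (assumed value, half of the original 0.01)
timestep 0.005

# option 2 (assumption, untested here): soft-core LJ during the early, weak-bond stages
# pair_style lj/cut/soft n alpha_LJ cutoff
# pair_coeff I J epsilon sigma lambda   (lambda = 1 gives the unmodified LJ interaction)
#pair_style lj/cut/soft 2.0 0.5 1.12246
#pair_coeff * * 1.0 1.0 0.8
#pair_modify shift yes
```

If the soft-core route is used, the pair style would need to be switched back to plain lj/cut before the final runs with the standard FENE k, so that the equilibrated melt corresponds to the usual Kremer-Grest model.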

axel.