Dear LAMMPS Users,
I want to simulate the tensile deformation of a polymer network with two types of bonds (quartic and harmonic), but I am running into a problem when running the simulation in parallel.
Below is a snippet of the script (a small test run with 3800 beads in total) that applies a uniaxial strain in the z direction while x and y are barostatted to zero pressure:
units lj
atom_style bond
pair_style lj/cut 2.5
bond_style hybrid harmonic quartic
special_bonds lj 1 1 1
neighbor 0.4 bin
neigh_modify every 1 delay 0 check yes
velocity all create 1 54654 dist gaussian mom yes
timestep 0.001
fix fxext all deform 1000 z erate 1 remap x units box
fix fxnpt all npt temp 1 1 0.1 x 0 0 1 y 0 0 1
run 5000
(run using the following command: mpirun --mca btl vader,self -np 4 lmp_mpi < in.deform)
As prescribed by the above commands, the strain is applied every 1000 steps with remap x (affine deformation). At step 1000, the given strain rate (erate 1) doubles the box length in z. I monitor the maximum bond length (compute bond/local dist, compute reduce max) and also dump all the bonds. As expected, just after the first deformation, at step 1001, a few bonds break.
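As a sanity check on the deformation schedule, here is a minimal Python sketch of the z box length expected at a given step. It assumes the engineering-strain-rate form L(t) = L0 (1 + erate*dt) described in the fix deform documentation, an initial Lz of 16 (the value reported at step 999 below), and that the box is only updated on multiples of the fix deform interval. The function name box_length is purely illustrative and not part of LAMMPS.

```python
def box_length(step, l0=16.0, erate=1.0, timestep=0.001, nevery=1000):
    """Expected box length at `step` for an engineering strain rate `erate`.

    Assumes the box is only updated every `nevery` steps and stays fixed
    in between (hypothetical helper, not a LAMMPS function).
    """
    # Round the step down to the most recent box update.
    last_update = (step // nevery) * nevery
    # Elapsed simulation time (LJ units) at that update.
    elapsed = last_update * timestep
    return l0 * (1.0 + erate * elapsed)


# Box length just before, at, and after the first deformation step.
for step in (999, 1000, 1001, 2000):
    print(step, box_length(step))
```

With these assumptions the box stays at 16 up to step 999 and jumps to 32 at step 1000, so every bond is stretched affinely by up to a factor of two along z in a single step.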
The above simulation runs perfectly fine on a single processor, but in parallel it terminates at step 1011 (10 steps after the bonds break) with a segmentation fault (Signal code: Address not mapped (1)). As suggested by Axel in some previous posts, here is the backtrace obtained with gdb (from one of the core.# output files):
#0 0x0000000000936926 in LAMMPS_NS::BondHybrid::compute(int, int) () at …/bond_hybrid.cpp:80
#1 0x000000000215bd7a in LAMMPS_NS::Verlet::run(int) () at …/verlet.cpp:315
#2 0x000000000210b395 in LAMMPS_NS::Run::command(int, char**) () at …/run.cpp:183
#3 0x0000000000f26786 in void LAMMPS_NS::Input::command_creator<LAMMPS_NS::Run>(LAMMPS_NS::LAMMPS*, int, char**) () at …/input.cpp:863
#4 0x0000000000f24bc1 in LAMMPS_NS::Input::execute_command() () at …/input.cpp:846
#5 0x0000000000f256e7 in LAMMPS_NS::Input::file() () at …/input.cpp:243
#6 0x0000000000f43796 in main () at …/main.cpp:64
I tried changing the neighbor list settings; increasing the cutoff increases the number of steps the run survives, but even with a very large cutoff it doesn't run for more than 1500 steps (cutoff = 0.0 terminates at step 1001, immediately after bond breaking).
I compared the output of every step until 1010 (or the maximum steps in parallel) with the serial run:
Serial                                   Parallel
step  Lz  maxb   max_bondenergy          step  Lz  maxb   max_bondenergy
999   16  1.082  10.83                   999   16  1.082  10.83
1000  32  2.072  1030                    1000  32  8.580  7201870
1001  32  1.499  74.33                   1001  32  1.499  74.33
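For reference, this kind of step-by-step comparison can be sketched in Python as follows. The values are copied from the table above; the dictionary layout and the function name diverging_steps are illustrative assumptions, not the actual post-processing script.

```python
import math

# Rows copied from the table: step -> (Lz, max bond length, max bond energy)
serial = {
    999: (16, 1.082, 10.83),
    1000: (32, 2.072, 1030),
    1001: (32, 1.499, 74.33),
}
parallel = {
    999: (16, 1.082, 10.83),
    1000: (32, 8.580, 7201870),
    1001: (32, 1.499, 74.33),
}


def diverging_steps(a, b, rel_tol=1e-6):
    """Return the steps present in both runs where any column differs."""
    common = sorted(set(a) & set(b))
    return [
        step
        for step in common
        if not all(
            math.isclose(x, y, rel_tol=rel_tol)
            for x, y in zip(a[step], b[step])
        )
    ]


print(diverging_steps(serial, parallel))
```

For the three rows shown, only step 1000 differs between the two runs.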
At step 1000 (the first deformation step), the serial run shows the maximum bond length roughly doubling, while the parallel run shows a far larger maximum bond length (8.580 versus 2.072). From step 1001 onwards, however, both runs give the same output. Naively, this looks like a problem with rebuilding the bond list and neighbor list after bonds break in the domain-decomposition setup (which is probably what triggers the error in the BondHybrid::compute function shown in the trace). I tried many different things after reading previous posts on quartic bonds but still couldn't identify the problem. I would truly appreciate it if someone could help me with this error.
Thanks in advance for your time!
Akash