Problem with bond or angle atom missing

Thank you so much for devoting time to this! I really appreciate it!
I ran most of the simulations on my university cluster with the latest stable release (23 June 2023) and also tried on my desktop with the 29 Sep 2021 release. Both return errors.
I also tried equilibrating with only NVT and NPT before (I actually also tried compression and decompression to remove large voids, followed by annealing), but that also reported an error.
I am not sure what is meant by “getting errors at different locations in the system”. The bond atoms reported missing during my runs all belong to C-H bonds, at least when fix shake is not used.
I also tried fix shake on type 3 bonds with a timestep of 1.0 (or a gradually increased timestep), and it failed at an earlier step than fix shake with a timestep of 0.5.
I am thinking about adding drag to npt because the pressure varies a lot.
Also, I ran the system with 10 chains and 20 chains before; sometimes it reported a missing bond atom, but most of the time it ran successfully. The parameters are exactly the same as the ones I uploaded (30 chains).
Not sure if the above info is useful, but that is the experience I got from hundreds of failures.

I will also try a few other runs with your suggestion implemented today.

After I modified the input script to use only CG minimization, fix shake with a 1 fs or 0.5 fs timestep, and NVT/NPT equilibration, the non-numeric pressure error still occurs. A few runs also end with the segmentation fault shown below. No error is reported in the output or log file; the run looks like it simply stopped, but the err file shows the segmentation fault.

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x10

…/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f3dcf86e3c0]
…/home/jd784/lammps/lammps-stable_23Jun2022/src/lmp_mpi(+0x3ed4c4)[0x55ac5dac04c4]

Good afternoon, Dr. Kohlmeyer. Thanks for your effort on this topic. Did you happen to find any cause of the error within the source code? I have been trying these days but still cannot resolve the issue. The segmentation fault is also reported at random, many times. Could you provide some other hints for possible causes?

Short of a hardware defect or overheating, I do not see any specific indication of a problem.

I have not had the time to do any of the things that I mentioned.

Thanks for your reply. I will try running on different machines.

Okay, I tried on another machine (Stampede2) and it returns:

= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 260857 RUNNING AT c454-113
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

Intel(R) MPI Library troubleshooting guide:
Documentation Library

And the error file indicates the path to lmp_mpi. Is there a way that I can debug the source code myself?

Sure, please see: 11.4. Debugging crashes — LAMMPS documentation

I’ve been seeing a similar issue to what @demi describes while using the class2 force field style. The simulation will run for 1-2 million steps and then give a “Bonds atoms # # missing on proc #” error. If I restart using a binary restart file, it continues right past the step at which it previously crashed with no problems. I did notice that the last thermo output shows a sudden rise in kinetic energy; unfortunately the dump frequency is much lower than the thermo output, so I wasn’t able to visualize what caused the spike, and since it doesn’t reoccur after using a binary restart I haven’t been able to visualize the problem yet. I’ve seen it happen multiple times and for my particular forcefield (which seems to be the same parameter set as @demi ) it seems to always occur with a C-H bond. I’m going to look at the forcefield parameters and try running a simplified version of the system to try and narrow down possible causes, including trying a few different choices of timestep size.

For reference, I’m using version 2 Aug 2023.
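When the dump interval is too coarse to catch the event, the thermo output itself can at least be scanned for the step where the kinetic energy jumps. Below is a minimal Python sketch of that idea. It assumes the log contains thermo blocks that start with a header line beginning with `Step` and include a `KinEng` column; the function name `find_energy_spikes`, the factor-of-two threshold, and the sample log are all made up for illustration and are not taken from the runs discussed here.

```python
def find_energy_spikes(log_text, column="KinEng", factor=2.0):
    """Return (step, previous, current) for each thermo row whose `column`
    value is more than `factor` times the value in the previous row."""
    spikes = []
    header = None   # column names of the current thermo block
    prev = None     # value of `column` in the previous thermo row
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":
            # New thermo block; ignore it if the requested column is absent.
            header = fields if column in fields else None
            prev = None
            continue
        if header is None or len(fields) != len(header):
            continue  # not a thermo row (warnings, "Loop time of ...", etc.)
        try:
            row = dict(zip(header, (float(f) for f in fields)))
        except ValueError:
            continue  # same field count but not numeric
        value = row[column]
        if prev is not None and prev > 0 and value > factor * prev:
            spikes.append((int(row["Step"]), prev, value))
        prev = value
    return spikes


# Hypothetical thermo output, invented for this example only.
sample_log = """\
Step Temp KinEng PotEng Press
0 300.0 849.5 -1200.0 1.0
100 301.5 853.8 -1203.1 -12.4
200 299.2 847.3 -1198.6 8.9
300 912.4 2583.7 -1101.3 4521.7
Loop time of 1.23 on 4 procs for 300 steps with 950 atoms
"""

print(find_energy_spikes(sample_log))
```

Once the step is known, restarting from an earlier restart file with a much higher dump frequency around that step is one way to capture the configuration that precedes the spike.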

That means atoms are getting too close and then there is a big repulsion that will catapult one of the atoms across the box.

This is usually an indication of either bad force field parameters (i.e. partial charges are too large for the LJ repulsion) or too large a time step (bonds with hydrogen atoms need to be constrained with fix shake, or the timestep needs to be about 0.25 fs).
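To put the timestep remark in numbers, here is a small Python sketch that converts a vibrational wavenumber into a period and counts how many integration steps cover one oscillation. The 3000 cm^-1 value is an assumed, typical C-H stretch wavenumber rather than something read from the force field in this thread, and the helper names are hypothetical.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def vibration_period_fs(wavenumber_cm1):
    """Period in femtoseconds of a vibration given its wavenumber in cm^-1."""
    frequency_hz = C_CM_PER_S * wavenumber_cm1
    return 1.0e15 / frequency_hz


def steps_per_period(wavenumber_cm1, timestep_fs):
    """Number of integration steps that span one vibrational period."""
    return vibration_period_fs(wavenumber_cm1) / timestep_fs


CH_STRETCH_CM1 = 3000.0  # assumption: typical C-H stretch, not a fitted value

print(f"C-H period: {vibration_period_fs(CH_STRETCH_CM1):.2f} fs")
for dt in (1.0, 0.5, 0.25):
    n = steps_per_period(CH_STRETCH_CM1, dt)
    print(f"timestep {dt} fs -> {n:.1f} steps per C-H oscillation")
```

With these assumptions the C-H period comes out at roughly 11 fs, so a 1 fs timestep samples each oscillation with only about 11 steps, while 0.25 fs gives about 44. This is only a rough guide; checking energy conservation in a short NVE run is the more reliable test of whether a given timestep is acceptable.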