Run terminated unexpectedly with no error reported

Hi, all,

I recently ran into a problem: my input file (in.tightpath) stopped partway through the run without reporting any error. The .sta file shows the following:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 9721 RUNNING AT n0108
= EXIT CODE: 134
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

Does this suggest a problem with my LAMMPS executable, that is, with how LAMMPS was compiled? I don't know much about this.

Has anyone encountered a similar problem? Any replies are welcome. Thank you very much!

in.tightpath (5.8 KB)
86485.err (23.4 KB)
tightpath.sta (11.0 KB)

The answer is in the .err file. There you have:

*** Error in `/home1/hangc/lammps/lammps-29Sep2021/src/lmp_mpi': double free or corruption (!prev): 0x000000000437c4c0 ***
*** Error in `/home1/hangc/lammps/lammps-29Sep2021/src/lmp_mpi': double free or corruption (!prev): 0x0000000003993e80 ***

This would be due to a bug somewhere in the C++ source code.
Unfortunately, there is no debug info in this output, so it is not easy to identify which part of LAMMPS is causing it.
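For reference, the exit code 134 in the .sta file and the "double free or corruption" lines in the .err file describe the same event: glibc detected heap corruption and aborted the process, which raises signal 6, and launchers and shells conventionally report death by signal N as exit code 128 + N. The following is a minimal Python sketch of that reading, assuming a POSIX system. The helper names describe_exit_code and find_glibc_aborts are made up for illustration and are not part of LAMMPS.

```python
import re
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit status using the common 128 + signal convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"terminated by signal {signum} ({name})"
    return f"exited with status {code}"


# Matches glibc abort banners of the form seen in the .err file:
# *** Error in `<executable>': <message>: 0x<address> ***
GLIBC_ERROR = re.compile(
    r"\*\*\* Error in `(?P<exe>[^']+)': (?P<msg>.+?): (?P<addr>0x[0-9a-fA-F]+) \*\*\*"
)


def find_glibc_aborts(text: str):
    """Return (executable, message, address) for each glibc abort banner."""
    return [(m["exe"], m["msg"], m["addr"]) for m in GLIBC_ERROR.finditer(text)]


# The two lines quoted from the .err file above
SAMPLE = (
    "*** Error in `/home1/hangc/lammps/lammps-29Sep2021/src/lmp_mpi': "
    "double free or corruption (!prev): 0x000000000437c4c0 ***\n"
    "*** Error in `/home1/hangc/lammps/lammps-29Sep2021/src/lmp_mpi': "
    "double free or corruption (!prev): 0x0000000003993e80 ***\n"
)

print(describe_exit_code(134))
for exe, msg, addr in find_glibc_aborts(SAMPLE):
    print(f"{exe}: {msg} at {addr}")
```

On Linux, signal 6 is SIGABRT, which matches the "Aborted (signal 6)" exit string in the .sta file.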

There is a small chance that the bug has been found and fixed since then, so you can try running with the latest LAMMPS version (23 June 2022).

Thank you very much, teacher! I'll try the latest version of LAMMPS and see what happens.

But I have another question. During this run, many core.* files were generated (I have not seen this in other simulations), and the total number of neighbors is always zero. What does this mean?

The core files are a consequence of the bug in the code. You can safely delete them for now.
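Before deleting, it can help to see how many core files there are and how much space they take. The following is a minimal Python sketch, assuming the core files sit in the run directory and follow the core.* naming mentioned above. The function names are hypothetical, and the removal step defaults to a dry run. Keep at least one core file if you still want to inspect it with a debugger later.

```python
from pathlib import Path


def list_core_files(run_dir="."):
    """Return (path, size_in_bytes) for every core.* file in run_dir."""
    return sorted(
        (p, p.stat().st_size) for p in Path(run_dir).glob("core.*") if p.is_file()
    )


def remove_core_files(run_dir=".", dry_run=True):
    """Delete core.* files; with dry_run=True only report what would go."""
    names = []
    for path, _size in list_core_files(run_dir):
        if not dry_run:
            path.unlink()
        names.append(path.name)
    return names


if __name__ == "__main__":
    for path, size in list_core_files():
        print(f"{path.name}: {size / 1e6:.1f} MB")
    print("would remove:", remove_core_files(dry_run=True))
```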

As for the total number of neighbors being zero: that means nothing.

Unfortunately, even with the latest version (2022) of LAMMPS (compiled with the Intel compiler), the run terminated in the same way again…
tightpath (1).sta (11.1 KB)
87906 (1).err (23.4 KB)

OK, I suspected as much, but it is helpful to confirm that the bug is present in the latest sources.

There are three things you can do to help track it down.

  1. Obtain a stack trace from the crashed executable and post it. See section 11.4, "Debugging crashes", of the LAMMPS documentation.
    In your case, you can run something like gdb /path/to/lmp_mpi core.### to try to obtain a stack trace from one of the core dumps you already have. Search online if you need more details about the gdb command.
  2. Obtain a stack trace and memory access error report using valgrind (if it is available on the machine you are running on). Again, see section 11.4, "Debugging crashes", of the LAMMPS documentation, but note that LAMMPS runs very slowly under valgrind.
  3. Create and post an input deck that reproduces the issue with as few atoms and as few simulation steps as possible. It doesn't have to do proper science, just trigger the bug easily and quickly, so that I can do steps 1 and 2 myself on my development machine and then experiment with changes to the source code to resolve the issue.
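Steps 1 and 2 can be scripted so that every core file gets the same treatment. The following is a minimal Python sketch that only assembles and prints the commands. The path /path/to/lmp_mpi is the placeholder from the post above, the function names are hypothetical, and the gdb options used (-batch to exit after running the commands, -ex "bt" to print the backtrace) are standard gdb flags. A readable backtrace generally requires an executable compiled with debug info (-g).

```python
import glob
import shlex
import subprocess


def gdb_backtrace_command(executable, core_file):
    """Build a non-interactive gdb call that prints the stack trace."""
    return ["gdb", "-batch", "-ex", "bt", executable, core_file]


def valgrind_command(executable, input_file):
    """Build a valgrind call; LAMMPS reads its input script via -in."""
    return ["valgrind", executable, "-in", input_file]


def run_and_capture(command):
    """Run a command and return its combined stdout and stderr as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


if __name__ == "__main__":
    exe = "/path/to/lmp_mpi"  # placeholder: replace with the real executable
    for core in sorted(glob.glob("core.*")):
        print(shlex.join(gdb_backtrace_command(exe, core)))
    print(shlex.join(valgrind_command(exe, "in.tightpath")))
```

To actually collect a trace, pass one of the assembled commands to run_and_capture and save the returned text to a file that can be attached to a reply.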

Fortunately, I split this simulation into two stages (two input files), and the problem was solved. If I have time, I will consider your suggestions carefully. Thank you very much for your guidance.

Nothing is solved this way. This bug can hit you again at any time.
How do you know that your simulation is correct? You may have bypassed the part that makes your simulation crash, but the run can still contain corrupted data.