LAMMPS stopping without any error

richard · January 31, 2024, 7:33am

embf240f.dat (4.1 MB)
IL.in (12.3 KB)
I am using LAMMPS (8 Feb 2023). The simulation stops at the first step after building the inversion matrix (the timestep is 2 fs), without any error. I have tried various minimization permutations, but the problem persists.

akohlmey · January 31, 2024, 1:52pm

Please note that the people responding here are volunteering their time and also are not familiar with your research, so if you want to get help you have to make it as easy as possible to help you. Unfortunately, your post is the exact opposite of that.

There are multiple issues here:

LAMMPS does NOT stop prematurely without any error message. You probably have not looked in the right place. When running in parallel on a cluster, you may not find the error in the log file, but in the output to stderr. Thus you must not suppress that.
You report that you are using a LAMMPS version that is neither the latest feature version nor the latest stable version. So it is up to you to check whether this is a problem that has since been fixed.
You use an input deck that uses a directory hierarchy to collect its output. While this may be convenient to you, it is extremely annoying for testing. It would be better to have an input deck that can be run from a flat hierarchy like the LAMMPS examples and benchmark inputs.
You do not provide any information about how you compiled and run LAMMPS, what kind of platform you are running on, with how many processors, and whether this can be easily reproduced on a desktop machine with a few (say no more than 8) processors. That is what most people have available for testing and for exploring which modifications to your input could resolve the issue you are seeing.
Your input deck is a complex multi-step calculation. That wastes your time and other people’s time when trying to debug what is wrong. You can (and in parts you do) output suitable data files after each stage of your simulation, so you could have a) told us at which stage the simulation fails and b) constructed an input file that starts from the data file preceding the failure, so that the other steps do not have to be repeated by everybody.
Your input deck is cluttered with lots of statements for collecting data for your analysis and creating output. These should be removed when they have no impact on the trajectory itself. Removing them will either avoid the crash, in which case you can easily narrow down which of them triggers it and then construct a minimal input deck with only the one set of commands that causes the trouble, or it will make it easier to determine which features of LAMMPS could be failing on you.
In general, the smaller your test system, the faster it can be run and the better the chances that somebody can help you narrow down the problem (please note my statement about not requiring a cluster to run the input). Some debugging tools slow down execution significantly, so it can be well worth constructing a new system that is very small (has as few atoms/molecules as possible while still reproducing the crash) and thus can be debugged more easily.

In summary, the more context you can provide and the more you can narrow down the issue to a minimal input deck that will reproduce the issue quickly, the better your chance to get help.

Your “here is an input deck that does not run to completion” type of post puts all the burden on the person that would potentially want to help you and thus your chances to get help are extremely low.
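To make the point about stderr concrete, here is a minimal sketch of a wrapper that keeps stdout and stderr in separate files so that neither gets lost. The helper name run_and_capture and the file names run.stdout and run.stderr are hypothetical and chosen only for illustration; on a cluster the batch system's own redirection options would normally serve the same purpose.

```python
import subprocess
from pathlib import Path


def run_and_capture(cmd, workdir="."):
    """Run cmd in workdir and keep stdout and stderr in separate files.

    The file names below are hypothetical; the point is only that stderr
    is saved rather than discarded.
    """
    workdir = Path(workdir)
    out_path = workdir / "run.stdout"
    err_path = workdir / "run.stderr"
    with open(out_path, "w") as out, open(err_path, "w") as err:
        # No check=True: a failing run should still leave its output behind.
        proc = subprocess.run(cmd, stdout=out, stderr=err, cwd=workdir)
    return proc.returncode, out_path, err_path


# Example (assumes a LAMMPS executable called "lmp" is on the PATH):
# rc, out_file, err_file = run_and_capture(["lmp", "-in", "IL.in"])
# print(rc, err_file.read_text())
```

After a failed run, the stderr file is the first place to look for messages from the MPI library or the operating system.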

stamoor · January 31, 2024, 3:31pm

I’ve seen LAMMPS stop without any error, but it normally is due to a segmentation fault, or when running in parallel and the output didn’t get flushed to the screen/log before the job was aborted.

Like Axel said, we need a minimal working example; otherwise we cannot provide meaningful help.

srtee · February 1, 2024, 1:40am

Luckily I am (1) a forum nerd (2) who co-develops the ELECTRODE package, and the most straightforward initial diagnosis is that you are trying to use matrix-based charge equilibration methods on a system large enough that the matrix cannot be held in memory. So LAMMPS crashes upon requesting more memory than the system can allocate.
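A rough back-of-the-envelope check of the memory argument, as a sketch only: it assumes a single dense N x N matrix of 8-byte double-precision entries, where N is the number of electrode atoms. The function name is hypothetical, and the real memory footprint may be larger (for example, temporary copies during the inversion or copies held per MPI rank).

```python
def dense_matrix_gib(n_electrode_atoms, bytes_per_entry=8):
    """Estimate the size in GiB of one dense N x N matrix.

    Assumption: 8-byte (double precision) entries and a single copy of
    the matrix; actual usage may be higher.
    """
    return n_electrode_atoms ** 2 * bytes_per_entry / 2 ** 30


if __name__ == "__main__":
    # The quadratic growth is the point: 10x more atoms, 100x more memory.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} electrode atoms: {dense_matrix_gib(n):10.3f} GiB")
```

Comparing the estimate against the RAM available per node gives a quick indication of whether the matrix-based approach is feasible at all for a given system size.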

The simplest thing to do is switch to the conjugate gradient method and see if it works. But why are you trying such a complicated method on your system anyway? More importantly, it will help you in future if you are using unconventional methods (like constant potential) to be upfront about them – and also investigate if maybe that is the source of the problem. I have a feeling your script will run without ELECTRODE.

stamoor · February 1, 2024, 4:38pm

Also true, out of memory will cause signal 9 from the Linux OOM killer (see https://www.reddit.com/r/linux/comments/zauqxt/linux_outofmemory_killer_oom_killer/), which would also cause LAMMPS to stop without an error. This is different than signal 11 (segmentation fault) or an MPI abort message.
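If the run is launched from a script, the exit status already tells these cases apart. The sketch below is an illustration under assumptions: it relies on Python's subprocess convention (a negative return code means the process was killed by that signal) and on the common shell convention of 128 + signal number. MPI launchers may report other codes, and the helper name describe_exit is hypothetical.

```python
import signal

# Hints for the two signals discussed in this thread (Linux numbering).
HINTS = {
    "SIGKILL": "killed, possibly by the Linux OOM killer",
    "SIGSEGV": "segmentation fault",
}


def describe_exit(returncode):
    """Turn a process return code into a short human-readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        signum = -returncode          # subprocess convention
    elif returncode > 128:
        signum = returncode - 128     # common shell convention
    else:
        return f"exited with error code {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        return f"terminated by unknown signal {signum}"
    hint = HINTS.get(name)
    return f"terminated by {name}" + (f" ({hint})" if hint else "")
```

For example, a return code of -9 or 137 maps to SIGKILL and 139 maps to SIGSEGV, while a small positive code is reported as an ordinary error exit.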

akohlmey · February 1, 2024, 5:03pm

These are not LAMMPS errors, but in either case the MPI library and/or the operating system will print some error message to stderr.

So with that in mind the failing LAMMPS executable will still generate an error message, even if it is not generated by LAMMPS itself.

Buffering issues can be suppressed by using the “-nb” or “-nonbuf” flag, which turns off buffering for log and screen output (it won’t affect the buffering/communication from the MPI library for remote nodes).
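As a sketch of how that flag fits into a launch command, the helper below assembles the argument list. The executable name "lmp", the "mpirun -np" launcher, and the function name lammps_command are assumptions for illustration; adjust them to your installation and batch system.

```python
def lammps_command(input_file, exe="lmp", nprocs=1, unbuffered=True):
    """Build a LAMMPS command line as a list of arguments.

    Assumptions: the executable is called "lmp" and parallel runs are
    started with "mpirun -np N"; both vary between installations.
    """
    cmd = [exe, "-in", input_file]
    if unbuffered:
        # Turn off buffering of log and screen output.
        cmd.append("-nonbuf")
    if nprocs > 1:
        cmd = ["mpirun", "-np", str(nprocs)] + cmd
    return cmd


if __name__ == "__main__":
    print(" ".join(lammps_command("IL.in", nprocs=4)))
```

The resulting list can be passed directly to subprocess.run, so that the last lines written before a crash are more likely to reach the log and the screen.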