Memory error (src/memory.cpp:66) during write_data

Dear all,

My simulations are running into a memory error, so I wanted to ask whether anyone has experienced the same problem or, with expertise in the software's memory management, can offer hints on how to solve it.

I am running LAMMPS version 24 Mar 2022.
The error message in the LAMMPS output is

ERROR on proc 0: Failed to allocate 73904 bytes for array comm:buf_send (src/memory.cpp:66)
Last command:   write_data        data.run1

The array specified in the error message changes, but the reference to the source code is always the same.

The simulation is of a 600-atom system, using a hybrid potential of vashishta and coul/long.
The simulation protocol looks like this:

run 100
write_data data.run1

group satoms type 4

compute 4 satoms msd

fix outputmsd all ave/time 10 1 10 c_4[1] c_4[2] c_4[3] c_4[4] file msd.txt

run 100
write_data data.run2

The error does not depend on the length of the individual runs and occurs during either the first or the second write_data.

I also tracked the available memory during the simulation: 48 GB of the 64 GB of RAM were still available when the simulation crashed. So I would guess that the problem is not a lack of available memory…
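For what it's worth, the 48 GB figure comes from monitoring the node itself. LAMMPS's own memory accounting could presumably also be printed right before the failing command with the info command, roughly like this (a sketch, assuming the memory category of info behaves the same in the 24 Mar 2022 version):

run 100

# print LAMMPS's internal memory accounting just before the failing command
info memory

write_data data.run1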

The strange thing is that the same input file with the same system runs fine on other nodes with the same LAMMPS version (compiled for those nodes). This specific node used to work, too, but now this error occurs.

I first spoke to our local administrator. Since no changes have been made to the node I am having problems with, I was told that this seems to be a LAMMPS error/bug and that I should post the question here.

Any help or advice would be appreciated. Thanks in advance!

If the exact same input works with the exact same LAMMPS version on other nodes, then it is highly unlikely that this is a LAMMPS error.
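For context: the source location in the message, src/memory.cpp:66, is inside LAMMPS's allocation wrapper, which essentially just calls malloc() and stops with exactly this message when malloc() returns NULL. A minimal standalone sketch of that behavior (not the verbatim LAMMPS code; the function name is illustrative):

#include <cstdio>
#include <cstdlib>

// sketch of what the allocation wrapper effectively does: request the
// memory from the OS and abort with the reported message on failure
void *smalloc_sketch(size_t nbytes, const char *name)
{
    if (nbytes == 0) return nullptr;
    void *ptr = malloc(nbytes);
    if (!ptr) {
        fprintf(stderr, "Failed to allocate %zu bytes for array %s\n", nbytes, name);
        exit(1);
    }
    return ptr;
}

int main()
{
    // same size and array name as in the reported error
    void *buf = smalloc_sketch(73904, "comm:buf_send");
    puts("allocation succeeded");
    free(buf);
    return 0;
}

In other words, the message only reports that the operating system refused a memory request; LAMMPS itself is not the one running out.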

But let us assume that it is a LAMMPS bug. Then the first thing you should do is compile a more recent version of LAMMPS to check whether that issue has already been resolved. Your version is almost three years old, and some bugs have been fixed since.

With so few atoms, there should not be any requests for large amounts of memory. In fact, the failed request itself was tiny: 73904 bytes is only about 72 KiB, which should be trivially available on a machine reporting 48 GB of free RAM.

One possibility is that there is some memory corruption in the kernel, or that some other running process (possibly even a system process, or a so-called zombie) is hogging almost all of the address space, leaving no room even for small allocations. A forced reboot of that node, if it is otherwise idle, could clear that up.
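Another quick thing to rule out on that node is an (accidentally) lowered address-space limit, e.g. from a batch system or a login script, which produces exactly the same symptom. A minimal Linux-only sketch to print the limit the process actually sees (a hypothetical diagnostic, not part of LAMMPS):

#include <cstdio>
#include <sys/resource.h>

int main()
{
    struct rlimit rl;

    // RLIMIT_AS is the maximum size of the process's virtual address space;
    // if the soft limit is small, even tiny allocations can fail
    if (getrlimit(RLIMIT_AS, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        puts("address space limit: unlimited");
    else
        printf("address space limit: %llu bytes\n", (unsigned long long) rl.rlim_cur);
    return 0;
}

Comparing its output between the failing node and one of the working nodes should show quickly whether the environments differ.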
