"Lost atoms" when running on more than 8 cpus, irrespective of system size.

Dear Mailing List,

I’m having some problems running LAMMPS on more than 8 CPUs and am hoping for your help. I’m using the Python interface with ASE, running the exact example script given on the ASE webpage:
https://wiki.fysik.dtu.dk/ase/ase/calculators/lammps.html#ase.calculators.lammpslib.LAMMPSlib
(An example of the LAMMPS input that is effectively run is given just below the ASE python example on this page). Note this is just a static potential energy calculation, not a dynamic simulation.

The behaviour is as I expect on up to 8 CPUs, apart from the following warning, which appears because this is only a 5-atom system:
WARNING: Proc sub-domain size < neighbor skin, could lead to lost atoms (src/domain.cpp:933)

However, beyond 8 CPUs I get a seemingly common problem of “Lost atoms”, an example being:

ERROR: Lost atoms: original 5 current 4 (src/thermo.cpp:441)

Given this is a small test system, I thought this was just a domain decomposition problem, but if I create a 10x10x10 supercell - so 4001 atoms (1 H dopant in Ni) - I still get exactly the same “Lost atoms” error on more than 8 CPUs.
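
As a rough check of the domain decomposition reasoning, the sketch below estimates the smallest per-processor sub-domain length and compares it with the neighbor skin. It rests on several assumptions that are not from the logs: a cubic Ni cell with a = 3.52 Å, the LAMMPS default skin of 2.0 Å for metal units, and a simplified processor-grid choice (minimise sub-domain surface area) that only approximates what LAMMPS does. The helper names are illustrative.

```python
from itertools import product

SKIN = 2.0   # Angstrom; assumed LAMMPS default skin for metal units
A_NI = 3.52  # Angstrom; assumed cubic lattice constant for Ni


def best_grid(nprocs, box):
    """Pick px*py*pz == nprocs minimising sub-domain surface area.

    This is a simplified stand-in for the grid LAMMPS would choose.
    """
    best = None
    for px, py in product(range(1, nprocs + 1), repeat=2):
        if nprocs % (px * py):
            continue  # px*py must divide nprocs exactly
        pz = nprocs // (px * py)
        lx, ly, lz = box[0] / px, box[1] / py, box[2] / pz
        area = lx * ly + ly * lz + lx * lz
        if best is None or area < best[0]:
            best = (area, (px, py, pz))
    return best[1]


def min_subdomain(nprocs, box):
    """Shortest sub-domain edge length for the chosen grid."""
    grid = best_grid(nprocs, box)
    return min(length / p for length, p in zip(box, grid))


def n_atoms(n):
    """Atoms in an n x n x n fcc supercell (4 per cell) plus one H dopant."""
    return 4 * n**3 + 1


if __name__ == "__main__":
    for label, n in (("unit cell", 1), ("supercell", 10)):
        box = (n * A_NI,) * 3
        for nprocs in (8, 16):
            edge = min_subdomain(nprocs, box)
            flag = "smaller than skin" if edge < SKIN else "ok"
            print(f"{label}: {n_atoms(n)} atoms, {nprocs} procs, "
                  f"min edge {edge:.2f} A ({flag})")
```

Under these assumptions the 5-atom cell has sub-domains smaller than the skin on 8 or 16 processes, whereas the supercell does not, which is why the supercell failure points away from a simple decomposition problem.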

Can anyone offer any advice on what I might be doing wrong, or on ways to isolate the cause of this problem? My executable is managed by a sys admin, built with intel 2018/2 and using Python 3.7.0. Attached are the input.py, EAM potential, example submission script and a range of labelled outputs. Thanks in advance for any help you can offer!

All the best,

Andy

input.py (455 Bytes)

16cpu.supercell.log (111 KB)

8cpu.supercell.log (112 KB)

16cpu.default.log (1.77 KB)

8cpu.default.log (2.63 KB)

1cpu.default.log (2.59 KB)

lammps.ase.submission.script.txt (648 Bytes)

NiAlH_jea.eam.alloy.txt (287 KB)

This is unlikely to be a Python problem. Can you reproduce the same
behavior with a standard LAMMPS input script? If not, then it could
be a Python issue.
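
For illustration only, a standalone input roughly equivalent to the ASE example might be generated as in the sketch below. It is not the file used in this thread. The lattice constant (3.52 Å), the H position at the centre of the first unit cell, the thermo output and the function name are all assumptions; the pair_style and pair_coeff lines are the ones from the ASE example.

```python
def lammps_input(n=1, a=3.52, potential="NiAlH_jea.eam.alloy"):
    """Return LAMMPS input text for an n x n x n fcc Ni cell plus one H atom.

    The H atom is placed at (0.5, 0.5, 0.5) in lattice units, i.e. the
    centre of the first unit cell (an assumed position, chosen so that it
    never coincides with a Ni site).
    """
    lines = [
        "units metal",
        "atom_style atomic",
        "boundary p p p",
        f"lattice fcc {a}",
        f"region box block 0 {n} 0 {n} 0 {n}",
        "create_box 2 box",                    # two atom types: Ni and H
        "create_atoms 1 box",                  # fill the box with Ni
        "create_atoms 2 single 0.5 0.5 0.5",   # one H dopant
        "pair_style eam/alloy",
        f"pair_coeff * * {potential} Ni H",
        "thermo_style custom step pe atoms",
        "run 0",                               # static energy evaluation only
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the 10x10x10 supercell case to a file (file name is arbitrary).
    with open("in.nih", "w") as handle:
        handle.write(lammps_input(n=10))
```

The resulting file could then be run with something like `mpirun -np 16 lmp -in in.nih`, where the executable name `lmp` depends on the local build.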

Steve

Hi Steve,

Thanks for your suggestions, and sorry for the delay getting back to you - it’s taken me some time to build a comparable LAMMPS input file and do the necessary testing.

In short, standalone LAMMPS does not produce this behaviour, nor does using a simple test harness for the LAMMPS Python library. Therefore, the fault seems to lie instead with the ASE interface, which I will pursue further on their mailing list.

Thanks again for your help.

Andy