LAMMPS ids in python

Two years ago I contributed a change to ASE’s LAMMPSlib to get atomic energies, and it was definitely working. I recently updated to the latest ASE (3.24.0) and LAMMPS (current develop branch HEAD, commit 48893236), and code that I could swear worked before does not.

In particular, line 501 in ase/calculators/lammpslib.py (ase/calculators/lammpslib.py · master · ase / ase · GitLab) returns an error. The system is a periodic unit cell, smaller than the cutoff of the potential, and I believe I compiled LAMMPS without MPI. There are 29 atoms, but the length of the extracted ids is 1856, so the assertion fails.

I’m trying to understand what might have changed in relation to this, which seems like pretty basic functionality that should have been stable for a long time. Or if nothing has, how it’s been (apparently) working for the last two years.

Should the ASE code really be expecting the length of ids to be equal to the number of atoms? Are the additional IDs just those for ghost atoms from periodic images (since MPI is not enabled)?

Is the problem that this code really needs to use extract_atom_size, as suggested in the LAMMPS documentation (2.4. The lammps Python module)? And for some reason previously this was (apparently) not needed?

The array most likely contains both local and ghost atoms.
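A quick numpy sketch of why the id array can be longer than the atom count, using the numbers from this thread (29 local atoms, 1856 total entries). The layout assumed here — local atoms first, then ghost copies — is how LAMMPS orders its per-atom arrays; the ghost id values below are invented for illustration:

```python
import numpy as np

# Counts from this thread: 29 local atoms, per-atom array of length 1856,
# hence 1856 - 29 = 1827 ghost entries created by periodic images of a
# cell that is smaller than the potential cutoff.
nlocal = 29
nghost = 1856 - nlocal

rng = np.random.default_rng(0)
raw_ids = np.concatenate([
    np.arange(1, nlocal + 1),              # local atoms come first...
    rng.integers(1, nlocal + 1, nghost),   # ...then ghost copies of them
])

local_ids = raw_ids[:nlocal]  # trimming to nlocal recovers one id per atom
```

Slicing off the first nlocal entries is enough here precisely because the ghosts are appended after the local atoms.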

The LAMMPS code and also the LAMMPS Python interface undergo continuous development and refactoring to make the code more flexible and predictable. If you want to find out what changed, you can use git log or git blame on the affected files/functions.

In this case you should find that we added a function lammps_extract_atom_size() to the library interface and imported it into Python, so we can determine with certainty the dimensions of per-atom arrays. Some include ghost atom info and some don't; in some cases, e.g. velocities, this can be changed (comm_modify vel yes).

Previously there was some (unreliable) heuristic, so querying LAMMPS for the exact information is certainly an improvement.
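The call pattern looks roughly like this. The LMP_SIZE_ROWS/LMP_SIZE_COLS names exist in lammps.constants (they appear later in this thread), but the stub object and its sizes here are invented, since no real LAMMPS instance is involved:

```python
# Placeholder values standing in for lammps.constants.LMP_SIZE_ROWS/COLS;
# with the real module you would import and use those constants directly.
LMP_SIZE_ROWS, LMP_SIZE_COLS = "rows", "cols"

class StubLammps:
    """Mimics only extract_atom_size(). The sizes are made up: velocities as
    an N x 3 array, with N including ghosts (e.g. under comm_modify vel yes)."""
    def extract_atom_size(self, name, kind):
        return {("v", LMP_SIZE_ROWS): 1856, ("v", LMP_SIZE_COLS): 3}[(name, kind)]

lmp = StubLammps()
nrows = lmp.extract_atom_size("v", LMP_SIZE_ROWS)  # rows: local + ghost atoms
ncols = lmp.extract_atom_size("v", LMP_SIZE_COLS)  # cols: vx, vy, vz
```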

Well, the way I see it, the python code in ASE has some room for improvement.

The comment suggests that the length of the per-atom array is used to determine if LAMMPS is run in parallel with domain decomposition. That is not a very reliable way. Much better would be to use self.lmp.extract_setting('world_size') and check that this is 1.
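A minimal sketch of that check — extract_setting('world_size') is the real lammps Python call named above; the stub below merely stands in for a lammps.lammps instance started without MPI:

```python
class StubLammps:
    """Stands in for a lammps.lammps instance running on one MPI rank."""
    def extract_setting(self, name):
        return {"world_size": 1}[name]  # communicator size 1 == serial run

def is_serial(lmp):
    # Ask LAMMPS for the communicator size explicitly, instead of inferring
    # parallelism from the length of a per-atom array.
    return lmp.extract_setting("world_size") == 1

serial = is_serial(StubLammps())
```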

The atom->tag array does include ghost atoms, so what LAMMPS now returns is correct and what was returned previously was wrong.

> The LAMMPS code and also the LAMMPS Python interface undergo continuous development and refactoring to make the code more flexible and predictable. If you want to find out what changed, you can use git log or git blame on the affected files/functions.

This wasn’t meant as a complaint - I was just very surprised. I’m happy to adapt to any changes that improve reliability.

Thanks for the clarification. I’m a bit surprised the ASE tests didn’t catch this (maybe the LAMMPS version they use doesn’t include it yet). I’ll work on patching the ASE code to use the explicit check of the size. However, my first attempt to do this isn’t working.

n_id_rows = self.lmp.extract_atom_size("id", lammps.constants.LMP_SIZE_ROWS)

is returning 1856, same as the shape of the numpy array returned by extract_atom(...), which isn't what I want. I guess what I really need is to get nlocal, which I haven't been able to find in the LAMMPS documentation (2.4. The lammps Python module).
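What I think I need, sketched with a stub standing in for the real serial LAMMPS instance (assuming extract_setting supports an 'nlocal' keyword; the counts are the ones from this thread):

```python
import numpy as np

class StubLammps:
    """Stub for a serial lammps.lammps instance; the counts are from this
    thread (29 local atoms, 1827 ghosts), not from a real run."""
    def extract_setting(self, name):
        return {"nlocal": 29, "nghost": 1827}[name]
    def atom_ids(self):
        # Plays the role of extract_atom("id"): local entries first, then ghosts.
        nall = self.extract_setting("nlocal") + self.extract_setting("nghost")
        return np.arange(1, nall + 1)

lmp = StubLammps()
nlocal = lmp.extract_setting("nlocal")  # local atoms only, no ghosts
ids = lmp.atom_ids()[:nlocal]           # trim away the ghost entries
```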

> Well, the way I see it, the python code in ASE has some room for improvement.

Surely true. Thanks for the explanation. I’ll change it to use 'world_size'.

You have to look in the documentation for the C library interface to see which arguments are supported. We try to avoid replication as much as possible in the documentation.
This was something that had plagued the library interface for many years: there were comments in the code, a README file, and a page in the manual, and they would all differ, with none matching the code. Now there are just the comments in the library.cpp and library.h files, which we extract with doxygen and import into Sphinx, plus the Python docstrings in the python module. In almost all cases, the python functions refer to the corresponding C library interface.

Hi Noam – I work on the LAMMPS code (when I can) and have also recently used ASE for some machine learning MD simulations. I’m more than happy to lend a hand for development needs – you can find me on GitHub at @srtee.
