Lost atoms and C-API access

I’m using lammps via the C-API library with a no-MPI compilation. I go through a sequence of setting the configuration via lammps_command("change_box ....") and lammps_scatter_atoms(... "x"....), then lammps_command("run 0") to do the energy evaluation. I’ve been getting “Lost atoms” errors from this sequence for one particular configuration.

  1. Can anyone explain how this could be happening? I feel like setting the box and positions should ensure that everything is OK, at least for the initial evaluation (no propagation yet). All the positions I’m setting are within scaled positions 0…1 (and I also checked that doing lammps_scatter_atoms(..."images"...) with all 0s doesn’t change anything).
  2. Is there any cleaner way to change the box?

Having looked at the CommBrick::exchange code, I feel like my claim about all the scaled positions being in 0…1 must be wrong. Did I miss where it’s documented that there’s such a restriction on the values passed to lammps_scatter_atoms(...."x"....)?

Aha - my transformation from Cartesian to lattice coordinates is different from x2lamda, so my code thinks one atom is at 0.036, but lammps thinks it’s at 1.036 (and hence tries to transfer it). I’m pretty sure the Cartesian positions I get are straight from lammps_gather_atoms(..."x"...). Am I required to process those through images * cell before passing them back to lammps_scatter_atoms(..."x"...) ?

I am confused about the mention of fractional coordinates. Where does it say that the scatter and gather atoms functions return or take fractional coordinates?

I am only expecting to receive and pass Cartesian coordinates. I just meant that I checked by doing my own transformation to scaled coordinates, and confirmed that my code thinks everything is between 0 and 1, so the scaled-coordinate-based mechanism used by CommTiled::exchange for triclinic boxes should not be trying to exchange anything. However, how my code does the transformation and what x2lamda does must be slightly different, because they get values that differ by 1.

Since my code is always receiving coordinates from lammps_gather_atoms and then passing them to lammps_scatter_atoms, I assumed they would always be compatible. I’m now suspecting that it’s just a bug in how I handle non-zero boxlo (because my code only has cell vectors, not boxlo and boxhi).
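The boxlo issue suspected here can be illustrated with a small stand-alone sketch. It does not call LAMMPS. The TriBox struct and the cart_to_scaled function are hypothetical names, and the conversion is simply the algebraic inverse of x = boxlo + lx*a + ly*b + lz*c for a LAMMPS-style triclinic cell with a = (xprd,0,0), b = (xy,yprd,0), c = (xz,yz,zprd). It should agree with what x2lamda computes, but it is not copied from the LAMMPS source.

```c
#include <assert.h>

/* Hypothetical container for a LAMMPS-style triclinic box (names are ours). */
typedef struct {
    double boxlo[3];          /* lower corner: xlo, ylo, zlo */
    double xprd, yprd, zprd;  /* edge lengths: xhi-xlo, yhi-ylo, zhi-zlo */
    double xy, xz, yz;        /* tilt factors */
} TriBox;

/* Cartesian -> scaled coordinates by back-substitution.
   Note that boxlo is subtracted first; a converter that only knows the
   cell vectors effectively assumes boxlo = (0,0,0). */
void cart_to_scaled(const TriBox *box, const double x[3], double lamda[3])
{
    assert(box->xprd > 0.0 && box->yprd > 0.0 && box->zprd > 0.0);

    double dx = x[0] - box->boxlo[0];
    double dy = x[1] - box->boxlo[1];
    double dz = x[2] - box->boxlo[2];

    lamda[2] = dz / box->zprd;
    lamda[1] = (dy - box->yz * lamda[2]) / box->yprd;
    lamda[0] = (dx - box->xy * lamda[1] - box->xz * lamda[2]) / box->xprd;
}
```

With boxlo = (-10, 0, 0), a 10 x 10 x 10 box without tilt, and an atom at x = 0.36, this yields a scaled x of 1.036. The same call with boxlo set to zero yields 0.036, which is the kind of off-by-one disagreement described above.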

@akohlmey Please don’t worry about this until I post a followup, hopefully either documenting the solution to my confusion, or asking a more specific question.

OK, I was doing things not as I intended, but I guess I do still have this question: are the coordinates passed to lammps_scatter_atoms(...."x"....) (which is then going to be followed by lammps_command("run 0")) required to be in the scaled interval 0…1, or is lammps supposed to take care of that by itself?

My observation is that if the positions I scatter have scaled position < -1.0, they are not wrapped correctly in domain->pbc(), still have negative scaled coordinates (-1.0 < lamda < 0.0) when Verlet::setup() gets to comm->exchange(), and are therefore lost. Is this expected?

If I remember correctly, atoms that are outside the box must not be farther from it than the communication cutoff, or they will be “lost”. BTW: your arguing in scaled coordinates, without a simple example that reproduces the problem, makes it very difficult to follow your remarks and questions. It may be straightforward for you, but that is not how I look at things, and thus you are requiring me to “translate” your thinking into mine on top of trying to understand the problem. 🙁

Sorry it wasn’t clear. I just thought, since the exchange code that loses the atoms requires that they be in scaled coordinates (for triclinic boxes, which I have), that their scaled coordinate values were relevant.

No, that is an internal thing. When you have a tilted box, LAMMPS will convert to fractional coordinates for every operation that assumes an orthogonal box and then back afterwards. For the user-serviceable parts you should only need to worry about normal coordinates.

Yes, I realize that. The fundamental problem for my code, however, was that the LAMMPS-internal scaled coordinates need to be between 0…1 for CommBrick::exchange() to work, so I needed to be aware of the scaled coordinates to ensure that I was passing valid ones. LAMMPS calls Domain::pbc() before exchange(), but that function is not guaranteed to fully wrap around pbcs. It only adds/subtracts one multiple of the periodicity. As a result, if you pass Cartesian coordinates that x2lamda() turns into values < -1 or > 2, they will not be wrapped sufficiently, and exchange() will lose them. That’s what was happening with my code (had to do with how I deal with boxlo), but is fixed now.
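The difference between shifting by one period and wrapping fully can be shown on a single scaled coordinate. This is a simplified model of the behaviour described above, not the LAMMPS source; wrap_once, wrap_full and floor_long are hypothetical helper names.

```c
#include <assert.h>

/* Floor to an integer without needing libm; assumes |v| is far below
   the range of long, which holds for any sensible scaled coordinate. */
static long floor_long(double v)
{
    assert(v > -1.0e15 && v < 1.0e15);
    long n = (long)v;                 /* truncates toward zero */
    return (v < (double)n) ? n - 1 : n;
}

/* Single-shift wrap: add or subtract at most one period. */
double wrap_once(double lamda)
{
    if (lamda < 0.0) lamda += 1.0;
    else if (lamda >= 1.0) lamda -= 1.0;
    return lamda;
}

/* Full wrap into [0,1), however many periods away the value is. */
double wrap_full(double lamda)
{
    double w = lamda - (double)floor_long(lamda);
    return (w >= 1.0) ? 0.0 : w;      /* rounding guard for tiny negatives */
}
```

For a scaled coordinate of -1.5, wrap_once returns -0.5, which is still outside the cell, while wrap_full returns 0.5. For 2.25 the results are 1.25 and 0.25.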

I wouldn’t call it a LAMMPS bug, but it might be worth documenting, or perhaps improving robustness by modifying lammps_scatter_atoms to wrap with a version of Domain::pbc that will wrap multiple times around the periodicity, even if that’s too slow to do during a normal run.

The main issue is that the scatter/gather functions of the library interface still need refactoring, consolidation, documenting, and adding unit tests. Since you are writing directly into the LAMMPS storage, any data has to conform to the expectations of LAMMPS, i.e. positions have to be within the box and - if needed - image flags must be set accordingly. Thus the expectation that LAMMPS would wrap those coordinates is not correct. There is no way to put some check or warning into the scatter/gather functions, since they are generic and the requirement of providing meaningful data applies to all supported data. The situation is a bit different for the read_data command, because it knows that the data it processes are coordinates and thus will either drop or wrap the coordinates depending on whether atom coordinates are within the designated box boundaries.
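On the caller's side, one way to meet that requirement is to wrap each Cartesian position into the primary cell before scattering it, and to keep track of how many periods were removed. The following is a minimal stand-alone sketch under the usual triclinic convention (a = (xprd,0,0), b = (xy,yprd,0), c = (xz,yz,zprd)). TriBox, floor_long and wrap_position are hypothetical names, nothing here calls LAMMPS, and packing the three counts into a LAMMPS image flag is deliberately left out because that encoding depends on the build.

```c
#include <assert.h>

/* Hypothetical container for a LAMMPS-style triclinic box (names are ours). */
typedef struct {
    double boxlo[3];          /* lower corner: xlo, ylo, zlo */
    double xprd, yprd, zprd;  /* edge lengths */
    double xy, xz, yz;        /* tilt factors */
} TriBox;

/* Floor to an integer without libm; assumes |v| is far below LONG_MAX. */
static long floor_long(double v)
{
    assert(v > -1.0e15 && v < 1.0e15);
    long n = (long)v;                 /* truncates toward zero */
    return (v < (double)n) ? n - 1 : n;
}

/* Wrap one Cartesian position into the primary cell, in place.
   image[d] receives the number of periods removed along box vector d, so
   x_original = x_wrapped + image[0]*a + image[1]*b + image[2]*c. */
void wrap_position(const TriBox *box, double x[3], long image[3])
{
    assert(box->xprd > 0.0 && box->yprd > 0.0 && box->zprd > 0.0);

    /* Cartesian -> scaled, including the boxlo offset */
    double l[3];
    l[2] = (x[2] - box->boxlo[2]) / box->zprd;
    l[1] = (x[1] - box->boxlo[1] - box->yz * l[2]) / box->yprd;
    l[0] = (x[0] - box->boxlo[0] - box->xy * l[1] - box->xz * l[2]) / box->xprd;

    /* remove as many whole periods as needed, not just one */
    for (int d = 0; d < 3; d++) {
        image[d] = floor_long(l[d]);
        l[d] -= (double)image[d];
    }

    /* scaled -> Cartesian */
    x[0] = box->boxlo[0] + box->xprd * l[0] + box->xy * l[1] + box->xz * l[2];
    x[1] = box->boxlo[1] + box->yprd * l[1] + box->yz * l[2];
    x[2] = box->boxlo[2] + box->zprd * l[2];
}
```

As an example, in a box with boxlo = (-10, 0, 0), edge lengths of 10 and xy = 2, the position (25, 5, -13) wraps to approximately (-5, 5, 7) with period counts (3, 0, -2).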

You don’t think that the C API is different from a truly internal routine? By definition it’s dealing with “external” data, and if it could make that data consistent with LAMMPS’s internal assumptions, that’d be helpful.

That is not how LAMMPS is designed. On the contrary, most of the library interface is meant to provide access to internal data. Internally, it accesses data directly where that access should really go through getter/setter functions that limit what you can do and take care of side effects. When I get too irritated about this I sometimes tease Steve that the C++ code in some classes reminds me very much of how COMMON blocks were used in Fortran 77.

The library interfaces are really just a thin layer on the internal data and functionality in order to allow you to write code on top of LAMMPS without having to use C++ but also without sacrificing performance (much).

There are lots of things that could be done (also in terms of input files, consistency checking, etc.), but a) that would take a whole lot of effort to implement (and we all have other, higher-priority projects), so volunteers would be required, and b) it would have no chance of being accepted anywhere it would negatively impact performance.