Mitigate Overhead when using LAMMPS as a Force Calculator

I’m using LAMMPS through the C-API (via Julia) to do single-point force/energy calculations. Things work wonderfully, but I’m seeing more overhead than I expected: calling run 0 through the C-API is about 2x slower than a native implementation of LJ. Moving/copying data should not add overhead, since the gather and scatter APIs allow you to use pre-allocated memory.

I was wondering whether run 0 triggers some extra setup on every call that does not take place during a normal simulation. If I had to guess, the neighbor list is being rebuilt every time I call run 0, but I was wondering if there are other things I am missing or other places where I could mitigate overhead. Thanks!

Have you tried using run 0 pre no post no?
Please check the documentation of the run command for the implications of using those two flags and whether they can be applied in your use case.

When I actually went to use this, I hit some issues. If I call scatter to set new positions, then run 0 pre no, and then extract the potential energy compute, the energy is always just what it was the very first time I called run.

Reading the docs more, updating the positions probably counts as a set command, which means I need pre yes. So my new question is: can I do run 0 without triggering a neighbor list re-build but still re-calculate forces/energies? I am doing single-point calculations of a crystal, and the initial neighbor list should always be valid.

The settings for neigh_modify seem to rely on actual timesteps to take effect.
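
To make the "initial neighbor list should always be valid" assumption checkable, here is a minimal sketch of the half-skin displacement test that, as far as I understand, neigh_modify check yes is based on: a rebuild is due once any atom has moved half the skin distance or more since the last build. The function name and arguments are hypothetical, positions are plain (x, y, z) tuples, and periodic wrapping is ignored, so treat it as an illustration rather than what LAMMPS does internally.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def neighbor_list_still_valid(
    reference: Sequence[Vec3], current: Sequence[Vec3], skin: float
) -> bool:
    """Return True if no atom has moved half the skin distance or more.

    `reference` holds the positions at the last neighbor list build and
    `current` the new positions, both in the same atom order.
    Assumption: no periodic wrapping between the two sets of positions.
    """
    if len(reference) != len(current):
        raise ValueError("position arrays must have the same length")
    half_skin = 0.5 * skin
    for old, new in zip(reference, current):
        if math.dist(old, new) >= half_skin:
            return False  # this atom moved too far; the list may be stale
    return True
```

For small displacements around crystal lattice sites this should return True for every configuration, which is the situation described above.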

@ejmeitz There is not much that can be done from remote without actually tracking exactly what you are doing and reproducing it locally (which is not going to happen due to lack of time and information).

It is up to you to decide whether you need initialization after making changes to the system. Please note that a) “pre no” turns off more than just an initial neighbor list and force computation and b) for the very first “run” command, the “pre no” flag is ignored.
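
For reference, a minimal sketch of a conservative single-point loop that keeps the default pre yes, so forces and energies are recomputed after every position update, and only adds post no, which as far as I know just shortens the timing summary printed after the run. The three callables are hypothetical stand-ins for whatever your binding exposes (e.g. wrappers around lammps_command, lammps_scatter_atoms and lammps_extract_compute); nothing here is LAMMPS API by itself.

```python
from typing import Callable, List, Sequence


def single_point_energies(
    send_command: Callable[[str], None],
    scatter_positions: Callable[[Sequence[float]], None],
    read_energy: Callable[[], float],
    configurations: Sequence[Sequence[float]],
) -> List[float]:
    """Evaluate one energy per configuration with a full setup on each call.

    All three callables are hypothetical wrappers around the LAMMPS library
    interface; each configuration is a flat list of coordinates.
    """
    energies: List[float] = []
    for flat_xyz in configurations:
        scatter_positions(flat_xyz)     # push the new coordinates into LAMMPS
        send_command("run 0 post no")   # pre defaults to yes: full re-initialization
        energies.append(read_energy())  # e.g. read the potential energy compute
    return energies
```

This does not remove the setup cost; it only gives a correct baseline to compare any more aggressive variant against.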

One possible reason for the problems could be that a spatial sorting of the atoms also happens during the first neighbor list build, so you may want to turn that off depending on how you communicate your updated positions. In general, what you are doing is tricky business: LAMMPS cannot know what needs to be updated, since you are only using parts of its per-timestep processing, so it is your job to communicate that. Unfortunately, the granularity is rather coarse (not surprisingly). In the end, the cleaner approach is to integrate your external processing into LAMMPS, e.g. as a custom run style or a custom fix style.
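
To illustrate why the sorting matters: once atoms are spatially sorted, the internal storage order no longer matches the order in which you created them, so per-atom data read in internal order has to be mapped back by atom ID. Sorting can be disabled with atom_modify sort 0 0.0; alternatively, the mapping can be done explicitly, as in this small sketch. The function name is hypothetical and it assumes consecutive atom IDs 1..N.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_id_order(local_ids: Sequence[int], local_xyz: Sequence[Vec3]) -> List[Vec3]:
    """Reorder per-atom data from internal storage order to atom-ID order.

    `local_ids[i]` is the atom ID of the atom stored at internal index i.
    Assumption: IDs are consecutive and run from 1 to N.
    """
    n = len(local_ids)
    if len(local_xyz) != n:
        raise ValueError("ids and data must have the same length")
    if sorted(local_ids) != list(range(1, n + 1)):
        raise ValueError("atom IDs must be consecutive 1..N")
    ordered: List[Vec3] = [(0.0, 0.0, 0.0)] * n
    for atom_id, xyz in zip(local_ids, local_xyz):
        ordered[atom_id - 1] = xyz  # slot k holds the atom with ID k + 1
    return ordered
```

If positions are only exchanged through ID-ordered gather/scatter calls, this mapping is already handled for you and the sorting should be harmless; it matters when reading or writing the per-atom arrays directly.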

Yeah, I’ll look into other options. I’m really just trying to use LAMMPS as a force calculator given a set of positions.