Which way is better to glue with PySCF?

I see that LAMMPS has multiple ways to use other software packages to calculate the forces for MD simulations. In the python module I found the fix_external_get_force function, but it seems that it still uses the fix command way to execute external programs, which makes the utilization of python no different from directly running the binary. Or they can be combined via the server/client mode, and I see that a couple of wrapper scripts are written in python.

My question is then, is it possible that I can use pyscf to calculate the forces in a way like:

import pyscf
from lammps import lammps
lmp = lammps()

# lammps reads the input file
# lammps generates the current step geometry in a python string (xyz format)
# pass the string to build a pyscf molecule and calculate the forces
# pass the forces back to lammps to continue the MD step

If not possible, I feel that both the fix command method and the server/client mode will work, which one is preferable?

Please note that there are multiple issues at hand.

  • LAMMPS uses velocity verlet time integration, that means you need to do the callout to get the forces from an external code during startup and then in every timestep after the first (half) velocity update and after the (full) position update but before the second (half) velocity update.
  • you cannot write your forces to the LAMMPS forces array directly, you may only add them. but that can also only be done in the “middle” of a timestep because the force array will be cleared after the first (half) velocity update.
  • you have not studied the documentation carefully enough.

To do what is closest to what you are describing, you need to use fix external in pf/callback mode. Then you can define a function (either in C or in Python) that has to follow a particular format. See 2.4. The lammps Python module — LAMMPS documentation
This function then is “registered” with set_fix_external_callback(). Now LAMMPS will via fix external call that callback function at just the right time and pass the current (and properly updated) positions as numpy array and also a reference to the numpy array representation of the force array. together with the information about he number of (local) atoms, the timestep and the array of the atom IDs, you can do whatever needs to be done to compute the forces you want to compute and then copy them to the designated numpy array.
Mind you the order of atoms is not fixed, but you can access the atom IDs to put them in order. When you are running LAMMPS in parallel (not likely) then you have to account for the fact that each MPI rank will only provide positions for the atoms it “owns”.

This mechanism is the most efficient way to integrate a code via python (or via C/C++) without having to change LAMMPS code.

You are more guessing here than actually following the documentation. As explained before, you must copy the data to the storage provided by fix external to have your forces added to the atoms at the right time during a timestep. This function will work with fix external in pf/array mode and provides direct access to the storage for the forces in the fix.
Unlike with the callback mechanism there is nothing that triggers the execution of your external code. So either you have to find another way, e.g. via fix python/invoke and then extract the relevant information from LAMMPS using extract_atom or gather_atom library functions and then copy the resulting forces from your external code to the force array you got the reference to with the function call (which must be done in every step because the array in the fix may change). That will usually result in your forces being out-of-sync with the time integration (e.g. you have the “wrong” forces for one of the two velocity updates and/or for the position update).

There is a minimal example using both kinds of mechanism in this file from our test library: lammps/unittest/python/python-fix-external.py at 798975b8090c5cc41b434ebcbe4baa59aa395701 · lammps/lammps · GitHub