Extracted bounds not updated in python interface

alphataubio · May 3, 2025, 2:24pm

in fitsnap-reaxff, im trying to optimize as much as possible every cycle of the lammps calculator because its called millions of times for small configs. i want to avoid delete_atoms group all and recreating the atoms every time so i create an original box with [-99,99]^3 and the maximum number of atoms then i want to change the values with the extracted pointers in python interface:

    def allocate_per_config(self, configs: list):

        self._configs = configs
        ncpn = self.pt.get_ncpn(len(configs))
        if self.dipole: self._lmp.command("compute dipole all dipole fixedorigin")
        if self.quadrupole: self._lmp.command("compute quadrupole all quadrupole")

        max_atoms = max(c["NumAtoms"] for c in configs)
        self._lmp.create_atoms( n=max_atoms, id=None,
            type=(max_atoms * c_int)(*([1] * max_atoms)),
            x=(max_atoms * 3 * c_double)(*([0.0] * max_atoms * 3))
        )

        self._nlocal = self._lmp.extract_global("nlocal")
        self._boxlo = self._lmp.extract_global("boxlo")
        self._boxhi = self._lmp.extract_global("boxhi")
        self._sublo = self._lmp.extract_global("sublo")
        self._subhi = self._lmp.extract_global("subhi")
        self._type = self._lmp.extract_atom("type")
        self._x = self._lmp.extract_atom("x")
        self._q = self._lmp.extract_atom("q")

    # --------------------------------------------------------------------------------------------

    def _prepare_lammps(self):

      self._nlocal = self._data["NumAtoms"]
      #self._boxlo[:] = self._data["Bounds"][0]
      #self._boxhi[:] = self._data["Bounds"][1]
      #self._sublo[:] = self._data["Bounds"][0]
      #self._subhi[:] = self._data["Bounds"][1]

      self._lmp.reset_box(self._data["Bounds"][0], self._data["Bounds"][1], 0.0, 0.0, 0.0)

      if self.esp: 
          self._lmp.command("compute esp all esp/grid spacing 1.0")
          self._esp_reference = self._lmp.numpy.extract_compute('esp', 
              LMP_STYLE_GLOBAL, LMP_TYPE_VECTOR)
          self._esp_reference[:] = self._data["ESP"][:]

      for i, (x, y, z) in enumerate(self._data["Positions"]):
          self._type[i] = self.type_mapping[self._data["AtomTypes"][i]]
          self._x[i][0], self._x[i][1], self._x[i][2] = x, y, z
          self._q[i] = self._data["Charges"][i][0]

however because of the way the bounds are extracted in library.cpp, they are not updated when the extracted values are changed in python. if i use reset_box instead then i get “Calling lammps_reset_box() not supported when atoms exist”.

doing a clear and rebuilding everything from scratch at each cycle adds ~40% overhead that im trying to avoid.

any suggestions how to get around this ?

alphataubio · May 3, 2025, 4:51pm

unless somebody else has a better idea, i went with a workaround with a custom lammps_reset_box_single() in reaxff_library.cpp that works when there’s only one proc.

see workaround to quickly reset box on one proc · alphataubio/lammps-alphataubio@2f475aa · GitHub

srtee · May 4, 2025, 3:22am

Why exactly are you having to delete and recreate all atoms??

alphataubio · May 4, 2025, 3:58pm

https://alphataubio.com/fitsnap/reaxff.html

in fitsnap-reaxff fitting, the loop goes through several configurations (eg. 1000 different systems of 50 water molecules), calculates a loss on energy/force/esp/… compared to dft reference data as a function of different reaxff parameter values to optimize a reaxff force field with QEQ, ACKS2, or QTPIE:

    def process_configs_with_values(self, values):

        if self.energy: self.sum_energy_residuals = 0.0
        if self.force: self.sum_force_residuals = 0.0
        if self.charge: self.sum_charge_residuals = 0.0
        if self.dipole: self.sum_dipole_residuals = 0.0
        if self.quadrupole: self.sum_quadrupole_residuals = 0.0
        if self.esp: self.sum_esp_residuals = 0.0
        if self.bond_order: self.bond_order_residuals = 0.0
        self._lmp.set_reaxff_parameters(self.parameters, values)
        self._charge_fix()

        for config_index, c in enumerate(self._configs):
            self._data = c
            self._prepare_lammps()
            try:
                self._run_lammps()
                self._collect_lammps(config_index)
                if self.esp: self._lmp.command(f"uncompute esp")
            except Exception as e:
                print(f"*** rank {self.pt._rank} exception {e}")
                raise e

        self._lmp.command(f"unfix {self.charge_fix.split()[1]}")
        answer = []
        if self.energy: answer.append(self.sum_energy_residuals)
        if self.force: answer.append(self.sum_force_residuals)
        if self.charge: answer.append(self.sum_charge_residuals)
        if self.dipole: answer.append(self.sum_dipole_residuals)
        if self.quadrupole: answer.append(self.sum_quadrupole_residuals)
        if self.esp: answer.append(self.sum_esp_residuals)
        if self.bond_order: answer.append(self.sum_bond_order_residuals)
        answer.append(sum(answer))
        return np.array(answer)

    def _prepare_lammps(self):

      self._nlocal = self._data["NumAtoms"]
      self._lmp.reset_box_single(self._data["Bounds"][0], self._data["Bounds"][1], 0.0, 0.0, 0.0)

      if self.esp:
          self._lmp.command("compute esp all esp/grid spacing 1.0")
          self._esp_reference = self._lmp.numpy.extract_compute('esp',
              LMP_STYLE_GLOBAL, LMP_TYPE_VECTOR)
          self._esp_reference[:] = self._data["ESP"][:]

      for i, (x, y, z) in enumerate(self._data["Positions"]):
          self._type[i] = self.type_mapping[self._data["AtomTypes"][i]]
          self._x[i][0], self._x[i][1], self._x[i][2] = x, y, z
          self._q[i] = self._data["Charges"][i][0]

in fitsnap scaling across multiple mpi ranks is done by splitting configs by ranks so basically each loop of the optimization algorithm calls thousands of small single-core lammps instances with python repeatedly doing a run 0 while varying reaxff parameters to be optimized.