Modified compute fep to appy pertubations to single atoms

qchemjax · April 14, 2023, 3:42pm

Dear all,

This is a rather specified topic since it concerns an issue with the compute fep which I modified to fit my needs better.
The modification was to change the perturb_params() function to apply the charge perturbation not by atom type but by their ID, repectively tag as it is called in the source code. This is the modified .cpp file.
compute_fep.cpp (18.6 KB)

To test if the perturbation is applied and computed correctly I came up with the following test:
I take a system, e. g. a molecule in water, in an equilibrated state and set up the compute fep that it perturbs the system from the fully interacting state λ=1.0 to the next state, say λ=0.5.
The system is then run for 0 timesteps, the perturbation is applied once and compute fep gives the energy difference between the two states. Also the potential energy of the frozen snapshot (the atoms don’t move) is calculated.
Now the system is actually adapted to λ=0.5, not only perturbed to and the potential energy of the snapshot is evaluated. Since no thermo-/barostat is applied and the run 0 does not change the atomic positions the geometry of the system remains in one state. Hence, the difference between the potential energies at the two λ-steps should match the output of compute fep.

The issue is that for some systems it does seem to work and for some it does not. For example for aspirin in water (TIP4P/2005) I seem to get the right results whereas for aspirin adsorbed in a zeolite framework it fails. (Inputs and logfiles attached)
Aspirin in TIP4P/2005:
aspirin.txt (4.7 KB)
h2otip4p.txt (437 Bytes)
hydrated_ppcp.data (248.2 KB)
in.bar_example_aspirin_aq (6.7 KB)
log.bar_example_aspirin_aq (64.1 KB)
bar_C0.5.fep (90 Bytes)
bar_C1.0.fep (90 Bytes)
bar_C0.0.fep (91 Bytes)
bar_C0.5.fep (91 Bytes)

Aspirin in zeolite:
aspirin_coeff.data (2.2 KB)
in.bar_example_aspirin_zeo (7.6 KB)
log.bar_example_aspirin_zeo (65.1 KB)
low-E_aspirin-in-MFI.lmp (41.1 KB)
bar0.5.fep (80 Bytes)
bar1.0.fep (80 Bytes)
fw_bar0.0.fep (83 Bytes)
fw_bar0.5.fep (83 Bytes)

I know that this is a very specific and individual issue but if someone would happen to have a suggestion what might go wrong here I would appreciate it very much!
Kindest regards
Jakob

srtee · April 15, 2023, 12:37am

While it’s a good idea to post your complete log files, it would also help if you summarise what’s in them, i.e. what numbers are “wrong” on which lines and what you expect them to be.

As a general wild guess, if you are changing charges, make sure you change them for ghost atoms as well as local atoms on each processor. This might explain a difference between aspirin in water (where Coulombic interactions will be primarily dipolar) and a zeolite (where I’d expect larger partial charges and significantly more point charge-point charge nature to the Coulombics).

Also, if your code is directly changing charges or pair coefficients, you can try write_data after running your compute to see if the changes are what you expect. And running on different numbers of MPI processes can help you isolate an error in your general algorithm vs an error specific to domain decomposition and parallelism, where the latter often shows up as different results when run on different number of processes.

srtee · April 15, 2023, 12:48am

This part of your code seems wrong and makes me wonder if you have some misunderstandings of atom indices in the source code:

   int maxid = atom->map_tag_max;

        for (i = 0; i < maxid; i++)
          if (atag[i] >= pert->ilo && atag[i] <= pert->ihi)
            if (mask[i] & groupbit) q[i] += delta;

You should be looping over atom arrays from 0 to either atom->nlocal or atom->nlocal + atom->nghost. What you’ve written here either skips some or all ghost atoms, or accesses arrays past bound and gives a segmentation fault, depending on the number of processes and whether there are gaps in the tag IDs.

qchemjax · April 17, 2023, 7:24am

Dear Mr. Tee,

Thanks for your reply and advice!
To clarify what is “wrong” in the logfiles: In the Themo outputs written to the logfiles (aspirin_zeo: lines 871 and 1286, aspirin_aq: lines 831 and 1249) the Potential Energy (PotEng) is written at the state λ=1.0 and λ=0.5. The output of the compute fep is written to the .fep files. From my reasoning I would expect the difference in potential energy printed in the logfile to match the energy difference that is computed by compute fep It appears to me, that this is sometimes the case, but not always.

I ran the test calculations all in serial mode, so i would not expect to have artifacts due to the mishandling of ghost atoms.

From my understanding the compute fep performs the perturbation, calculates the potential energy difference and resets the perturbed parameters all in the process of being invoked. I would expect write_data not to work here (or show me the “results” of the perturbation), since it writes the state of the system at a timestep, between timesteps, so either before or after the compute and the application of the perturbation, not during, right?

srtee · April 17, 2023, 7:41am

Famous last words. (I’ve said them myself!) Periodic images (for calculating pair and molecule interactions) are ghost atoms, so incorrect handling of ghosts (charges in particular) will cause errors even in a serial single process run.

qchemjax · April 17, 2023, 7:57am

Indeed I am not sure how atoms are counted or enumerated in the code. Also I am not quite familiar with C++ so my understanding of the code itself is limited.

The reason I chose to loop until map_tag_max is because it happens that atom IDs are not consecutive. When I place a molecule in water and delete the water molecules that overlap with the molecule the atom IDs are not updated and thus not consecutive and I could not reset them using reset_atoms id because of the TIP4P pair_style. When I try to reset the atom IDs in the system LAMMPS tell me
ERROR on proc 0: TIP4P hydrogen has incorrect atom type (src/FEP/pair_lj_cut_tip4p_long_soft.cpp:226)

Hence I thought looping until nlocal would possibly miss some atoms.

Periodic images (for calculating pair and molecule interactions) are ghost atoms, so incorrect handling of ghosts (charges in particular) will cause errors even in a serial single process run.

From your replies I now guess, that nlocal is not dependent on atom IDs but loops over all atoms owned by the processor regardless of their ID or tag. I will change this and also take the ghost atoms into account and report back.

Kind regards and thanks a lot!
Jakob

srtee · April 17, 2023, 8:21am

Check out the map and sametag members of the Atom class.

qchemjax · April 19, 2023, 8:19am

By adapting the code (in part back) to looping over nlocal and nghost

      if (pert->aparam == CHARGE) {    // modify charges
        int *atag = atom->tag;
        double *q = atom->q;
        int *mask = atom->mask;
        int natom = atom->nlocal + atom->nghost;

        for (i = 0; i < natom; i++)
          if (atag[i] >= pert->ilo && atag[i] <= pert->ihi)
            if (mask[i] & groupbit) q[i] += delta;

it works!
The potential energy difference that is calculated by compute fep now matches the difference between the “unperturbed” states! Thanks a lot!

I tried to take the adaptation one step further and wanted to apply the perturbation still by atom ID but pass Δλ to the compute instead of Δq(atom)=q(atom)*Δλ.

The code I came up with is:

 if (pert->aparam == CHARGE) {    // modify charges
        int *atag = atom->tag;
        double *q = atom->q;
        int *mask = atom->mask;
        int natom = atom->nlocal + atom->nghost;

        for (i = 0; i < natom; i++)
          if (atag[i] >= pert->ilo && atag[i] <= pert->ihi)
            if (mask[i] & groupbit) q[i] += q[i]*delta;

taking delta from the compute fep command as the v_delta argument. The compute fep command would now look like this:

compute FEPbw all fep ${TK} &
atom    charge  2989*3009       v_minusdl

which results in no perturbation at all.

Do you have an idea what this might be due to?

In principle I am satisfied with the adaptation that works now, since it fits my need to perturb charges by atom ID, but I think it would make the code a bit more neat to be able to directly pass the Δλ for the perturbation instead of having to calculate the variable in the input script before.

Again, I want to thank you very much for your advice, for the already given and beforehand!

srtee · April 19, 2023, 8:40am

Here’s how:

Declare a “status” in the header, like int delta_is_equalstyle
Adjust the input parser to read a v_ string as the variable name and set the “status” variable to 1.
Look up the variable name and error out if it’s invalid.
When delta is about to be used, compute the variable value and assign it to delta (if the “status” variable is 1).

It takes a little while but is fairly straightforward. Look at the source code of a fix you understand to see how it works – I recommend fix gravity or fix addforce; look for the v_ string in the source code to see how the variable name is read in and then track how the surrounding variables are used throughout the code.

Really, someone should automate this, but you can take the lack of automation to be a good sign – it’s easy enough to do manually that manual implementation hasn’t really dragged down the dev process.