Neighbor list problems

Hi,

as I was preparing my EAM model driver for submission into the KIM repository I encountered a problem which I couldn't solve. It concerns small boxes and periodic boundary conditions. Maybe it's a trivial one, but currently I'm stuck.

I attached a small 1D sketch which should describe the problem.

There is a simulation box with length 5 containing 2 atoms (index 0 and 1). They are sitting at x=0 and x=2, the cutoff is assumed to be 4.

The problem is the calculation of the electron density for the ghost atoms 2 and 5.

To calculate this for atom 2, one would need the contribution from atom 3. But this is not available since atom 2 is not contributing and has no neighbors. With periodic boundaries this however should not be a problem because atom 2 is the same as atom 0. So the contribution from 3 to 2 is the same as from 1 to 0. What happens with the KIM API and NEIGH_PURE and NEIGH_RVEC (half and full) neighbor lists is the following:

atoms 0 and 1 have the correct density assigned
atom 2 is missing the contribution from atom 3 (or 0)
atom 5 is missing the contribution from atom 4 (or 1)

This is not a problem for the energy, but the forces will come out wrong.

A simple solution is to map the neighbors back to the original atoms,
making the calculation of my example possible with only 2 atoms.
This means, that if I access atom 2 I actually am accessing atom 0.

Is there a way to "map" ghost atoms to real atoms? I am thinking of a function which gives you the contributing atom correspoding to a ghost atom. In my case something like an array which maps them back would be great so I could simply do

   atom_rho[remap[2]] += some_value

where remap[2]=0, because atom 2 is the same as atom 0.

Or is there any other way to calculate the correct density for atoms 2 and 5 without knowing anything about the size of the simulation box and its geometry?

Any help is very much appreciated.

Best regards,
Daniel

Sketch.png

Dear Daniel,

I had a discussion with Ryan on exactly this topic during the meeting. This identification should be done BY THE TEST after the force calculation.

To put it in another way: The role of the model is NOT to calculate the forces on all contributing atoms, but to calculate the derivative of the sum of the energies of the contributing atoms with respect to the coordinates of ALL atoms.

In your case, there are the following terms contributing to the force on atom 1:
1) The change in energy of atom 0 because atom 1 is moving.
2) The change in energy of atom 1 because it is moving.
3) The change in energy of atom 0 because atom 5 (an image of atom 1) is moving. Or equivalently, the change in energy of atom 2 (an image of atom 0) because atom 1 is moving.

The third term is causing us trouble. What you and I were expecting is that we need to calculate the change in energy of atom 2, and add that term to the force on atom 1. Instead, we should calculate the change of energy of atom 0 because 5 is moving, and return it as a force contribution to atom 5. It is then the responsibility of the simulator/test to know that atom 5 is really atom 1, and "move" the contribution there. Ryan and Ellad: Is this correctly understood?

The point of doing it this way is that it will automagically work in parallel simulations as well, where atom 5 may be on a separate processor instead of being a ghost of atom 1. The model does not have to know about this, the parallel simulator will be able to collect the forces by communicating.

This should work with potentials up to the level of EMT/EAM, where the energy (NOT the force) of an atom depends only on quantities (here: electron density) calculated for that atom. It appears to be more tricky if the potential is more complicated, if for example the energy of an atom also had a term including the electron density of neighboring atoms. I do not know if any such potential exists (bond order potentials, perhaps?) but it would require an extension of OpenKIMs current ghost atom mechanism.

The price is that it will only work because the neighbor list is full, explaining why the RVEC_H neighbor list does not exist. Unfortunately, with potentials such as EMT/EAM and only a single element present in the simulation, all pairwise contributions to the electron density and similar sums are then calculated twice instead of once, causing close to a factor two performance penalty. A possible solution might be an RVEC_H list, where neighbor pairs between contributing atoms only appear once, but pairs between contributing and ghost atoms all appear. Hmm, I will have to think about this, it is not trivial if that would indeed work. :slight_smile:

Best regards

Jakob

Hi again!

Just a follow-up to my previous email. I almost managed to convince myself that my previous email was wrong, and that it would not work, because there would be extra terms in the force on the ghost atoms. But there is an additional subtlety that saves the day.

The energy of an atom is defined by its neighbor list, and the full neighbor lists are NOT necessarily symmetric: if there are ghost atoms, the ghost atoms will not have any neighbors.

So when you calculate the force on a ghost atom, there will only be contributions from the change in energy of the real atoms, computed when you traverse the real atom's neighbor list (but added to the ghost atom's force: force contributions should always be added to the neighbor, not to the atom itself, as it is a change of this atoms energy when the neighbor moves, and thus a force on the neighbor). The "self-contribution" of the ghost atom will be zero, as it has no neighbors its energy is constant (I think OpenKIM requires that it is zero, but I am not sure if that is really a requirement).

Jakob

Hi Jakob,

thank you very much for your answer. Assuming what you say is correct, then I personally see no way of implementing EAM potentials within the current version of the KIM API.

The reason for this is rather simple. The energy and pair forces can all be calculated additively. This is not possible for the forces from the embedding term. You have to know the complete density of every neighbor to calculate the gradient at the neighbor position. If the embedding function were linear, splitting these contributions would be possible, but this is never the case.

If it were possible to split up the contributions the way you described it, then there would be no need to do two consecutive loops to calculate EAM energies and forces.

Maybe I don't understand this way of parallelization completely and there are some mistakes at my end.
Is there any documentation for this available on the openKIM website?

Let's wait and see what Ryan and Ellad have to say.

Best regards,
Daniel

Hello Daniel and Jakob (and the rest of the KIM community),

It is great to see some activity on the list. Thanks for putting time into interacting with KIM and the community.

I think Jakob is essentially correct (after doing only a brief read).

I'll try to describe the force computation process for Daniel's scenario:

*) Atoms 0 and 1 are contributing and all other atoms have zero neighbors
    in the neighbor list.

*) So, first the EAM Model will loop over the atoms and compute the
    density rho_i of all atoms. This will give the correct values for
    atoms 0 and 1 and zero values for the rest (since they have no
    neighbors).

*) Next the EAM Model will loop over the atoms to compute the embedding
    energy U_i and its derivative dU_i. This will give correct values for
    atoms 0 and 1 and zero values (of U_i and dU_i) for the rest.

*) Then the EAM Model will loop over the atoms to compute the forces.

     For each neighbor j of atom i it computes the derivative of the pair
     term dphi_ij and apply the contribution to the forces (on atoms i and
     j) from the derivative of the energy of atom i.

     So, in this case, the loop will do the following (assuming a half
     list mode), where Rv_ij is the relative position vector and R_ij is
     its magnitude:

     atom 0)
             f_0 += (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
             f_1 += -(dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)

             f_0 += (0.5*dphi_05 + dU_0 + dU_5)*(Rv_05/R_05)
             f_5 += -(0.5*dphi_05 + dU_0 + dU_5)*(Rv_05/R_05)

       note that dU_5 = 0.0

     atom 1)
             f_1 += (0.5*dphi_12 + dU_1 + dU_2)*(Rv_12/R_12)
             f_2 += -(0.5*dphi_12 + dU_1 + dU_2)*(Rv_12/R_12)
       note that dU_2 = 0.0

*) At this stage the Model is done and returns to the Test. Now, the Test
    is responsible for combining the terms appropriately:

    Atoms 2&4 are actually atom 0, so the Test should add the forces f_2
    & f_4 to f_0

     f_0 = f_0 + f_2 + f_4
         = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
           +(0.5*dphi_05 + dU_0 + 0.0 )*(Rv_05/R_05)
           -(0.5*dphi_12 + dU_1 + 0.0 )*(Rv_12/R_12)

     noting that dphi_12 = dphi_05 and Rv_12 = - Rv_05, we have

         = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
           +(0.5*dphi_05 + dU_0 + 0.0 )*(Rv_05/R_05)
           -(0.5*dphi_05 + dU_1 + 0.0 )*(-Rv_05/R_05)

         = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
           +(0.5*dphi_05 + dU_0 + 0.0 )*(Rv_05/R_05)
           +(0.5*dphi_05 + dU_1 + 0.0 )*(Rv_05/R_05)

         = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
           +(dphi_05 + dU_0 + dU_1)*(Rv_05/R_05)

     Which is the correct value. A similar procedure of adding f_3 & f_5
     to f_1 will give the correct answer for f_1.

I hope this is clear. Let me know if you agree/disagree, or don't understand something...

Cheers,

Ryan

Dear Daniel,

Unless EAM is more different from EMT than I expect, then you should be able to calculate the forces. Or rather, you can calculate the derivatives of the energy that OpenKIM requires to calculate the force itself. In OpenKIM, the role of the model is NOT to calculate the forces (this is as you have noticed impossible when you go beyond the minimum-image convention) but to calculate the partial derivatives of the total energy with respect to the coordinates of the atoms, ASSUMING that they are independent. The chain rule makes it possible for the test to combine the derivatives with respect to ghost atoms with derivatives with respect to real atoms to construct the real force on the real atoms.

If your energy can be written as a sum over atomic energies, where each atomic energy only depends on the neighbors of that atom, then you can calculate the right derivatives. This I think is the case of EAM, where the energy is something like

E_i = F( sum_j g(r_ij) ) + sum_j h(r_ij)

where F is a function of an intermediate quantity (the electron density?), g is a function of the distance to a neighbor (contributions to the electron density?) and h is a pair potential like function.

If A and B are real atoms, and C is a ghost of A, D is a ghost of B,
arranged like this: D A B C

Then the force on A and B are
[Note that the d's in dE_tot/dr_X should be "hard" (total derivative) and should be "soft" in the rest (partial derivative)]
F_A = dE_tot/dr_A = dE_A/dr_A + dE_B/dr_A + dE_B/dr_C * dr_C/dr_A
F_B = dE_tot/dr_B = dE_A/dr_B + dE_B/dr_B + dE_A/dr_D * dr_D/dr_B
where the last terms only appears because C is a ghost of A and D of B (and dr_C/dr_A is of course = 1, etc).

Your potential CANNOT know that dr_C/dr_A = 1 and dr_D/dr_B = 1 without extra information, and therefore it CANNOT calculate the forces (as you have noticed). Fortunately, it doesn't have to. Instead, your model should return the following quantities:
[here, all derivatives should be partial]
dE_tot/dr_A = dE_A/dr_A + dE_B/dr_A (1)
dE_tot/dr_B = dE_A/dr_B + dE_B/dr_B (2)
dE_tot/dr_C = dE_B/dr_C (3)
dE_tot/dr_D = dE_A/dr_D (4)

Note that you NEVER have to calculate a quantity like dE_C/dr_X, which you cannot do as you do not have all neighbors of C!

As the TEST knows that C is a ghost of A, it can find the force on A as
F_A = dE_tot/dr_A + dE_tot/dr_C
and similar for F_B
(there are of course no F_C and F_D as they are ghost atoms)

This will work because the test also has reflected this ghost relation in the neigbor lists. They will look like this:
NB_A = [B, D]
NB_B = [A, C]
NB_C = [] (ghosts have no neighbors - this is essential!)
NB_D = []

When your potential loops over all atoms (in the second loop, where it calculates the "force" which is not the real force) it will calculate the following terms:
For atom A: dE_A/dr_A (contribution to 1; from all neighbors in NB list); dE_A/dr_B (contribution to 2 when it encounters B in the NB list); dE_A/dr_D (to 3, when encountering D)
For atom B: dE_B/dr_B (to 2; from all neighbors); dE_B/dr_A (to 1 when encountering A); dE_B/dr_C (to 3, when encountering C)
For atom C: Nothing, as there are no neigbors.
For atom D: Nothing.

As you can see, all terms are calculate, no extra terms are calculated, and the test that knows which ghost atoms are ghosts of which real atoms can prepare the neighbor lists accordingly before calling the model, and combine the terms returned to obtain the real forces on the atoms.

Best regards

Jakob

This can teach me to read ALL my emails before I start answering them :slight_smile:

I agree with you, and have just duplicated your answer in slightly different words. We say the same thing (or try to).

Well, this is sufficiently complicated that I needed to go through it anyway, just to convince myself of how I should do it in EMT. My EMT model does not yet support RVEC_F nor ghost atoms in any other way, exactly for this reason. It is extracted from a code assuming that the potential calculates the full force.

Best regards

Jakob

Hi Ryan,

thank you very much for the detailed explanation. I did not believe it but now I calculated everything explicitly for myself and it actually is correct.
I also got my model driver working with the correct forces.

Maybe there should be some kind of manual how the KIM API does handle periodicity and parallelization. As for me it was not clear that the tests are actually responsible for calculating the "real" forces and not the models.

Best regards,
Daniel

Hello KIM community,

I have another problem concerning neighbor lists. This is about the full neighbor lists (PURE and RVEC).

According to the standard these neighbor lists only need coordinates and the get_neigh method to compute energies and forces. The numberContributingParticles is not explicitly mentioned and also not present in the current implementation. It is provided by LAMMPS (courtesy?) but not by the vc_forces_numer_deriv test. So I cannot rely on that information.

So my question is, how to decide if a neighbor is contributing (real atom) or not (ghost atom). This is important because I need the full contribution of the embedding energy from ghost atoms and not only the half, as from pair contributions.

I came up with two solutions for this problem:

1) Call get_neigh() again for each neighbor, which would be really costly since I only need the number of neighbors and not the complete neighbor list.

2) Use the neighObject provided by the test. This however is not documented anywhere and I am assuming that each test can provide its own (different) neighbor object, so there is no general interface for accessing this data.

Is there another way of getting this information? Or maybe a different way of implementing this that doesn't rely this information?

Thank you very much
Daniel

Dear Daniel,

This sounds strange, are you absolutely sure you need that? Why should the energy of atom A depend on whether atom B is a ghost atom or a real atom?

Is the embedding energy not a local quantity of the atom, depending on the embedding density? The "real" atoms will then have ghost atoms contributing to their embedding densities, the ghost atoms on the other hand will have embedding density (and energy) of zero.

Are you sure you are not trying to calculate the derivative of the energy of the "wrong" atom? You should never need the derivative of the energy of a ghost atom.

You should never need to know if a neighbor of an atom is a real or a ghost atom, unless you try to use some kind of symmetry to calculate a contribution to a quantity once for two atoms, as you would do for half neighbor lists. This is no longer possible for full neighbor lists, and this will cost performance (hence models that can use such symmetry should "prefer" half neighbor lists by listing them first).

Best regards

Jakob

Hi Jakob,

let me try to make it clear by using my previous example:

This is what happens for half neighbor lists (Ryans mail):

atom 0)
  f_0 += (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
  f_1 += -(dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)

  f_0 += (0.5*dphi_05 + dU_0 + dU_5)*(Rv_05/R_05)
  f_5 += -(0.5*dphi_05 + dU_0 + dU_5)*(Rv_05/R_05)

  note that dU_5 = 0.0

atom 1)
  f_1 += (0.5*dphi_12 + dU_1 + dU_2)*(Rv_12/R_12)
  f_2 += -(0.5*dphi_12 + dU_1 + dU_2)*(Rv_12/R_12)
  note that dU_2 = 0.0

final result after summation of f_0 + f_2 + f_4:

  f_0 = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
       +(dphi_05 + dU_0 + dU_1)*(Rv_05/R_05)

This is correct and I get the correct forces.

For full neighbor lists I get:

atom 0)
  f_0 += (0.5*dphi_01 + 0.5*dU_0 + 0.5*dU_1)*(Rv_01/R_01)
  f_1 += -(0.5*dphi_01 + 0.5*dU_0 + 0.5*dU_1)*(Rv_01/R_01)

  f_0 += (0.5*dphi_05 + 0.5*dU_0 + 0.5*dU_5)*(Rv_05/R_05)
  f_5 += -(0.5*dphi_05 + 0.5*dU_0 + 0.5*dU_5)*(Rv_05/R_05)
  note that dU_5 = 0.0

atom 1)

  f_1 += (0.5*dphi_01 + 0.5*dU_0 + 0.5*dU_1)*(Rv_10/R_10)
  f_0 += -(0.5*dphi_01 + 0.5*dU_0 + 0.5*dU_1)*(Rv_10/R_10)

  f_1 += (0.5*dphi_12 + 0.5*dU_1 + 0.5*dU_2)*(Rv_12/R_12)
  f_2 += -(0.5*dphi_12 + 0.5*dU_1 + 0.5*dU_2)*(Rv_12/R_12)
  note that dU_2 = 0.0

final result after summation of f_0 + f_2 + f_4:

  f_0 = (dphi_01 + dU_0 + dU_1)*(Rv_01/R_01)
       +(dphi_05 + 0.5*dU_0 + 0.5*dU_1)*(Rv_05/R_05)

So I am missing 0.5*dU_0 and 0.5*dU_1. I can compensate this by removing the factor 0.5 in front of dU_i for all contributions from ghost atoms (i>=2). But I only can account for the full contribution if I know if j is a real or a ghost atom.

Please let me know if there is an error in my calculations.

Daniel

Hi Daniel,

Hi Jakob,

let me try to make it clear by using my previous example:

  [ … cut … ]

For full neighbor lists I get:

I think here is where it goes wrong - although I admit to being slightly in doubt.

Whenever you loop over an atom, only its own contribution to the forces should be calculated, not half its own and half the neighbor's:

atom 0)
  f_0 += (0.5 * dphi_01 + dU_0) * Rv_01/R_01
  f_1 += -(0.5 * dphi_01 + dU_0) * Rv_01/R_01

  f_0 += (0.5 * dphi_05 + dU_0) * Rv_05/R_05
  f_5 += -(0.5 * dphi_05 + dU_0) * Rv_05/R_05

atom 1)
  f_1 += (0.5 * dphi_10 + dU_1) * Rv_10/R_10
  f_0 += -(0.5 * dphi_10 + dU_1) * Rv_10_/R_10

  f_1 += (0.5 * dphi_12 + dU_1) * Rv_12/R_12
  f_2 += -(0.5 * dphi_12 + dU_1) * Rv_12/R_12

This time, you get

f_0 + f_2 = (dphi_01 + dU_0 + dU_1) * Rv_01/R_01
          + (dphi_05 + dU_0 + dU_1) * Rv_05/R_05

I hope this is correct, but I must admit that it is a bit confusing. :slight_smile:

Does this make sense?

Best regards

Jakob

Hi Jakob,

yes, that actually seems correct. I guess I still have to get used to this parallelization method.

Thank you very much for your help.

Daniel

Hi Daniel and Jakob,

I see that you have arrived at the correct solution. Nicely done.

I just wanted to add a comment.

The organizing principle here is the idea of the particleEnergy. Generally, each particle in the configuration is assumed to have an energy. The energy of particle i is defined to depend on the positions of i's neighbors only. (that is, it does not depend on the position of any other atoms.)

[ For an EAM potential we have: E_i = U_i + sum_jj(Phi(r_ij)) ]
[ where U_i is the embedding energy and jj runs over the ]
[ neighbors of atom i ]

The total energy of the configuration is then simply the sum of all particleEnergys. The forces are the derivative of the total energy with respect to the atomic positions. Etc...

So, I find it most helpful to think of the computational loops in a Model in this way:

for i=1:Nparticles
   // this loop corresponds to the energy of particle i: E_i
   for jj=1:NneighborsOfi
     // determine particle id of jj-th neighbor of atom i
     j = particleID(i,jj)

     // compute the contribution to E_i from j
     ...

     // compute dE_i/dr_{ij} and accumulate its contribution to the total
     // force values
     f_i = f_i + (dE_i/dr_{ij})( r_{ij}/r )
     f_j = f_j + (dE_i/dr_{ij})( -r_{ij}/r )

     ...
   end
end

If this idea is followed consistently you will get the right contributions...

Cheers,

Ryan