neighbor lists

So I've been thinking about my various KIM-QUIP interfaces. I think
that using KIM Models as QUIP Potentials, thereby allowing the QUIP
programs to be KIM simulators, has serious issues left. Some small
changes to Matt's neighbor list library, and waiting for it to be
included in the KIM distribution, are probably all that's necessary.
Well, that and some serious testing.

However, the interface the other way (QUIP Potentials as KIM Models)
is more problematic, because of the same types of issues that the
AIREBO people face, I think, when it comes to ghost atoms. There are
Brenner REBO type KIM models which need neighbor lists of the ghost
atoms (and I think GAP will too). Right now there's not way to do
that, right? Half lists work badly for angular models, and full lists
indicate ghost atoms by their having no neighbors. What is the
current thinking for how to deal with this? Give up and wait until we
have a proper way to indicate ghost atoms and request neighbor lists
of ghost atoms? For now implement it but give an error if any ghost
atoms are detected (which will fail needlessly if there are any
non-bonded atoms in the vacuum)?

I didn't fully appreciate this problem until the KCC, but I now think
we need a solution sooner rather than later.

          thanks,
          Noam

Excellent point.

Jim

However, the interface the other way (QUIP Potentials as KIM Models)
is more problematic, because of the same types of issues that the
AIREBO people face, I think, when it comes to ghost atoms. There are
Brenner REBO type KIM models which need neighbor lists of the ghost
atoms (and I think GAP will too). Right now there's not way to do
that, right? Half lists work badly for angular models, and full lists
indicate ghost atoms by their having no neighbors. What is the
current thinking for how to deal with this? Give up and wait until we
have a proper way to indicate ghost atoms and request neighbor lists
of ghost atoms? For now implement it but give an error if any ghost
atoms are detected (which will fail needlessly if there are any
non-bonded atoms in the vacuum)?

So, as you know, I agree that we need to transition to a scheme that separates the identification of padding particles from the neighbor list information. However, within the current system, I think the problem has a solution (unless I misunderstand):

You say that Models, such as the ones mentioned above, need the neighbors of padding particles in order to correctly compute the energy (and by differentiation the forces).

Based on this statement, I believe that the problem is that the Model's cutoff radius is being incorrectly specified. If the energy of particle i depends on its neighbors j (r^{ij} < r_1) AND in order to complete the calculation you also need to know the neighbors of j: k (r^{jk} < r_1), then in fact, the actual Model cutoff (as defined by the KIM API) is r_2 = 2*r_1.

With this specification of the Model cutoff, a domain-decomposition scheme will ensure that all padding j (neighbors of i) particles will have enough surrounding padding particles k (neighbors of j) to ensure that j has all its neighbors.

Now, it is true that the neighbor list for particle j will be empty. However, the neighbor list for particle i will contain j and all of its neighbors. So, although it will be somewhat inefficient, it is straight forward to find the neighbors of j. Just loop over the neighbors of i (with index k) and test the relative position vector between j and k.

Ryan

So, as you know, I agree that we need to transition to a scheme that separates
the identification of padding particles from the neighbor list information.
However, within the current system, I think the problem has a solution (unless
I misunderstand):

I know a plan is in the works, but I’m finding this discussion to be very
enlightening, and I think it’s certainly revealing to me some aspects of
any proposed improved scheme that I didn’t appreciate before.

You say that Models, such as the ones mentioned above, need the neighbors of
padding particles in order to correctly compute the energy (and by
differentiation the forces).

Based on this statement, I believe that the problem is that the Model’s cutoff
radius is being incorrectly specified. If the energy of particle i depends on
its neighbors j (r^{ij} < r_1) AND in order to complete the calculation you
also need to know the neighbors of j: k (r^{jk} < r_1), then in fact, the
actual Model cutoff (as defined by the KIM API) is r_2 = 2*r_1.

With this specification of the Model cutoff, a domain-decomposition scheme will
ensure that all padding j (neighbors of i) particles will have enough
surrounding padding particles k (neighbors of j) to ensure that j has all its
neighbors.

So this is where I just realized one thing I hadn’t realized before.

Really, the way KIM does neighbors, there are two cutoffs. There is one
cutoff for the neighbor list itself, and another cutoff for the ghost atoms.
The first is needed for direct interactions, i.e. things that have to be explicitly
looped over (I’ll call these immediate neighbors), e.g. atoms within
the interaction cutoff for LJ or Stillinger-Weber, or accumulating atomic bond
orders for REBO-type potentials. The new (to me) thing is that for
ghost atoms there’s a distinct and potentially larger cutoff, which includes
indirect interactions like neighbors of neighbors, where you need neighbors
of ghost atoms to calculate information about the ghost atoms, but you don’t
directly need to loop over an atom and its non-immediate neighbors. I’ll
call these extended neighbors.

Obviously you can always specify the larger cutoff and then, in the loops
that require only immediate neighbors skip atoms that are too far. But this
will increase computing time (perhaps trivially) and increase the memory
demands of the neighbor lists. For pair-type interactions the extended
cutoff is double the immediate cutoff, and then 7/8 of the neighbors (in 3D)
are too far. For dihedrals you need 3rd neighbors, and if you multiply
your cutoff by 3, 26/27 of the neighbors are too far.

Now, it is true that the neighbor list for particle j will be empty. However,
the neighbor list for particle i will contain j and all of its neighbors. So,
although it will be somewhat inefficient, it is straight forward to find the
neighbors of j. Just loop over the neighbors of i (with index k) and test the
relative position vector between j and k.

So I’ve been thinking about whether I can implement this for QUIP, and I
think I agree that it’s possible, but it’s not as simple as you make it sound.
In particular, the extended neighbor list for particle i is not guaranteed to
include all the immediate neighbors of particle j. I think it is true that the
union of all extended neighbor lists of all particles {i} that have j as a neighbor
includes all the neighbors {k} of particle j, but any one of them is not
guaranteed to do so.

For the benefit of anyone else who needs to deal with this in the future,
I’ve attached a picture to show the problem. j should have k and k’ as its
immediate neighbors. However, k is only in the extended neighbor list of i,
and k’ is only in the extended neighbor list of i’. So you have to create the
immediate neighbor list of j from the union of the extended neighbor lists
of i and i’. A further problem is illustrated by j and k’’. They are both in
the extended neighbor lists of both i and i’, which means that a naive
algorithm will add the j-k’’ neighbor relation twice. You therefore have to
check for duplicate neighbors, which in the case of rvec and the possibility
of multiple images means you have to search through the list of neighbors
of j and test the identity and the relative vector of any neighbor to make
sure it hasn’t already been added. I think I can do this for QUIP, but
it won’t be efficient. Probably won’t matter for GAP, but for a cheaper
potential, it could be an issue.

Noam

neighbor_of_ghost.pdf (6.96 KB)

So, as you know, I agree that we need to transition to a scheme that
separates
the identification of padding particles from the neighbor list
information.
However, within the current system, I think the problem has a solution
(unless
I misunderstand):

I know a plan is in the works, but I'm finding this discussion to be very
enlightening, and I think it's certainly revealing to me some aspects of
any proposed improved scheme that I didn't appreciate before.

Yes! This is a helpful discussion and your explainations and comments will be a good documentation of these issues. Thanks.

You say that Models, such as the ones mentioned above, need the neighbors
of
padding particles in order to correctly compute the energy (and by
differentiation the forces).

Based on this statement, I believe that the problem is that the Model's
cutoff
radius is being incorrectly specified. If the energy of particle i
depends on
its neighbors j (r^{ij} < r_1) AND in order to complete the calculation
you
also need to know the neighbors of j: k (r^{jk} < r_1), then in fact, the
actual Model cutoff (as defined by the KIM API) is r_2 = 2*r_1.

With this specification of the Model cutoff, a domain-decomposition scheme
will
ensure that all padding j (neighbors of i) particles will have enough
surrounding padding particles k (neighbors of j) to ensure that j has all
its
neighbors.

So this is where I just realized one thing I hadn't realized before.

Really, the way KIM does neighbors, there are two cutoffs. There is one
cutoff for the neighbor list itself, and another cutoff for the ghost atoms.
The first is needed for direct interactions, i.e. things that have to be
explicitly
looped over (I'll call these immediate neighbors), e.g. atoms within
the interaction cutoff for LJ or Stillinger-Weber, or accumulating atomic
bond
orders for REBO-type potentials. The new (to me) thing is that for
ghost atoms there's a distinct and potentially larger cutoff, which
includes
indirect interactions like neighbors of neighbors, where you need neighbors
of ghost atoms to calculate information about the ghost atoms, but you don't
directly need to loop over an atom and its non-immediate neighbors. I'll
call these extended neighbors.

Obviously you can always specify the larger cutoff and then, in the loops
that require only immediate neighbors skip atoms that are too far. But
this
will increase computing time (perhaps trivially) and increase the memory
demands of the neighbor lists. For pair-type interactions the extended
cutoff is double the immediate cutoff, and then 7/8 of the neighbors (in 3D)
are too far. For dihedrals you need 3rd neighbors, and if you multiply
your cutoff by 3, 26/27 of the neighbors are too far.

I guess you mean pair-functional-type interactions that need 2r (EAM, Glue, etc.)? Plain pair potentials don't need additional padding particles to perform the calculation correctly (I believe)....

Now, it is true that the neighbor list for particle j will be empty.
However,
the neighbor list for particle i will contain j and all of its neighbors.
So,
although it will be somewhat inefficient, it is straight forward to find
the
neighbors of j. Just loop over the neighbors of i (with index k) and test
the
relative position vector between j and k.

So I've been thinking about whether I can implement this for QUIP, and I
think I agree that it's possible, but it's not as simple as you make it
sound.
In particular, the extended neighbor list for particle i is _not_
guaranteed to
include all the immediate neighbors of particle j. I think it is true that
the
union of all extended neighbor lists of all particles {i} that have j as a
neighbor
includes all the neighbors {k} of particle j, but any one of them is not
guaranteed to do so.

For the benefit of anyone else who needs to deal with this in the future,
I've attached a picture to show the problem. j should have k and k' as its
immediate neighbors. However, k is only in the extended neighbor list of
i,
and k' is only in the extended neighbor list of i'. So you have to create
the
immediate neighbor list of j from the union of the extended neighbor lists
of i and i'. A further problem is illustrated by j and k''. They are both
in
the extended neighbor lists of both i and i', which means that a naive
algorithm will add the j-k'' neighbor relation twice. You therefore have
to
check for duplicate neighbors, which in the case of rvec and the
possibility
of multiple images means you have to search through the list of neighbors
of j and test the identity and the relative vector of any neighbor to make
sure it hasn't already been added. I think I can do this for QUIP, but
it won't be efficient. Probably won't matter for GAP, but for a cheaper
potential, it could be an issue.

The figure is VERY helpful. Thanks!

However, I would argue, that it is not quite relevant. You have focused on a particle j that is an "extended neighbor" (in your terminology) of i. However, I don't believe that it is necessary to have all "immediate neighbors" of the "extended neighbors". Instead, what is needed, is to be able to obtain all "immediate neighbors" of j where j is an "immediate neighbor" of i.

That is, The calculation of energy and (KIM API defined) forces for particles i and i' should not need to have access to all "immediate neighbors" of the "extended neighbor" j.

If this is right, then moving j inside the first ring around i will ensure that all "immediate neighbors" of j are contained (as "extended neighbors") within the neighbor list of i.

Yes? No?...

Ryan

I guess you mean pair-functional-type interactions that need 2r (EAM, Glue,
etc.)? Plain pair potentials don’t need additional padding particles to
perform the calculation correctly (I believe)…

Why do EAMs need 2r? I thought the energy on a site was dependent
only on its immediate neighbor (via the pair term) and its own density
(via the embedding term), and the density is only dependent on its
immediate neighbors.

REBO type models definitely depend on the bond orders of neighboring
pairs of atoms, and the bond order of each atom depends on its immediate
neighbors. This can also be the case for models with angle and dihedral
interactions, depending on how they define the local energy.

I’m going back on what I said about GAP, though. I think GAP will work
with a single cutoff.

The figure is VERY helpful. Thanks!

However, I would argue, that it is not quite relevant. You have focused on a
particle j that is an “extended neighbor” (in your terminology) of i.
However, I don’t believe that it is necessary to have all “immediate neighbors”
of the “extended neighbors”. Instead, what is needed, is to be able to obtain
all “immediate neighbors” of j where j is an “immediate neighbor” of i.

That is, The calculation of energy and (KIM API defined) forces for particles i
and i’ should not need to have access to all “immediate neighbors” of the
“extended neighbor” j.

I guess you’re right that the solution you proposed, i.e. starting from an
extended neighbor cutoff (= 2 * direct cutoff), is sufficient if all you need are
immediate neighbors of immediate neighbors, and I agree that that’s all you
should need, at least for REBO-type. I have to think further about
models that use dihedrals, but I think that should work too if you triple
the cutoff. I’ll try to code it up, and if this is wrong, I’ll figure it out when
my implementation breaks.

Noam

I guess you mean pair-functional-type interactions that need 2r (EAM,
Glue,
etc.)? Plain pair potentials don't need additional padding particles to
perform the calculation correctly (I believe)....

Why do EAMs need 2r? I thought the energy on a site was dependent
only on its immediate neighbor (via the pair term) and its own density
(via the embedding term), and the density is only dependent on its
immediate neighbors.

Yes! Right you are. So, as the KIM API defines it, pair potentials and EAM-like potentials only need the one r cutoff.

REBO type models definitely depend on the bond orders of neighboring
pairs of atoms, and the bond order of each atom depends on its immediate
neighbors. This can also be the case for models with angle and dihedral
interactions, depending on how they define the local energy.

I'm going back on what I said about GAP, though. I think GAP will work
with a single cutoff.

Good!

The figure is VERY helpful. Thanks!

However, I would argue, that it is not quite relevant. You have focused
on a
particle j that is an "extended neighbor" (in your terminology) of i.
However, I don't believe that it is necessary to have all "immediate
neighbors"
of the "extended neighbors". Instead, what is needed, is to be able to
obtain
all "immediate neighbors" of j where j is an "immediate neighbor" of i.

That is, The calculation of energy and (KIM API defined) forces for
particles i
and i' should not need to have access to all "immediate neighbors" of the
"extended neighbor" j.

I guess you're right that the solution you proposed, i.e. starting from an
extended neighbor cutoff (= 2 * direct cutoff), is sufficient if all you
need are
immediate neighbors of immediate neighbors, and I agree that that's all you
should need, at least for REBO-type. I have to think further about
models that use dihedrals, but I think that should work too if you triple
the cutoff. I'll try to code it up, and if this is wrong, I'll figure it
out when
my implementation breaks.

Great. Sounds right to me.

Cheers,

Ryan