Tags of all atoms in neighbor lists.

Hello,

Sorry if this sounds trivial, but I’ve got to get this up and running fast and I need to make sure I’m on the right track.

Background:

I’m building a many body pair potential that computes the energy iff the i and j atoms are the number 2 atom of a bond angle. I also need to determine the arms that make up the bond angle plane and compute normals to those arms for each i and j. I’d like to do this before I actually start to compute the pair potentials to avoid recalculating vectors, cross products and dot products.

Questions:

  1. What is the most efficient way, given an atoms local id or tag, to determine the bond angle id for which the atom is bond angle central atom?

  2. Can I just iterate from 0 to (inum + gnum) to build up a my precomputed arrays rather an have to iterate through al i/j lists; or are the ghost local ids not necessarily incremental integers above inum?

TIA

Robert

Hello,

Sorry if this sounds trivial, but I’ve got to get this up and running fast
and I need to make sure I’m on the right track.

Background:

I’m building a many body pair potential that computes the energy iff the i
and j atoms are the number 2 atom of a bond angle. I also need to determine
the arms that make up the bond angle plane and compute normals to those arms
for each i and j. I’d like to do this before I actually start to compute
the pair potentials to avoid recalculating vectors, cross products and dot
products.

the best way to avoid re-inventing the wheel or mistakes or coming up
with inefficient solutions, is to study the existing code. mutliple
manybody potentials maintain "close neighbor" lists that are generated
from the existing neighbor list for quickly finding candidates for
angles and dihedrals on the fly, e.g. AIREBO. others precompute
per-atom properties and move them along including the require forward
and/or reverse communication, e.g. EAM.

Questions:

1) What is the most efficient way, given an atoms local id or tag, to
determine the bond angle id for which the atom is bond angle central atom?

doesn't exist. all manybody potentials in LAMMPS do not use explicitly
defined angles or bonds. those are computed on-the-fly, since they are
dynamic and can change with reactive potentials. topological angles
remain.

2) Can I just iterate from 0 to (inum + gnum) to build up a my
precomputed arrays rather an have to iterate through al i/j lists; or are
the ghost local ids not necessarily incremental integers above inum?

local indices are dense, but not all of them may be applicable. some
might be excluded from the neighbor list. thus you always have to loop
over the neighbors. hence the processing of the neighbor list to build
custom lists for looping over bonded interactions as i mentioned
above.

axel.

Hi Alex,
I'll look at those two potentials.
I might have given you the wrong impression for question 1. In the data
file I defined an angle between atoms a, b, and c. Now these atoms happened
to be a c-alphas of a protein backbone and the bonds are held rigid. So
basically what I want to do is given atom b from the neighbor list I want to
look up the coordinates of atoms a, b and c. Isn't this what the
angle_atom1, angle_atom2, angle_atom3 lists are for?
Shouldn't I be able to use the tag of b to look up the tag of the angle from
angle_atom2 and then use the tag of the angle to get a's tag from
angle_atom1 and c's tag from angle_atom3 and then use their tags to get
their coordinates?
Maybe I've completely missed the boat on those lists.

Rob

Hi Alex,
I'll look at those two potentials.
I might have given you the wrong impression for question 1. In the data
file I defined an angle between atoms a, b, and c. Now these atoms happened
to be a c-alphas of a protein backbone and the bonds are held rigid. So
basically what I want to do is given atom b from the neighbor list I want to
look up the coordinates of atoms a, b and c. Isn't this what the
angle_atom1, angle_atom2, angle_atom3 lists are for?
Shouldn't I be able to use the tag of b to look up the tag of the angle from
angle_atom2 and then use the tag of the angle to get a's tag from
angle_atom1 and c's tag from angle_atom3 and then use their tags to get
their coordinates?
Maybe I've completely missed the boat on those lists.

there is an overall problem here. you ask about technical details but
fail to give the overall picture. all of what you describe here and in
the previous e-mail can be conveniently implemented in an angle style
where you don't even have to do the lookups your describe, but get a
ready-to-use list of angle triples, so please explain why this would
be needed as part of a non-bonded force/energy computation?

axel.

Shouldn’t I be able to use the tag of b to look up the tag of the angle from
angle_atom2 and then use the tag of the angle to get a’s tag from
angle_atom1 and c’s tag from angle_atom3 and then use their tags to get
their coordinates?

yes, you can do this. The atom->map() method converts

a global atom ID to a local index, which can then be used

to access the atom’s coords, etc. This is how the angle

classes work. Neighbor::angle_all() is where the map()

calls are made to build a list of local indices, once per reneighbor.

Steve

Hi Axel and Steve,

Thanks for taking the time with me on this. I did speak to my supervisor on this be coming to the group but he has no experience with parallel systems, this is our first attempt. Basically he said if I can’t sort it all out today to just go serial so everything will have a local ID and come back to parallel in a couple of months.

The MB model is different as we only have a single bond arm per molecule, not all exponentials are Gaussians, the LJ term is gone and being hybrided with another custom LJ. i-1 and i+1 are not potential bonding arms, but merely residues bonded to i. i and j are arbitrary. The is no torque term at all. I’ve attached a schematic of the situation.

I know that if j is a ghost particle it will have a local id and I can determine its position, and if it’s atom 2 of an angle; but what if tag[j] -1 and tag[j] + 1 are not local or ghosts, then they won’t have a local id. So how can I determine their positions from only their tags?

Am I worrying about something that will likely never happen because residue bond lengths are short if j is a ghost it’s bonded neighbors will also be ghosted?

Rob

ModifiedMB.png

If j is a ghost atom, then in serial, there will be an owned atom that

corresponds to j, which you can find via atom->map(tag[j]).

In parallel, there may be no such owned atom.

This is an important distinction b/c ghost atoms do not store

info that will let you figure out whether j is a 2 atom in an angle.

Only owned atoms store atom_angle2, etc.

The “right” way to do this in parallel, is to communicate that

info to ghost atoms, once per reneighboing (since it’s static info),

which your pair style could do.

Once you know that I,J is an interaction you want to compute,

you can find J-1 and J+1 by their tags, assuming you know the

tags either b/c they are consecutively numbered, or b/c you

also communicated atom_angle1 and atom_angle3.

If your non-bond cutoff is long enough, J-1 and J+ 1 will

be ghosts as well (or could be owned), or you can insure that

using comm_modify cutoff to increase the cutoff within which

ghost atoms are communicated.

Whether in serial or parallel, you will need to be careful

that are using a i-1,i,i+1 that are “close” to each other, and

ditto for j-1,j,j+1. There may be several images of any of these

atoms owned by a proc, and just looking them up via atom->map()

does not guarantee you found the image you need.

You should look at the domain->closest_image() method,

as used by Neighbor:angle_all() to see how it solves

this problem.

Steve

Hi Axel and Steve,

Thanks for taking the time with me on this. I did speak to my supervisor on
this be coming to the group but he has no experience with parallel systems,
this is our first attempt. Basically he said if I can’t sort it all out
today to just go serial so everything will have a local ID and come back to
parallel in a couple of months.

sorry, but that is bad advice. LAMMPS is designed from the ground up
to be parallel and even in serial mode it uses a parallel library
"stub" to be able to use the same code base. furthermore, the advice
assumes that LAMMPS way to handle coordinate follows minimum image
conventions, which is not correct. so even in serial mode, you can
have multiple copies of the same atom around. using Atom::map() will
give you only one of them, usually the one that is owned by the
domain. yet, that may not be the one you need, if one or more of your
tuple of angle atoms crosses a domain boundary.

if you ignore the basic management strategies in LAMMPS, you can
indeed get a serial code working, but you will have to spend time to
program features that you need which would otherwise automatically be
provided by LAMMPS and on top of that, you will have a *much* harder
time to make it work in parallel, because you will mostly have to undo
and then properly redo what you did for the serial version, as that
will not work in parallel (or will kill performance).

The MB model is different as we only have a single bond arm per molecule,
not all exponentials are Gaussians, the LJ term is gone and being hybrided
with another custom LJ. i-1 and i+1 are not potential bonding arms, but
merely residues bonded to i. i and j are arbitrary. The is no torque term
at all. I’ve attached a schematic of the situation.

ok. so i would say that calling this an MB model is a bit of a
misnomer, but that is besides the point of the discussion.

if i read the picture correctly, you have an additional energy term
between pairs of topological angles that depends on the distance
between the pairs of atoms i and j and the relative orientation of the
vector connecting the pair to the plane formed by the three atoms
comprising the respective angles (BTW: there seems to be a typo in the
definition of n_j).

this is indeed best implemented as a pair style to be used with
pair_style hybrid/overlay to add it to the regular interaction. as far
as i can see, there are a number of challenges to get this to work
well in LAMMPS (and that is mostly regardless of whether you try to
make it work in parallel or not).

1) are only r_ij pairs considered between different molecules or can
they also be in the same molecule? consider the case of a very long
chain: if this bends around, it will behave pretty much as if it was a
different molecule. in this case, one would have to determine what
would be a minimum distance between atom i and j on the same chain.
with the default exclusion settings (lj/coul 0.0 0.0 0.0) i,j pairs in
the same molecule will only show up in the neighborlist (or have a
non-zero scaling factor), if j is at least 4 atoms away from i, i.e.
there is one atom in between the two angles. this is probably not a
big deal in the beginning, but you should keep it in mind.

2) you have to take care of the different cases for the "newton" flag,
which can be on/off for both non-bonded and bonded interactions. this
affects what information is stored on ghost atoms and whether
contributions to force and energy need to be stored on ghost atoms.
the default of LAMMPS is "on" for both flags, but in your case, the
situation is likely going to be much easier when you require newton
off for both non-bonded pair and bonded interactions. programming for
the other cases can be done later.

3) it is indeed quite likely that would improve performance by
precomputing the normals for each atom triple before doing the force
loop. you can store it with the corresponding atom and then use a
forward communication to communicate it to its respective ghost atom
copies (either on the same or other processors). as an example, you
could look at the pair_eam.cpp file, where a single property (fp) is
stored with each atom. it is easy to extend this to have multiple
properties in there, you just need to adapt the
pack/unpack_forward_comm() functions accordingly. it may be simpler,
though, to first implement this without this optimization. once you
have a pair style that works correctly, you can consider all kinds of
optimizations but you will always have a reference to compare to.
(p.s.: you'll be having some serious "fun" deriving forces for this
potential...).

4) there are several optional methods that LAMMPS requires for special
cases (distance based r-RESPA, group/group energy/force computation
and more). these are implemented in many pair styles, but are not
required for running basic MD. so it is better to postpone those (e.g.
compute_inner/middle/outer, single) for later.

5) identifying the neighboring atoms should not make any assumptions
of the tag numbers being ordered or local atom indexing being done in
order. steve already mentioned that atom->map() will give you one of
the possible local/ghost indices, but that may not be the one you
need. that mapping also may change due to atoms migrating between
domains and local atoms being stored in a spatially sorted fashion
(for better performance). this happens during the re-neighboring
steps. more details are in steve's e-mail, so i won't repeat them
here.

in general, this is not a trivial model to add to a large package
program and i am a bit shocked by the rather cavalier attitude of your
adviser. there is so much badly written code floating around that is
causing increasing amounts of headaches to people, spending the time
to sufficiently understand the flow of control in a package program
requires time and some experimentation, but it is fully worth the
time, since afterwards you'll benefit a lot from the available
infrastructure and have something that is much easier to maintain and
improve.

HTH,
    axel.

Hi Axel & Steve,

Thank for your help. This will go a long way to doing it correctly.
I did have a lot of "fun" deriving the force, but it actually worked out to
be not too bad, a lot of terms go to zero.

I agree, my 5 year of software quality assurance and 5+ years of software
development experience are telling me to do it right from the start.
Unfortunately, I'm entered into a student competition at a conference in 6
weeks and the need for data is telling me to hack and fix later. Although
what you've given me may allow me to do it right, or at least less hacky.

I won't pester you anymore, for now.

Thanks again.
Rob

Also, I Yap et. Al are the ones I was taking my terminology from Proteins
2008 30:626-638.