electrostatics in the KIM API: plan and open questions

The issue of extending the KIM API to support long-range electrostatic interactions and charge equilibration was discussed with a group of experts at a working dinner at the SES Annual Technical Meeting at Purdue University on Wednesday 01-Oct-2014. Additional discussions with the KIM development team and Axel Kohlmeyer were held on 15 and 16-Oct-2014. Based on these discussions, the following plan has been devised. We would welcome your input on this plan. We also have specific questions marked below by "QUESTION" for your attention.

PLAN FOR EXTENDING THE KIM API TO SUPPORT LONG-RANGE ELECTROSTATICS AND CHARGE EQUILIBRATION

1. The need for electrostatics and/or charge equilibration is considered part of the Model definition.

While it is true that the electrostatics physics is independent of a particular model, it is the case that some models choose to include this physics and others do not. For this reason it is considered part of the Model. A Model must specify whether it requires such calculations to be performed in its ".kim" descriptor file.

2. Electrostatics

    * Simulators/Models must specify what type(s) of electrostatics support they
      provide/require by specifying an ordered list of supported methods. There
      will initially be four methods: `None', `ShortRange', `LongRange',
      `QEqShortRange'. (In the future additional types, such as
      split charge (SQEq); etc., may be added.)

      For example, a Simulator that can support all methods, but prefers
      long-range, might list:

        ElectroStaticsMethod1 := LongRange
        ElectroStaticsMethod2 := ShortRange
        ElectroStaticsMethod3 := QEqShortRange
        ElectroStaticsMethod4 := None

      A Model that requires QEq will list:

        ElectroStaticsMethod1 := QEqShortRange

      A Model that only works with one method would list:

        ElectroStaticsMethod1 := QEqShortRange

      The KIM API matching process will start at the top of the Model's
      ElectroStaticsMethods list and search through (from top to bottom) the
      Simulator's list until a match is found. If no match is found the Model
      and Simulator are incompatible. The KIM API will provide a function that
      allows the Model/Simulator to obtain the value of the matched method.

    * If a Simulator/Model lists anything other than `None', it must also
      include a "charges" argument in the Model Input section of its ".kim" file.

    * If `ShortRange' is successfully matched, then the Simulator must provide
      values in the charges array and the Model must use these values to compute
      electrostatic contributions (to the energy, forces, etc.) using a short
      range approximation of the Coulomb effect. The Model must include these
      contributions in its values for energy, force, etc. Thus, the Model will
      implement the total configuration energy function as:

        E = sum_i [ E_i^pot(r^{ij}, charges[j]) + E_i^c(r^{ij}, charges[j]) ],

      where E_i^pot is the (possibly charge dependent) "standard" short-range
      potential and E_i^c is the short-range approximation to the Coulomb
      interaction energy. The other Model output arguments (forces, virial, etc.)
      are derived as expressions in terms of the derivatives of E.

      [ QUESTION: Do we need a `electrostaticsCutoff' setting to allow for
      separate cutoff values for E_i^pot and E_i^c? If so, does the Model
      define this? Maybe we can just use the "multiple cutoff" support, that is
      planned for the KIM API, instead of support for a special electrostatics
      cutoff? ]

    * If `LongRange' is successfully matched, then the Simulator must provide
      values in the charges array and the Model must use these values only to
      compute its "standard" short-range potential. Thus, the model will
      implement its configuration energy function as:

        E = sum_i [ E_i^pot(r^{ij}, charges[j]) ].

      The other Model output arguments (forces, virial, etc.) are derived as
      expressions in terms of the derivatives of this E function. In order to
      obtain the total configuration energy, the Simulator must perform the
      long-range electrostatics computation and add the appropriate terms to
      each Model output obtained from the KIM Model. Thus, the total
      configuration energy is:

       E_tot = E + E^c(coordinates[i], charges[i]),

      where E is the energy obtained from the Model and E^c is the long-range
      Coulomb energy contributed by the Simulator.

    * If one of the QEq methods is successfully matched, then the Simulator and
      Model must, in addition to listing "charges" in the Model Input section,
      also list "electronegativities" and optionally "electrostaticsHessian"
      arguments in the Model Output section of their ".kim" files. (These are
      the first and second derivatives of the energy with respect to charge.)
      [ QUESTION: what about x^i q^j coupling terms, i.e. position-charge
      coupling: d2E/(dx^i dq^j)? If relevant, how do we implement this data
      structure? Memory management becomes challenging in this case if we want
      to avoid high memory usage data structures. ]

      The Simulator will be responsible for providing initial `charges' values
      and evolving the charges (using the electronegativities and
      electrostaticsHessian values) in a manner consistent with charge
      equilibration requirements.

      - If `QEqShortRange' is successfully matched, then the Model must use the
        charges values to compute the energy using a short-range approximation of
        the Coulomb energy. Thus, the Model will implement the total
        configuration energy function as:

          E = sum_i [ E_i^pot(r^{ij}, charges[i]) + E_i^c(r^{ij}, charges[j]) ],

        where E_i^pot is the (possibly charge-dependent) "standard" short-range
        potential and E_i^c is the short-range approximation to the Coulomb
        energy. The Model will compute, upon request, the first
        (electronegativities) and second (electrostaticsHessian) derivatives
        with respect to the charges of this energy function.
        [ QUESTION: We understand that for computational reasons, when charge
        equilibration is performed a short range approximation to the
        electrostatic energy is used. However, some models may choose to use the
        equilibrated charges to a long-range electrostatic calculation for the
        energy and forces. This means there are two energy functions. One is
        used only for charge equilibration to obtain charges. The second is the
        one used for evolving the atomic positions. Does this happen in
        practice? Should we design the standard to support or this or is having
        a single energy function as we propose above sufficient? Keep in mind
        that adopting the two-energy approach would increase the complexity of
        the KIM API and hence the burden on model developers. ]

    * If "None" is successfully matched, then neither the Simulator nor the Model
      will make use of any `charges', `electronegativities', or
      'electrostaticsHessian' arguments defined in their respective ".kim" files.

We look forward to your input!

Ryan & Ellad

kim-api-electrostatics-and-qeq-v06.txt (7.14 KB)

Dear Ryan,

I have limited experience with simulations containing long range electrostatic interactions, but I think your proposal sounds very reasonable. I only have a single comment

Comments:

1. For ShortRange, I don¹t know of any model that uses different cutoffs
for electrostatics and the non-bonded pairwise interactions. On the other
hand, the shortrange electrostatic cutoff is a sufficiently important
parameter that it would not be inappropriate to make it an explicit API
setting ŒelectrostaticsCutoff¹

2. For LongRange, it is important to keep in mind (or in API) that the
definition of electrostatic energy depends not just on charges[] and
coordinates[], but also on the definition of charge shape. Most models
assume a point charge, but other choices like Gaussian and Slater-type are
sometimes used. An example is the Streitz-Mintmire potential that is not
yet released in LAMMPS (F. H. Streitz and J. W. Mintmire, Phys. Rev. B 50,
11996 (1994)). So, a model might request:

ElectroStaticsMethod1 := LongRange

ElectroStaticsShape1 := Slater

If Gaussian or Slater is specified, then other parameters would be
required to specify the type-dependent parameters of the distribution. The
good news is that these variations can be accounted for entirely in the
real-space (short range electrostatics) calculation, and so the k-space
calculation does not need to know about it.

3. For QEq methods, there are only a handful of models and simulators out
there, so I think it may be premature to try to define a very general API.
You should pick a target Model/Simulator pairing and figure out how to
make it work. On the specific question of storage for arrays like
d2E/(dx^i dq^j), it is important to take advantage of the fact that these
matrices are usually very sparse. I know this is true for ShortRange, not
sure about LongRange. For ShortRange, the sparse matrix storage pattern is
very much analogous to a neighbor list.

Aidan

Ryan,

I’m glad to hear the support for electrostatics is moving forward.

Generally, I think the approach of doing long-ranged electrostatics in the Simulator and short-ranged electrostatics in the Model is a good one. It does raise some complications, though. See my comments below.

  • Simulators/Models must specify what type(s) of electrostatics support they
    provide/require by specifying an ordered list of supported methods. There
    will initially be four methods: None', ShortRange’, LongRange', QEqShortRange’. (In the future additional types, such as
    split charge (SQEq); etc., may be added.)

If I understand the descriptions well, “ShortRange” means that the Model is responsible for electrostatics (which will necessarily be short-ranged) and “LongRange” means that the Simulator is responsible for electrostatics, and the Model is only allowed to use the charges for charge-dependent potential terms that are not Coulomb-like. I think that’s a little too restrictive.

Most (nearly all?) bonded potentials that use electrostatics would not be happy to have the full N^2 Coulomb interaction be the only electrostatic contribution to the energy.

Example 1: many QEq potentials() use some form other than 1/r for Coulomb interactions at short range, say V_ij = q_i q_j J(r_ij), and switch over to q_i q_j r_ij at longer distances The normal method of doing this is to use Ewald or your favorite flavor of long-ranged periodic electrostatics to approximate the full pairwise sum of 1/r interactions, and then include a (J® - 1/r) correction for each short-ranged pair. () Non-QEq potentials can do this too; it’s not inherent in QEq for any reason, just something that they tend to do.)

Example 2: many biomolecular force fields exclude Coulomb interactions between bonded (1,2) and (1,3) and perhaps (1,4) neighbors. This is like the previous example, but with J® = 0 for some pairs. The implementation is often the same: do the full Ewald or whatever, then subtract out the bonded Coulomb contributions that you didn’t really want in the periodic sum.

In each of these examples, the Model needs to do some short-ranged electrostatics, in addition the long-ranged electrostatics interactions being calculated by the Simulator. This would seem to require the use of “LongRange” but violates the restriction that “the Model must use [charge] values only to compute its ‘standard’ short-range potential”. Perhaps I’m overinterpreting what is meant by the “standard” short-ranged potential, but in any case it seems like the Model should not be prevented from doing its own short-ranged electrostatics calculation, as long as it recognizes that the Simulator is also doing the full pairwise 1/r sum.

[ QUESTION: Do we need a `electrostaticsCutoff’ setting to allow for
separate cutoff values for E_i^pot and E_i^c? If so, does the Model
define this? Maybe we can just use the “multiple cutoff” support, that is
planned for the KIM API, instead of support for a special electrostatics
cutoff? ]

Separate cutoffs are most definitely needed. Most potentials use different cutoffs for electrostatics than for LJ and other interactions. So I would say it is most definitely up to the Model to define it. But since support for multiple cutoffs is already in the works (or at least on the drawing board), that is probably sufficient. I don’t see much advantage to labeling one of them as a special “electrostaticsCutoff”. Just let the Model request all the pairlists it wants, and do what it wants with them.

[ QUESTION: what about x^i q^j coupling terms, i.e. position-charge
coupling: d2E/(dx^i dq^j)? If relevant, how do we implement this data
structure? Memory management becomes challenging in this case if we want
to avoid high memory usage data structures. ]

I can’t think of (or even invent) a case where the Simulator needs to know this.

The Simulator will be responsible for providing initial `charges’ values
and evolving the charges (using the electronegativities and
electrostaticsHessian values) in a manner consistent with charge
equilibration requirements.

This sounds like a reasonable division (energy and forces in the model; dynamics in the simulator), and parallels the approach for position degrees of freedom. But there’s an extra complication for QEq potentials: the need to constrain the total charge on either the full system or a subset of the system.

(Without charge constraints, charge flows freely throughout the system, which is often a bad thing. Some models, like COMB, avoid this by tweaking the potential to prevent long-range transfer. Others, like fluc-q and polarizable CHARMM, constrain the total charge on individual molecules or residues.)

This means that either the Simulator performing the dynamics on the charges needs to know about the constraints so that it can impose them, or the Model needs to know about them so that it can compute constraint forces (on the charges) so that the constraints will be obeyed.

The comparable issue for position degrees of freedom is bond constraints, which hasn’t been fully thought out, but the two are related, and should perhaps be considered at the same time.

[ QUESTION: We understand that for computational reasons, when charge
equilibration is performed a short range approximation to the
electrostatic energy is used. However, some models may choose to use the
equilibrated charges to a long-range electrostatic calculation for the
energy and forces. This means there are two energy functions. One is
used only for charge equilibration to obtain charges. The second is the
one used for evolving the atomic positions. Does this happen in
practice? Should we design the standard to support or this or is having
a single energy function as we propose above sufficient? Keep in mind
that adopting the two-energy approach would increase the complexity of
the KIM API and hence the burden on model developers. ]

Yes, this happens in practice. But I don’t prefer to think about it as two different energy functions. Instead, one is merely a simplification used as a computational tool to generate approximate charges. In practice, the charges being used are not perfectly minimized, but suffer some error for this (among other) reasons. This is somewhat comparable to the fact that positions are also never known exactly, because we use finite-timestep integrators to move them around. I suppose you could say that we’re using two different energy functions: one to get the energy, and another (linearized) one that we use to push atoms around with. In neither case does it bother me.

So I don’t think there’s any need for the API to care if either the Model or Simulator is using an approximate method to estimate the charge forces or to solve for the charges.

-Steve Stuart

Ryan,

I’ll point out that Aidan’s point #2 (below) about the “shape” of the charge distribution is equivalent to my point about the short-ranged corrections to the Ewald sum. His point of view is that the Simulator can make these corrections if it knows the form of the distribution, since there are only a few common forms. My point of view was that the Model can make these corrections as long as it’s permitted to.

I suppose I don’t have particularly strong feelings about which side is responsible for them. The Simulator can probably accomplish them efficiently, in parallel, in the real-space sum. But having the Model do it keeps the API simpler and gives the Model the flexibility to use some form other than the versions canonized in the API. If they are negotiated via the API, I think a reasonable list would include delta, Slater, and Gaussian.

On Thursday, October 23, 2014 11:22:36 AM UTC-4, Thompson, Aidan wrote:¹

  1. For LongRange, it is important to keep in mind (or in API) that the
    definition of electrostatic energy depends not just on charges[] and
    coordinates[], but also on the definition of charge shape. Most models
    assume a point charge, but other choices like Gaussian and Slater-type are
    sometimes used. An example is the Streitz-Mintmire potential that is not
    yet released in LAMMPS (F. H. Streitz and J. W. Mintmire, Phys. Rev. B 50,
    11996 (1994)). So, a model might request:

ElectroStaticsMethod1 := LongRange

ElectroStaticsShape1 := Slater

If Gaussian or Slater is specified, then other parameters would be
required to specify the type-dependent parameters of the distribution. The
good news is that these variations can be accounted for entirely in the
real-space (short range electrostatics) calculation, and so the k-space
calculation does not need to know about it.

-Steve Stuart

Hi Steve,

your Example 1 is related to the issue I raised, regarding different choices for the charge shape (Point, Gaussian, Slater). I suggested allowing the Model to request of the Simulator a particular flavor of electrostatics:

ElectroStaticsMethod1 := LongRange
ElectroStaticsShape1 := Slater

I like your solution better: have the Simulator compute the energy for Point charges, and then the Model can compute the short-range (J® - 1/r) correction. The advantage of this is KIM no longer has to worry about defining all the Shape atomtype-dependent Shape parameters in the simulator. This complexity will be owned by the Model. Note, if KIM is keeping track of different energy categories, then this correction should be booked under electrostatic energy, even though it is calculated by the Model.

Example 2 is more problematic. The fake 1/r interaction between a bonded pair can be quite large. If this is added by the Simulator, and subtracted by the Model, it could cause a dramatic loss of floating-point precision in the final result. But this use case is still a couple of bridges away, so maybe not worth worrying about yet.

Aidan

But having the Model do it keeps the API simpler and gives the Model the flexibility to use some form other than the versions canonized in the API.

I agree that this is the most important consideration. In most cases, any performance hit will be in the noise.

Hi Steve,

If I understand the descriptions well, "ShortRange" means that the Model is
responsible for electrostatics (which will necessarily be short-ranged) and
"LongRange" means that the Simulator is responsible for electrostatics, and
the Model is only allowed to use the charges for charge-dependent potential
terms that are not Coulomb-like. I think that's a little too restrictive.

Most (nearly all?) bonded potentials that use electrostatics would not be
happy to have the full N^2 Coulomb interaction be the only electrostatic
contribution to the energy.

Example 1: many QEq potentials(*) use some form other than 1/r for Coulomb
interactions at short range, say V_ij = q_i q_j J(r_ij), and switch over to
q_i q_j r_ij at longer distances The normal method of doing this is to use
Ewald or your favorite flavor of long-ranged periodic electrostatics to
approximate the full pairwise sum of 1/r interactions, and then include a
(J(r) - 1/r) correction for each short-ranged pair. (*) Non-QEq potentials
can do this too; it's not inherent in QEq for any reason, just something
that they tend to do.)

Example 2: many biomolecular force fields exclude Coulomb interactions
between bonded (1,2) and (1,3) and perhaps (1,4) neighbors. This is like the
previous example, but with J(r) = 0 for some pairs. The implementation is
often the same: do the full Ewald or whatever, then subtract out the bonded
Coulomb contributions that you didn't really want in the periodic sum.

In each of these examples, the Model needs to do some short-ranged
electrostatics, in addition the long-ranged electrostatics interactions
being calculated by the Simulator. This would seem to require the use of
"LongRange" but violates the restriction that "the Model must use [charge]
values only to compute its 'standard' short-range potential". Perhaps I'm
overinterpreting what is meant by the "standard" short-ranged potential, but
in any case it seems like the Model should not be prevented from doing its
own short-ranged electrostatics calculation, as long as it recognizes that
the Simulator is also doing the full pairwise 1/r sum.

Our thinking was that the short-range calculation (V_ij = q_i q_j J(r_ij)) would be handled by the simulator as well, since the short and long parts must be compatible. This would prevent the simulator and model having to agree on the details of the short-range approximation. This will significantly reduce the complexity of the API and leave the simulator with maximum flexibility.

We also considered the possibility of having the simulator provide a pointer to a call-back routine for computing J(r) that the model would call in its inner loop. This could lead to improved efficiency since the simulator would not need to repeat the loop over neighbors that the model is already doing, however there would be significant overhead due to repeated function calls. Our feeling was that since the gains in efficiency would probably be minor (if any), it would be best to avoid the added complexity to the API and model developers and leave the calculation to the simulator.

-Ellad

Ellad,

I agree that it’s good to minimize negotiation between Model and Simulator when we can. But we appear to be making different assumptions about other parts of the problem. In particular, I don’t agree that the having the Simulator do both the short- and long-ranged electrostatics eliminates the need for negotiation between the Simulator and Model.

For one thing, the flavor of short-ranged potential is defined by the Model. Any Model that does any short-ranged electrostatics different than point-charge Coulomb has been parameterized specifically for that short-ranged interaction, and would be incorrect if simulated with a different short-ranged interaction style. So if the Simulator performs the short-ranged electrostatics calculation, this requires negotiation.

For another, I don’t necessarily agree that the short- and long-ranged interactions should be compatible. If the Model applies whatever short-ranged interactions it knows are correct, and the Simulator doesn’t need to care what they are. The Model does need to know what interaction was used for the long-ranged interaction (so it can remove it for the short-ranged pairs), but the long-ranged interaction will be 1/r, and does not need to be negotiated.

Just in case that last part was not clear: Typical long-ranged electrostatics methods compute a periodic sum of pointwise 1/r interactions (using some approximation like Ewald or PPPM, etc). It’s usually impossible to exclude individual short-ranged charge-charge interactions from this sum, because it is periodic. So the short-ranged calculation usually includes a [ J® - 1/r ] term to include the short-ranged J® contribution and cancel the long-ranged 1/r contribution that has been included by Ewald. This full [ J® - 1/r ] term could be applied in the Model without negotiation, since the Model already knows the form of J®. It could also be applied in the Simulator, but then the Model would have to tell the Simulator which J® to use.

Note: I don’t have any particularly strong objections to having the Simulator do the short-ranged electrostatics calculation. (In fact, it makes my life easier, as a Model buider.) But if the main rationale is really to simplify the API, I don’t think that’s the way to do it.

Steve,

I also agree with minimizing negotiation. However, there are two important additional things to consider.

  1. The long-range part does require at least 1 parameter. That is the value alpha that appears in the Gaussian exp(-alpha^2 r^2). This defines the charge distribution used in the k-space calculation performed by the Simulator, and must also be accounted for in the real-space calculation, whether that is done by the Simulator or the Model. Fortunately, in most widely used codes (such as LAMMPS), the k-space charge distribution is always Gaussian and alpha is the same for all ion pairs. More good news is that the issue of excluding bonded interactions is handled entirely by the real-space calculation; the k-space calculation does not know about exclusions.

  2. The accuracy and computational cost of the k-space and real-space electrostatic calculations are sensitive to the values of:

a. alpha
b. rcut, the real-space cutoff for electrostatic interactions, same for all i-j pairs
c. periodic cell size
d. resolution of k-space discretization (Ewald k-space vectors, PPPM mesh size, etc)
e. some other stuff

Given all of that, and Steve’s concern about minimizing negotiation, I suggest dividing the work as follows:

  1. Somebody, somewhere defines alpha, rcut, and accuracy target
  2. Simulator performs k-space calculation for 1/r interactions between infinite periodic array of Gaussian-shaped charges, consistent with alpha and accuracy target.
  3. Model performs real-space calculation consistent with alpha, and its own specifications for J® that defines the short-range interactions.

Aidan

Hello, Sorry for my delayed participation in this thread...

So, I think we need some clarifications.

First, let's just consider the energy/force calculations (separate from QEq considerations). Let's see if we can come to a consensus on this aspect of the specification and then we can continue the discussion in regard to QEq...

Let's define E_coulomb(coords[], charges[]) = A * sum_{i,j} (q_i*q_j)/r_ij

This sum over all pairs of particles is part of any electrostatics energy. In addition there can be other contributions to the electrostatic energy (involving J, etc...). The energy can always be written so that the E_columb() term is present (simply by adding and subtracting appropriate contributions, as necessary in order to achieve a term of the form E_coulomb()).

We propose that the KIM API require that the Model is ALWAYS responsible for all contributions to the electrostatic energy except possibly E_coulomb.

Thus, negotiation between the Model and Simulator only involves deciding who will compute and include the E_coulomb contribution. (Clearly, the Model may change the details of its contributions based on which entity is contributing the E_coulomb terms.)

If the Model is to perform the E_coulomb() contribution, then necessairly a truncated summation approximation to E_coulomb must be used by the model.

If the Simulator is to perform the E_coulomb() contribution, it can use any method it likes but it must account for the entire contribution. That is, the E_coulomb term cannot be split into a "short-range" contribution that will be computed by the Model, and a "k-space" contribution that will be computed by the Simulator. (The Simulator can split the E_coulomb term into such parts, but it (the simulator) must compute both parts.)

It seems to me that this represents a clean separation of the computations with minimal negotiation and no parameters must be specified and communicated between Simulator and Model.

So, does that make things more clear? Are there any issues/problems with this that I am missing?

Thanks,

Ryan

Hi Ryan,

This approach is definitely workable, and has the merit of requiring the
least amount of negotiation. I am 99% sure that the Model does not need to
know the shape of the charge distribution used by the Simulator to do the
the charge splitting. The main disadvantages that I see are:

1. Duplicated calculations of short-range electrostatics in Model and
Simulator. Worst case this is a 2x increase in cost, but in practice will
be much less.
2. Loss of precision due to adding and subtracting quite large
contributions. Probably the worst case for this is the Coulomb interaction
between O and H within a H20 molecule.

Aidan

I’ve been reading this thread, but keep in mind that I’m writing as someone who has thought about, but never implemented, versatile and high performance electrostatics interactions (I have implemented ones that aren’t terribly well optimized and ones that were specific in detail). My first thought was that the separation that Ryan suggested is exactly what I was thinking of (leave the expensive and universal part delta-function to the simulator, and let the model handle any differences from that), and Aidan’s most recent post with the disadvantages (speed, accuracy) is a useful complement to that line of thought.

I have a followup question that might be useful to think about, directed at those who do understand implementations in large scale, production codes:

  1. Aidan earlier noted that if the charges are really Gaussians, you can short-cut some of the tricks used (by Ewald) to make the full 1/r problem computationally feasible, and directly do the problem for Gaussians, not delta functions. My question is whether this is specific to Ewald, or whether it applies to other methods as well (PME, FMM, etc)? If it doesn’t apply to other methods, I would be that much more reluctant to consider exposing this, which is taking advantage of what is essentially a simulator implementation detail, to the model-simulator interface.

Noam

Hi Noam,

I think you are referring to my statement that in LAMMPS, the reciprocal space interaction is always calculated for an infinite periodic array of charges, with Gaussian shape exp(-alpha^2 r^2). I think this is generally true for implementations of Ewald, PME, PPPM. However, it is not true for the MSM method in LAMMPS. As a result, MSM requires a real-space interaction different from that required by Ewald and PPPM. FMM is a whole different beast again, and I don’t know much about it. After flip-flopping several times, I now agree with you and Ryan. The API should not attempt to expose this complexity. Let the Simulator handle the full (long and short-range) 1/r interaction for delta-function charges , and as you say, "let the model handle any differences from that.”

Aidan

Dear all,

I have some experience in implementing a reasonably generic Coulomb solver that works with standard chemical potential equilibration techniques and charge self-consistent tight-binding (SCC-TB).

Having the simulator compute the 1/r part and having the model handle the short-ranged corrections, as suggested, seems to be the most flexible approach. Here are a couple of comments from my experience in implementing this:

  • SCC-TB leads to a nonlinear expression for the energy of E(q), and efficient solution usually requires a specific solver. (We use Anderson mixing in the case of SCC-TB.) I just want to point out that one should not expect E(q) to be quadratic in q.

  • We have a callback that returns just the electrostatic potential as a function of charge. To clarify, we have the Coulomb module handle the short ranged part of the potential as well as the 1/r part and the callback returns dE/dq rather than the electrostatic potential at a point in space. (This can be different if charges have shapes. dE/dq can then be used for conjugate gradient or Anderson optimization of the charges.) Having the model handle all the short range part of the interaction is more general then this, but leads to reimplementation of Gaussian/Slater type charge overlap that is used in many cases.

  • There needs to be (MPI) communication within the model to enforce global charge neutrality. Not sure if this communication could be transparently handled through properties of the ghost atoms.

Lars

Ryan,

I think your proposal is completely reasonable, and the clarification is helpful.

In particular: just because an Ewald or other electrostatics method chooses to decompose the electrostatic energy into a real-space and reciprocal-space component does not mean that it is computing anything “short-ranged” in the sense of your initial post. It’s up to the Simulator to implement the 1/r Coulomb interactions any way it sees fit, and it will remain a “long-ranged” interaction no matter how it is implemented. I think that was your original intent all along, but there is definitely confusion on this issue.

The discussion about Gaussians is also a bit of a red herring, in my opinion. The actual charge distribution used by a model may well use Guassians. And the Ewald sum may use Gaussians. But these are not necessarily the same Gaussians. The Gaussian width that optimizes the convergence of the Ewald sum does not necessarily have anything to do with the physical width of an atomic distribution. These widths are usually quite different on a well-tuned code. So I’m personally not worried about doing Gaussians twice.

-Steve

Hi Steve and everyone.

I agree with what you say above and I feel we have pretty much converged on this issue. Next, (in a separate message to come later) I'll update our proposal with clarifications and we'll shift the discussion to charge equilibration.

Thanks!

Ryan

Hi Kaushik,

(I hope you don't mind me reposting this to the list...)

Hi Ryan and Ellad,

I don't know much about the structure of the codes so I can't say much
about this issue. But given the focus on Ewald, I wonder if your plan is
to confine it to periodic geometries?

No, our plan is to support periodic and non-periodic. I think the discussion has "focused" on Ewald simply due to the common/popularity of that type of simulation.

Basically, the Simulator is responsible for the entire 1/r term. For simplicity, we're relegated all of the responsibility for methodology choice to the Simulator (who is, after all, the only "entity" that knows if the global system is periodic, non-periodic, etc....).

Thus, if the computation is a periodic one, then the Simulator could use Ewald (or other periodic method). If it is not periodic, then another method must be used.

Hi,

After heaving read the thread, I'd like to come back to this once more. (Sorry.) I'd really replace this by:

E_coulomb(coords[], charges[]) = A * sum_{i,j} (q_i*q_j)*erf(alpha*r_ij))/r_ij

and have the Simulator compute this in the case of "ElectroStaticsMethodX := LongRange ". It adds just a little extra complexity to the API: the model needs to retrieve this value of alpha from the Simulator to know what short-range corrections to apply. This definition also includes point charges as a limiting case: if something like a point-charge FMM is used by the Simulator, it can still report alpha=inf to the model. That said, I don't think it is that crazy to implemented FMM with Gaussian charge distributions.

The only reason to use this generalized definition is to avoid a catastrophic precision loss, not only in the energy but also in the forces, hessian, ... Therefore, some models should be allowed to insist on a finite value of alpha for the sake of getting a meaningful result.

Best Regards,

Toon

Hi,

I'd like to add some extra comments to the non-QE part of the proposal. (I'll give some QE comments in a separate mail.)

* In the case of a fixed-charge model, I don't see how the Simulator can know the values of the charges. Usually, this information is part of the model, e.g. oxygen has a charge of -0.834 in the TIP3P model. In the case of `ShortRange', the simulator never needs to know that charges are used in the first place. In that sense, there is no need to distinguish between `ShortRange' and `None'. They might as well be merged into one `ShortRangeOrNone' type. In the case of `LongRange', the Simulator needs to retrieve the charges from the Model.

* I wouldn't worry too much about a special electrostaticsCutoff. If multiple cutoffs are planned, that is good enough. It is not such a big waste of time to use one cutoff for all.

Best Regards,

Toon