MPAqueousCompatibility applies hydrate correction to non-hydrate materials

I have a question of the hydrate correction used in MaterialsProjectAqueousCompatibility. I looked at the source code, and it seems any H and O containing compounds are classified as hydrate, i.e., containing H2O in its structure. Here is the code:
Screen Shot 2023-03-08 at 3.44.22 PM

I realized that it would also treat non-hydrate materials as hydrate, such as LiOH. I suppose the correction will treat LiOH as Li2O + H2O, but I am not sure if it physically makes sense. Why is the hydrate correction set up in this way?

Thank you.

1 Like

Hi @johan28 , you are correct that the correction is supposed to be applied to any composition that has H2O in its structure. However, it only applies when there are whole H2O units in the formula. LiOH would not receive this correction because int(0.5)=0.

It is definitely possible that this correction will be erroneously triggered on some materials that contain H2O units but are not really hydrates. However there is really no way to tell from the composition, we would have to do much more sophisticated (and computationally slower) inspection of the Structure to make a more educated guess about when we have a hydrate vs. not.

When developing MaterialsProjectAqueousCompatibility, we decided it was importnant to keep the correction scheme composition-based (rather than structure-based), and we made a judgment call, based on looking at predicted vs. experimental Pourbaix diagrams, that it was better to err on the side of applying the correction whenever we saw H2O in the formula and risking some non-hydrates getting it than to risk missing applying it to hydrated structures.

I hope that makes sense.

Thank you for the clarification! Just to add one more comment that for LiOH that I was looking at , it has composition Li2H2O2 (mp-23856-GGA), and therefore, the hydrate correction will be added.

Thank you for the clarification! Just to add one more comment that for LiOH that I was looking at , it has composition Li2H2O2 (mp-23856-GGA), and therefore, the hydrate correction will be added.

Ah, I see. You bring up a good point. So it might be more robust to this check against the reduced_composition rather than the composition, e.g.

nH2O = int(min(comp.reduced_composition["H"] / 2.0, comp.reduced_composition["O"]))

That way nH2O would be zero for a structure like LiOH. I’m going to open a PR to change this; thank you for reporting this bug!

Hi, I have additional questions on how this hydrate correction is implemented. It looks like the code does not agree with what the comments is trying to explain (please correct me if I am wrong).
The comments above the piece of code regarding hydrate correction says

# For any compound except water, check to see if it is a hydrate (contains)
# H2O in its structure. If so, adjust the energy to remove MU_H2O ev per
# embedded water molecule.
# in other words, we assume that the DFT energy of such a compound is really
# a superposition of the "real" solid DFT energy (FeO in this case) and the free
# energy of some water molecules
# e.g. that E_FeO.nH2O = E_FeO + n * g_H2O
# so, to get the most accurate Gibbs free energy, we want to replace
# g_FeO.nH2O = E_FeO.nH2O + dE_Fe + (n+1) * dE_O + 2n dE_H
# with
 # g_FeO = E_FeO.nH2O + dE_Fe + dE_O + n g_H2O
# where E is DFT energy, dE is an energy correction, and g is Gibbs free energy
# This means we have to 1) remove energy corrections associated with H and O in water
# and then 2) remove the free energy of the water molecules

The implemented hydrate correction removes “self.h2o_adjustments * 3 + MU_H2O” per number of embedded water, wouldn’t that result in the following free energy?

# Adding hydrate correction, i.e., - n dE_O - 2n dE_H - n MU_H2O
g_FeO.nH2O = E_FeO.nH2O + dE_Fe + (n+1) * dE_O + 2n dE_H - n dE_O - 2n dE_H - n MU_H2O
           = E_FeO.nH2O + dE_Fe + dE_O - n MU_H2O
# Since MU_H2O = g_H2O - 2 g_H - g_O
g_FeO.nH2O = E_FeO.nH2O + dE_Fe + dE_O - n g_H2O + 2n g_H + n g_O
# Since E_FeO.nH2O = E_FeO + n * g_H2O is assumed
g_FeO.nH2O = E_FeO + dE_Fe + dE_O + 2n g_H + n g_O
           = g_FeO + 2n g_H + ng_O

How does it lead to g_FeO = E_FeO.nH2O + dE_Fe + dE_O + n g_H2O ?

The implemented hydrate correction removes “self.h2o_adjustments * 3 + MU_H2O” per number of embedded water, wouldn’t that result in the following free energy?

Hi @johan28 , I think there are two subtle items you’re misunderstanding here. First, self.h2o_adjustments is in eV per atom, so subtracting 3x the adjustments is just a way of getting back the total energy correction per water molecule. Second, MU_H2O and g_H2O are the same thing. Chemical potential and free energy are interchangeable when you have a single molecule.

Does that make more sense?

Hi rkingsbury,

I do think I might be misunderstanding how the correction is implemented, but please let me confirm something.

  1. As you mentioned, “self.h2o_adjustments” is in eV/atom. Therefore, I was interpreting self.h2o_adjustments * 3 as the same as dE_O + 2 dE_H as used in the comment of the code.
  2. According to the citation “Persson, et. al., Phys. Rev. B, 85, 235438 (2012).”, the chemical potential is defined as the formation free energy delta_g, rather than Gibbs free energy, and MU_H2O = -2.46 eV/H2O is the value of former definition (equation 36 in the reference). This definition also agrees with how the H energy is corrected as shown below:
# Free energy of H2 in eV/atom, fitted using Eq. 40 of Persson et al. PRB 2012 85(23)
# https://journals.aps.org/prb/abstract/10.1103/PhysRevB.85.235438
# for this calculation ONLY, we need the (corrected) DFT energy of water
  self.fit_h2_energy = round(
      0.5
      * (
          3 * (self.h2o_energy - self.cpd_entropies["H2O"]) - (self.o2_energy - self.cpd_entropies["O2"]) - MU_H2O
      ),
      6,
  )
  1. If MU_H2O is considered to be the same as g_H2O in the comment, then the physical meaning of the correction is still unclear to me
# Starting from g_FeO.nH2O = E_FeO.nH2O + dE_Fe + (n+1) * dE_O + 2n dE_H
# Assuming MU_H2O = g_H2O, and self.h2o_adjustments * 3 = dE_O + 2 dE_H
# Adding correction, i.e., - dE_O - 2 dE_H - g_H2O per H2O molecule, leads to
g_FeO.nH2O = E_FeO.nH2O + dE_Fe + (n+1) * dE_O + 2n dE_H - n dE_O - 2n dE_H - n g_H2O
           = E_FeO.nH2O + dE_Fe + dE_O - n g_H2O
# Since E_FeO.nH2O = E_FeO + n * g_H2O is assumed
g_FeO.nH2O = E_FeO + n * g_H2O + dE_Fe + dE_O - n g_H2O
           = E_FeO + dE_Fe + dE_O

This means after the correction, hydrated compound (FeO.nH2O) has the same DFT energy as the “real” solid (FeO), which would give a wrong per atom energy after the correction.

  1. As you mentioned, “self.h2o_adjustments” is in eV/atom. Therefore, I was interpreting self.h2o_adjustments * 3 as the same as dE_O + 2 dE_H as used in the comment of the code.

Yes, that’s correct!

According to the citation “Persson, et. al., Phys. Rev. B, 85, 235438 (2012).”, the chemical potential is defined as the formation free energy delta_g, rather than Gibbs free energy,

Formation free energy means Gibbs free energy of formation. There is no other definition of free energy besides Gibbs Free energy, right? Let me know if I’m misreading your statement there.

  1. If MU_H2O is considered to be the same as g_H2O in the comment, then the physical meaning of the correction is still unclear to me

I think of it this way. You can use DFT to compute the (Gibbs) free energy of formation of H2O, from H2O, H2, and O2. If you do that, you will not get the experimental value of 2.46 eV/atom. So the meaning of this correction is to replace the DFT-computed free energy with the experimental one.

We implement this based on the assumption that materials containing a lot of H2O are in fact hydrate phases, in which the H2O exists as discrete water molecules (as opposed to H’s and O’s that are bound to the structure in other ways). So for every one of those internal water molecules, we subtract the DFT formation energy and add back the experimental formation energy.

This means after the correction, hydrated compound (FeO.nH2O) has the same DFT energy as the “real” solid (FeO), which would give a wrong per atom energy after the correction.

That’s exactly right, except I don’t think this is “wrong” per se. This “real” solid energy represents the energy of the FeO in an aqueous medium. If we know that the material contains nH2O number of distinct water molecules and we are constructing our framework such that each H2O has a formation energy MU_H2O, then we need to subtract this from the total hydrate energy in order to get the energy of what’s left.

Does this make sense? In the pull request I’ve opened to fix the bug you identified, I’ve also made a few changes to the comments to help clarify some of this. Other suggestions welcome.

In the pull request I’ve opened to fix the bug you identified, I’ve also made a few changes to the comments to help clarify some of this. Other suggestions welcome.

Thank you so much!

Formation free energy means Gibbs free energy of formation.

Yes, I agree, they are the same, and they are denoted by delta_g in the paper. However, “Gibbs free energy” ,i.e., small g, is defined as g = h- Ts, where h = E_DFT for solid, and s is entropy. Thus, Gibbs free energy of water g_H2O = E_DFT_H2O - T*S_H2O should not be equivalent to the formation Gibbs free energy delta_g_H2O = MU_H2O.

I think one of the confusions comes from the fact that if “g” stands for formation free energy in the comment, it disagrees with the comment saying

g is Gibbs free energy

and the reference paper using ‘delta_g’ for the formation energy. Also, if g_FeO.nH2O is the formation Gibbs free energy, rather than Gibbs free energy, then it means the correction "d"s in g_FeO.nH2O = E_FeO.nH2O + dE_Fe + (n+1) * dE_O + 2n dE_H also include the subtracted energy of reference states of each element. Is that the case?

we subtract the DFT formation energy and add back the experimental formation energy.

The implemented “hydrate_adjustment” removes, rather than adds back, experimental formation energy MU_H2O and any correction originally applied to the water entry provided to the compatibility class. I am confused by when the DFT formation energy of water is subtracted, and when MU_H2O is added back. Is it somehow related to the fact that H2 energy is corrected (MP Aqueous H2 / H2O referencing) beforehand?

That’s exactly right, except I don’t think this is “wrong” per se.

If E_DFT of FeO.nH2O is corrected to be the same as FeO, it means the energy_per_atom of FeO.nH2O will be E_DFT / (2 + 3n), whereas the energy_per_atom of FeO will be E_DFT / 2. This means FeO.nH2O is thought to be more stable than FeO for any n > 0. Also, any hydrated compound will be more stable than the equivalent ‘real’ solid after the correction. But I am not sure why it is physically meaningful. I think the problem here is from the assumption E_FeO.nH2O = E_FeO + n * g_H2O. If E is DFT energy, then it should not be summed with g, which is the formation energy, to get another DFT energy.

It makes more sense if g stands for Gibbs free energy itself as the comment already says. Then, I will get


g_FeO.nH2O = E_FeO + dE_Fe + dE_O + 2n g_H + n g_O

= g_FeO + 2n g_H + ng_O

after the correction as I derived before. In this case, FeO.nH2O and FeO will have the same formation energy, and H2O in FeO.nH2O will have ZERO formation energy. But then, it is not the same as the initial intension to say that the H2O in FeO.nH2O should have the experimental formation energy MU_H2O.

Sorry for a long comment, but we want to make sure that we understand how exactly pymatgen Pourbaix diagram is constructed.

Hi @johan28 , sorry for my slow response and thank you for the deep thought you’ve put into this. In general I agree with you; I think I misunderstood you a few posts back and caused some confusion about formation vs. free energy. After thinking about this for a while, I also think you have found an error in how this correction is implemented.

Let me re-state my understanding of how MaterialsProjectAqueousCompatibility works:

Let E denote DFT (electronic energy)
Let dE denote an energy correction applied to DFT energy
Let g denote free energy (as in chemical potential, not formation free energy)
Let dG note the formation free energy change

First, we add room-temperature entropies to the corrected DFT energies of water and O2, (E_H2O + dE_H2O) and (E_O2+dE_O2) , to get g_H2O and g_O2

Next, we use these free energies to calculate the corrected free energy of H2, such that the formation free energy of water matches the experimental value MU_H2O:

g_H2O - 1/2 g_O2 - 2 g_H2= MU_H2O = dG_H2O

Now for the hydrate adjustment. The purpose of this adjustment is that when you calculate the formation free energy of FeO.nH2O, the H2O part “contributes” the correct amount of free energy, equal to n dG_H2O or n MU_H2O. Another way of thinking about this is that we want to treat the free energy (g_FeO.nH2O) as a combination of g_FeO (hyd) and n g_H2O:

g_FeO.nH2O = g_FeO + n g_H2O

But the solid compound FeO.nH2O may have had other energy corrections applied that are not consistent with this objective. For example, in the MP2020 compatibility scheme there are corrections for hydrides that really aren’t appropriate if the compound is a hydrate. The compound above may have had its energy adjusted by n * hydride correction.

So we need to get rid of any previous energy corrections on the H and O that are associated with water. After that, we need to ADD any energy corrections that were applied to a single water molecule, e.g.

g_FeO.nH2O = g_FeO.nH2O - n dE_O - 2n dE_H + n dE_H2O

Note that if there are no corrections applied to hydrides, oxides, or water, then this energy adjustment is zero. But that won’t be the case when using MP data because H2O will not receive a hydride correction (see our corrections file).

Now that we have adjusted the free energy of the embedded H2O, we can verify that dG_FeO.nH2O = dG_FeO + n dG_H2O (as intended) as follows:

dG_FeO.nH2O = g_FeO.nH2O - g_Fe - (n+1)/2 g_O2 - 2n g_H2

dG_FeO.nH2O = g_FeO.nH2O - g_Fe - 1/2 g_O2 - n ( 1/2 g_O2 + 2 g_H2 )

Note that ( 1/2 g_O2 + 2 g_H2 ) = dG_H2O - g_H2O, from above, so

dG_FeO.nH2O = g_FeO.nH2O - g_Fe - 1/2 g_O2 - n ( dG_H2O - g_H2O )

Now, substitute the definition that we started with: g_FeO.nH2O = g_FeO (hyd) + n g_H2O

dG_FeO.nH2O = g_FeO (hyd) + n g_H2O - g_Fe - 1/2 g_O2 - n ( dG_H2O - g_H2O )

dG_FeO.nH2O = [ g_FeO (hyd) - g_Fe - 1/2 g_O2 ] + n [ g_H2O - n ( dG_H2O - g_H2O ) ]

dG_FeO.nH2O = dG_FeO (hyd) + n dG_H2O

Now this is not what currently happens in pymatgen, and I think that’s an error. Specifically, I think this line

hydrate_adjustment = -1 * (self.h2o_adjustments * 3 + MU_H2O)

Should be something like

hydrate_adjustment = - dE_O - 2 dE_H + self.h2o_adjustments * 3

Where dE_O and dE_H are the corrections applied to oxides or hydrides in the material, and self.h2o_adjustments = dE_H2O, and all values are in eV/atom.

For MP data, dE_H2O = dE_O / 3 (i.e., a water molecule receives one oxide correction), so the line would simplify to

hydrate_adjustment = - 2 ( dE_H )

So basically, the hydrate correction boils down to just removing any hydride energy adjustments from the solid phase.

Does the above make sense to you? If so, would you be able to do some testing to see how correcting this adjustment affects some representative Pourbaix diagrams? Flagging @Joseph_Montoya in case he has any thoughts as well.

If this looks like it makes sense, I can open an issue in pymatgen and we can start working on a fix.

I thought hydride correction will not be added to the original FeO.nH2O entry, since hydride correction is only applied with H as an anion according to the yaml file. But, then - n dE_O - 2n dE_H + n dE_H2O = 0 for default MP setting, correct?

The another way could be that just to assign energy of FeO.nH2O as the summation of FeO energy and nH2O energy.

In the end, the current algorithm does not clearly distinguish hydrate and hydroxide. So, I am not sure how worth it is to correct hydrate energy, while sacrificing the accuracy for the hydroxide energy. Maybe, could it set hydrate correction as an option for MPAqueousCompatibility?