Correcting R2SCAN data to find Ehull with respect to GGA_GGA_U_R2SCAN phase diagram

If I have R2SCAN total energy on a material, how do I apply corrections/adjustments so that I can get the Ehull for this material with respect to a combined GGA/GGA+U/R2SCAN phase diagram? I obtain the phase diagram with the following code:

from mp_api.client import MPRester as mp_MPRester
from emmet.core.thermo import ThermoType

with mp_MPRester(API_KEY) as mp_mpr:
pd_mixed = mp_mpr.materials.thermo.get_phase_diagram_from_chemsys(
chemsys=“F-Fe”, thermo_type=ThermoType.GGA_GGA_U_R2SCAN # mixed (site default)
)

Is there example code for correcting R2SCAN formation energy so it aligns woth GGA/GGA+U?

This discussion was the closest to my question, but it doesn’t address the question of how to corect a fresh DFT R2SCAN data point to check for stability.

Thanks.

Hi @Suresh, just to clarify, we don’t apply corrections to r2SCAN calculations, but we do shift its energies to align with the GGA / GGA+U mixed/corrected hull

There are some caveats while using the full correction scheme:

  • You’ll need to provide a pymatgen ComputedStructureEntry representing your calculation. To ensure it’s seen as r2SCAN, the parameters['run_type'] field should be set to r2SCAN
  • If there’s no GGA/GGA+U structure at this composition in our databse, or in your calculations the fully mixed hull will exclude your r2SCAN calc

mp_api has a get_stability function to get the energies above hull for user submitted ComputedStructureEntry and handles the corrections

Thank you Aaron. Let me try this and get back to you with the results.

@Aaron_Kaplan, I am trying to understand the process step by step. So I thought I would create a fake R2SCAN entry using data from a real R2SCAN entry from MP and try to shift its energy using MaterialsProjectDFTMixingScheme and confirm everything was working correctly. While trying this, the first question I have is:

from mp_api.client import MPRester as mp_MPRester
mp_mpr = mp_MPRester(API_KEY)
mp_mpr.get_entry_by_material_id(material_id=‘mp-1030309’) returns

[mp-1030309-GGA ComputedStructureEntry - Te4 Mo2 W2 S4 (Te2MoWS2)
Energy (Uncorrected) = -81.8747 eV (-6.8229 eV/atom)
Correction = -3.7000 eV (-0.3083 eV/atom)
Energy (Final) = -85.5747 eV (-7.1312 eV/atom)
Energy Adjustments:
MP2020 anion correction (S): -2.0120 eV (-0.1677 eV/atom)
MP2020 anion correction (Te): -1.6880 eV (-0.1407 eV/atom)
Parameters:
potcar_spec = [{‘titel’: ‘PAW_PBE Te 08Apr2002’, ‘hash’: ‘72719856e22fb1d3032df6f96d98a0f2’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE Mo_pv 08Apr2002’, ‘hash’: ‘84e18fd84a98e3d7fa8f055952410df0’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE W_pv 06Sep2000’, ‘hash’: ‘2a33e0d5c700640535f60ac0a12177ab’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE S 17Jan2003’, ‘hash’: ‘d368db6899d8839859bbee4811a42a88’, ‘summary_stats’: None}]
run_type = GGA
is_hubbard = False
hubbards = None
Data:
oxide_type = None
aspherical = True
last_updated = 2024-11-21 22:53:38.549166+00:00
task_id = mp-1333121
material_id = mp-1030309
oxidation_states = {‘Te’: -2.0, ‘Mo’: 6.0, ‘W’: 2.0, ‘S’: -2.0}
license = BY-C
run_type = GGA,
mp-1030309-r2SCAN ComputedStructureEntry - Te4 Mo2 W2 S4 (Te2MoWS2)
Energy (Uncorrected) = -294.2673 eV (-24.5223 eV/atom)
Correction = 0.0000 eV (0.0000 eV/atom)
Energy (Final) = -294.2673 eV (-24.5223 eV/atom)
Energy Adjustments:
None
Parameters:
potcar_spec = [{‘titel’: ‘PAW_PBE Te 08Apr2002’, ‘hash’: ‘11ac22d05d08fb6be6beb8cf78b0127c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE Mo_pv 04Feb2005’, ‘hash’: ‘7e3938837b3a62c07c50f33dda12a02c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE W_sv 04Sep2015’, ‘hash’: ‘50a52a9b0eb3b676ba9e5e5568c8fcb2’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE S 06Sep2000’, ‘hash’: ‘a627a17058051408d471730e77821d4e’, ‘summary_stats’: None}]
run_type = r2SCAN
is_hubbard = False
hubbards = None
Data:
oxide_type = None
aspherical = True
last_updated = 2024-11-21 22:53:38.560046+00:00
task_id = mp-2899017
material_id = mp-1030309
oxidation_states = {‘Te’: -2.0, ‘Mo’: 6.0, ‘W’: 2.0, ‘S’: -2.0}
license = BY-C
run_type = R2SCAN,
mp-1030309-r2SCAN ComputedStructureEntry - Te4 Mo2 W2 S4 (Te2MoWS2)
Energy (Uncorrected) = -294.2673 eV (-24.5223 eV/atom)
Correction = 0.0000 eV (0.0000 eV/atom)
Energy (Final) = -294.2673 eV (-24.5223 eV/atom)
Energy Adjustments:
None
Parameters:
potcar_spec = [{‘titel’: ‘PAW_PBE Te 08Apr2002’, ‘hash’: ‘11ac22d05d08fb6be6beb8cf78b0127c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE Mo_pv 04Feb2005’, ‘hash’: ‘7e3938837b3a62c07c50f33dda12a02c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE W_sv 04Sep2015’, ‘hash’: ‘50a52a9b0eb3b676ba9e5e5568c8fcb2’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE S 06Sep2000’, ‘hash’: ‘a627a17058051408d471730e77821d4e’, ‘summary_stats’: None}]
run_type = r2SCAN
is_hubbard = False
hubbards = None
Data:
oxide_type = None
aspherical = True
last_updated = 2024-11-21 22:53:38.560046+00:00
task_id = mp-2899017
material_id = mp-1030309
oxidation_states = {‘Te’: -2.0, ‘Mo’: 6.0, ‘W’: 2.0, ‘S’: -2.0}
license = BY-C
run_type = R2SCAN]

This is a list with 3 items, items 2 and 3 are identical.
The uncorrected energies for items 1 and 2 are very different. The run_types are also different.

If I do:

pd_mixed = mp_mpr.materials.thermo.get_phase_diagram_from_chemsys(
chemsys=chemsys, thermo_type=ThermoType.GGA_GGA_U_R2SCAN)
entries_1030309 = [e for e in entries if ‘1030309’ in e.entry_id]

I get a single entry:
[mp-1030309-r2SCAN ComputedStructureEntry - Te4 Mo2 W2 S4 (Te2MoWS2)
Energy (Uncorrected) = -294.2673 eV (-24.5223 eV/atom)
Correction = 0.0000 eV (0.0000 eV/atom)
Energy (Final) = -294.2673 eV (-24.5223 eV/atom)
Energy Adjustments:
None
Parameters:
potcar_spec = [{‘titel’: ‘PAW_PBE Te 08Apr2002’, ‘hash’: ‘11ac22d05d08fb6be6beb8cf78b0127c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE Mo_pv 04Feb2005’, ‘hash’: ‘7e3938837b3a62c07c50f33dda12a02c’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE W_sv 04Sep2015’, ‘hash’: ‘50a52a9b0eb3b676ba9e5e5568c8fcb2’, ‘summary_stats’: None}, {‘titel’: ‘PAW_PBE S 06Sep2000’, ‘hash’: ‘a627a17058051408d471730e77821d4e’, ‘summary_stats’: None}]
run_type = r2SCAN
is_hubbard = False
hubbards = None
Data:
oxide_type = None
aspherical = True
last_updated = 2024-11-21 22:53:38.560046+00:00
task_id = mp-2899017
material_id = mp-1030309
oxidation_states = {‘Te’: -2.0, ‘Mo’: 6.0, ‘W’: 2.0, ‘S’: -2.0}
license = BY-C
run_type = R2SCAN]

You can see from the potcar_spec list that the first item is also present in the 3 item list from before. The other items are different.

I am quite confused as to what is going on. Perhaps you could explain or point to an explanation on the web. Thank you.

This is a list with 3 items, items 2 and 3 are identical.

The reason for the duplicates is because of how we build thermodynamic data across three different thermo_type’s: GGA_GGA_U (mixed and corrected PBE GGA and GGA+U hull), R2SCAN (r2SCAN only), and GGA_GGA_U_R2SCAN (the mixed GGA, GGA+U, r2SCAN hull). A change to remove duplicate entries is in the works for the API client, but you can filter these out on your side using set(entries)

The uncorrected energies for items 1 and 2 are very different. The run_types are also different.

The run_type indicates a different functional - absolute total energies in a DFT pseudopotential calculation don’t have meaning on their own, but energy differences like formation energies do. So this is expected

You can see from the potcar_spec list that the first item is also present in the 3 item list from before. The other items are different.

The POTCARs (pseudopotentials) that we use are listed here for GGA/GGA+U and here for r2SCAN calculations

@Aaron_Kaplan, thank you.

Here is what I have tried and the error I got. I hope you will be able to give me additional suggestions.

I used the following code, adapted from mp-api.client.MPRester.get_stability() to apply the mixing correction:
composition = structure.composition
chemsys = composition.chemical_system
sample = ComputedStructureEntry(structure=structure, energy=-100.1, parameters=param, entry_id=‘mine’)
mp_entries = mp_mpr.get_entries_in_chemsys(chemsys, additional_criteria={“thermo_types”: [“GGA_GGA+U”, “R2SCAN”, “GGA_GGA+U_R2SCAN”]}, compatible_only=False)
mp_entries = list(set(mp_entries))
corrector = MaterialsProjectDFTMixingScheme(run_type_2=“r2SCAN”)
corrected_entries = corrector.process_entries([sample] + mp_entries)

I set a breakpoint after line 213 in pymatgen/entries/mixing_scheme.py,
“for entry in entries_type_1 + entries_type_2:” like so:

if entry.entry_id == ‘mine’:
 pdb.set_trace()

I then stepped through the code until line 217, adjustments = self.get_adjustments(entry, mixing_state_data).

When I tried to run this line, I got:
pymatgen.entries.compatibility.CompatibilityError: Discarding r2SCAN entry mine for ##entry.formula## because there is no matching GGA(+U) entry and no r2SCAN ground state at this composition.

If this is a new generated sample, I expect there to be no matching GGA+U or r2SCAN entries at that composition. How then does one compare the r2SCAN calculated Eform for a new composition with respect to a combined GGA_GGA+U_R2SCAN phase diagram?

I assume the benefit of mixing is that you may not have all the stable compositions’ Eforms calculated using R2SCAN; the mixing scheme helps us obtain a complete PD using GGA/GGA+U estimates as well. So, does the mixing scheme adjust all the GGA/GGA+U estimates to R2SCAN and therefore my sample’s R2SCAN estimate may be directly compared against the phase diagram without any adjustments to its energy? In other words, does it accomplish what is shown in Fig 2f.

Thanks again!

@Aaron_Kaplan, the simplest solution for me would be to use only R2SCAN PDs. At this point, are complete PDs based on only R2SCAN available in MP for all materials?

Also, what is the status with Yb? Have energies and PDs been computed with the new, correct pseudopotential - and is it possible to access them in MP? I get errors using MP-API for materials containing Yb and the phase diagram app on the website does not include Yb containing materials either.

Thanks.

If this is a new generated sample, I expect there to be no matching GGA+U or r2SCAN entries at that composition. How then does one compare the r2SCAN calculated Eform for a new composition with respect to a combined GGA_GGA+U_R2SCAN phase diagram?

Right now, this is a limitation of the mixing scheme - if there is no corresponding GGA/+U calculation, then it discards the r2SCAN calculation from the final set of mixed entries

So, does the mixing scheme adjust all the GGA/GGA+U estimates to R2SCAN and therefore my sample’s R2SCAN estimate may be directly compared against the phase diagram without any adjustments to its energy?

The mixing scheme should follow the rules in the paper you linked - if there are stoichiometries which are not computed with r2SCAN in the set of entries, then the r2SCAN data will be shifted to the mixed GGA/+U hull

At this point, are complete PDs based on only R2SCAN available in MP for all materials?

Not yet, we’re recomputing these now

Also, what is the status with Yb? Have energies and PDs been computed with the new, correct pseudopotential - and is it possible to access them in MP? I get errors using MP-API for materials containing Yb and the phase diagram app on the website does not include Yb containing materials either.

These have already been recomputed with r2SCAN and are available on the r2SCAN-only hulls, see this issue. There’s currently no GGA/+U data for Yb and no plans to recalculate it

Thank you @Aaron_Kaplan.

The mixing scheme should follow the rules in the paper you linked - if there are stoichiometries which are not computed with r2SCAN in the set of entries, then the r2SCAN data will be shifted to the mixed GGA/+U hull

To be absolutely clear, it appears that the only way I can find the Ehull of a new composition, whose energy is computed using r2SCAN and which is not already present in the phase diagram, is to ensure that its decomposition products’ Eforms are available in MP and have been calculated using r2SCAN. In other words, I could find the decomposition products using the presumably complete, combined GGA+U/R2SCAN PD, ensure that the pure R2SCAN PD also includes these decomposition products, and then use the R2SCAN PD to estimate the Ehull, since there is no way I can adjust my R2SCAN calculated Eform. Could you please confirm this?

You don’t need to ensure that the r2SCAN decomposition products are in MP, the r2SCAN only phase diagram is sufficient for estimating E_hull for a new composition on which you have r2SCAN data

If you have denser hull information (=more compounds in your phase diagram) then the estimates of E_hull and the decomposition products will improve but not strictly necessary when comparing an r2SCAN calc to the r2SCAN hull

If you want to compare with the GGA/GGA+U hull, then you need to ensure there’s at least one composition computed with PBE GGA or PBE+U (for certain transition metal oxides/fluorides) with the same reduced stoichiometry as your r2SCAN calculation

Thank you, @Aaron_Kaplan. I think I understand the requirements for getting a mixed hull now.

To clarify, I have the r2SCAN Eform for ONLY the new stoichiometry, nothing else. Hence I asked whether its decomposition products’ r2SCAN based Eform data have to be available in MP, since there is no other way I can get the r2SCAN only PD for the system around the vicinity of the new stoichiometery.

The decomposition products you get from the phase diagram are limited to the entries that go into the phase diagram:

from mp_api.client import MPRester

with MPRester() as mpr:
  pd = mpr.materials.thermo.get_phase_diagram_from_chemsys('F-Fe',thermo_type='R2SCAN')
print(pd.entries)
>>>

So if the phase diagram lacks the “true” decomposition products (not the ones estimated from a sparser convex hull), then you will get an incorrect estimate of the decomposition. Including more phase info is generally better for improving the convex hull estimate