Query data from a specific database version

Hi,

I’ve noticed that hull energies can differ depending on whether they are calculated from entries in the (locally saved) v2021.11.10 database or the updated v2022.10.28 database (retrieved through the API), even though the same criteria were used to retrieve the data.

It might be caused by the newly introduced (R2)SCAN results, but manually removing these data didn’t seem to resolve the issue. Is there a method in the new API to query ComputedStructureEntry objects from a specific database version, so that I can ensure data consistency for my ongoing project?

Best,
Jiaxin

Continuing from the post above:

I think I have partially figured out the problem: the inconsistencies in formation energies come from applying the R2SCAN elemental reference energies to entries that are still at the GGA/GGA+U level. For instance:

from monty.serialization import loadfn
from pymatgen.entries.entry_tools import EntrySet
from pymatgen.analysis.phase_diagram import PhaseDiagram
from pymatgen.core.periodic_table import Element

# pre-downloaded entries files using MPRester().get_entries_in_chemsys("Na-Mn-O-N-H")
old = loadfn("data/entries/computed_str_entries_Na_Mn_O_N_H_2021_11_10.json.gz")
new = loadfn("data/entries/computed_str_entries_Na_Mn_O_N_H_2022_10_28.json.gz")

When E_f is calculated using the old entries:

def calculate_formation_energies(entries):
    """Print the formation energy of every NaMnO2 entry and the Mn reference entry."""
    e_set = EntrySet(entries)
    pd = PhaseDiagram(e_set)
    for e in pd.all_entries:
        if e.composition.reduced_formula == "NaMnO2":
            print("=" * 70)
            print(f"id: {e.entry_id} | "
                  f"comp: {e.composition} | "
                  f"e_f: {pd.get_form_energy_per_atom(e)} | ")
    # Elemental reference entry that the phase diagram uses for Mn
    print(pd.el_refs[Element("Mn")])

calculate_formation_energies(old)

which outputs:

======================================================================
id: mp-1016119 | comp: Na4 Mn4 O8 | e_f: -1.9337249892456905 | 
======================================================================
id: mp-21036 | comp: Na2 Mn2 O4 | e_f: -2.01750219987069 | 
======================================================================
id: mp-18957 | comp: Na4 Mn4 O8 | e_f: -2.0304508867456894 | 
======================================================================
id: mp-1016152 | comp: Na4 Mn4 O8 | e_f: -1.94778660237069 | 
======================================================================
id: mp-759533 | comp: Na4 Mn4 O8 | e_f: -1.999933541120689 | 
======================================================================
id: mp-971647 | comp: Na2 Mn2 O4 | e_f: -1.922209088620689 | 
======================================================================
id: mp-1272321 | comp: Na4 Mn4 O8 | e_f: -2.01618848049569 | 
======================================================================
id: mp-755160 | comp: Na2 Mn2 O4 | e_f: -1.9173198961206905 | 
======================================================================
id: mp-755520 | comp: Na2 Mn2 O4 | e_f: -1.9725229811206892 | 
======================================================================
id: mp-578605 | comp: Na1 Mn1 O2 | e_f: -1.933809993620689 | 
======================================================================
id: mp-1283729 | comp: Na4 Mn4 O8 | e_f: -2.0061234854956895 | 
mp-35 ComputedStructureEntry - Mn29         (Mn)
Energy (Uncorrected)     = -265.6984 eV (-9.1620  eV/atom)
Correction               = 0.0000    eV (0.0000   eV/atom)
Energy (Final)           = -265.6984 eV (-9.1620  eV/atom)
Energy Adjustments:
  None
Parameters:
  potcar_spec            = [{'titel': 'PAW_PBE Mn_pv 07Sep2000', 'hash': 'fa52f891f234d49bb4cb5ea96aae8f98'}]
  is_hubbard             = False
  hubbards               = {}
  run_type               = GGA
Data:
  oxide_type             = None
  aspherical             = True
  last_updated           = 2021-02-05 11:07:18.085000
  task_id                = mp-1814312
  oxidation_states       = {}
  run_type               = GGA

However, when using the new entries:

calculate_formation_energies(new)

# Output
======================================================================
id: mp-18957-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.41287982075431096 | 
======================================================================
id: mp-578605-GGA+U | comp: Na1 Mn1 O2 | e_f: 0.5095207138793114 | 
======================================================================
id: mp-18957-R2SCAN | comp: Na1 Mn1 O2 | e_f: -2.1169164336206894 | 
======================================================================
id: mp-755160-GGA+U | comp: Na2 Mn2 O4 | e_f: 0.5260108113793098 | 
======================================================================
id: mp-1283729-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.43720722200431084 | 
======================================================================
id: mp-971647-GGA+U | comp: Na2 Mn2 O4 | e_f: 0.5211216188793113 | 
======================================================================
id: mp-21036-R2SCAN | comp: Na2 Mn2 O4 | e_f: 0.42582850762931024 | 
======================================================================
id: mp-759533-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.44339716637931126 | 
======================================================================
id: mp-1016152-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.49554410512931035 | 
======================================================================
id: mp-1016119-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.5096057182543099 | 
======================================================================
id: mp-1272321-R2SCAN | comp: Na4 Mn4 O8 | e_f: 0.42714222700431037 | 
======================================================================
id: mp-755520-GGA+U | comp: Na2 Mn2 O4 | e_f: 0.4708077263793111 | 
======================================================================
id: mp-21036-GGA+U | comp: Na2 Mn2 O4 | e_f: 0.42582850762931024 | 
======================================================================
id: mp-1272321-GGA+U | comp: Na4 Mn4 O8 | e_f: 0.42714222700431037 | 
mp-35-R2SCAN ComputedStructureEntry - Mn29         (Mn)
Energy (Uncorrected)     = -425.8167 eV (-14.6833 eV/atom)
Correction               = 0.0000    eV (0.0000   eV/atom)
Energy (Final)           = -425.8167 eV (-14.6833 eV/atom)
Energy Adjustments:
  None
Parameters:
  potcar_spec            = [{'titel': 'PAW_PBE Mn_pv 02Aug2007', 'hash': '9fc2da19948217b93232c1ff13eec3a6'}]
  is_hubbard             = False
  hubbards               = {}
  run_type               = R2SCAN
Data:
  oxide_type             = None
  aspherical             = True
  last_updated           = 2021-07-25 12:17:10.051000
  task_id                = mp-2739260
  material_id            = mp-35
  oxidation_states       = {}
  run_type               = R2SCAN

Most of the E_f values turned positive because they were calculated with respect to the new Mn reference entry (mp-35-R2SCAN), which has a much lower energy (-14.6833 eV/atom, versus -9.1620 eV/atom for the GGA entry mp-35 in the old data).
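As a rough check on this, here is a minimal sketch that estimates how much of the shift the Mn reference alone could explain. The two reference energies are copied from the printed entries above; the helper name formation_energy_shift is made up for illustration, and the estimate assumes the total energies of the NaMnO2 entries themselves are unchanged between the two datasets.

```python
# Per-atom elemental reference energies for Mn, copied from the two printed entries.
MN_REF_GGA = -9.1620      # eV/atom, mp-35 (old data, run_type GGA)
MN_REF_R2SCAN = -14.6833  # eV/atom, mp-35-R2SCAN (new data)


def formation_energy_shift(ref_old, ref_new, atom_fraction):
    """Change in formation energy per atom when one elemental reference changes.

    E_f per atom = (E_total - sum_i n_i * mu_i) / N, so replacing mu_old by
    mu_new for one element shifts E_f by (mu_old - mu_new) * n / N.
    """
    return (ref_old - ref_new) * atom_fraction


# NaMnO2: one Mn out of four atoms per formula unit.
shift = formation_energy_shift(MN_REF_GGA, MN_REF_R2SCAN, atom_fraction=1 / 4)
print(f"Shift from the Mn reference alone: {shift:+.4f} eV/atom")
```

This gives roughly +1.38 eV/atom, whereas mp-18957 moved by about 2.44 eV/atom (from -2.03 to +0.41). So the Mn reference would account for only part of the change; the remainder presumably comes from other differences between the two datasets that the output above does not show.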

Since I have already done most of my calculations using the 2021 database, my temporary solution is to remove the R2SCAN entries, along with any entries that carry the GGA(+U)/R2SCAN mixing adjustment:

filtered_new = []
for e in new:
    names = [adj.name for adj in e.energy_adjustments]
    if 'MP GGA(+U)/R2SCAN mixing adjustment' not in names and "R2SCAN" not in e.entry_id:
        filtered_new.append(e)
calculate_formation_energies(filtered_new)

My only concern is that the numbers of entries no longer match:

print(len(old))
print(len(filtered_new))

# Output
504
444

So it would be great if there were a more elegant way of handling different database versions in the new API.
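To see which entries account for the mismatch, one option is to compare the two ID sets after stripping the run-type suffix. The sketch below assumes that old IDs are bare (e.g. mp-18957) and that new IDs may carry a suffix such as -GGA+U or -R2SCAN, as in the outputs above. The helper names base_material_id and missing_ids are made up for illustration.

```python
def base_material_id(entry_id):
    """Strip a run-type suffix such as '-GGA+U' or '-R2SCAN' from an entry ID.

    Assumes IDs look like '<prefix>-<number>[-<run type>]'.
    """
    prefix, number = str(entry_id).split("-")[:2]
    return f"{prefix}-{number}"


def missing_ids(old_ids, new_ids):
    """Return the base IDs present in the old set but absent from the new one."""
    old_base = {base_material_id(i) for i in old_ids}
    new_base = {base_material_id(i) for i in new_ids}
    return sorted(old_base - new_base)


# Small illustration with IDs taken from the outputs above.
example_old = ["mp-18957", "mp-21036", "mp-35"]
example_new = ["mp-18957-GGA+U", "mp-35-R2SCAN"]
print(missing_ids(example_old, example_new))
```

With the real data this would be called as missing_ids([e.entry_id for e in old], [e.entry_id for e in filtered_new]).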

Best,
Jiaxin

Hi @jx-fan,

One thing to try is to query only for thermo data associated with ThermoType.GGA_GGA_U. This should give ComputedStructureEntry objects equivalent to those in the old build, except for any data missing because of new task deprecations. You can do this with the updated MPRester.thermo.search method.
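A minimal sketch of one way to apply this when pulling entries for a whole chemical system. It assumes the installed mp-api version accepts an additional_criteria argument on get_entries_in_chemsys, and that "GGA_GGA+U" is the string value of ThermoType.GGA_GGA_U. The wrapper name fetch_gga_only_entries is made up for illustration.

```python
def fetch_gga_only_entries(mpr, chemsys):
    """Fetch entries for a chemical system, restricted to the GGA/GGA+U thermo type.

    `mpr` is expected to be an open mp_api.client.MPRester instance.
    Assumption: this mp-api version supports `additional_criteria` here.
    """
    return mpr.get_entries_in_chemsys(
        chemsys,
        additional_criteria={"thermo_types": ["GGA_GGA+U"]},
    )


# Intended usage (needs network access and an API key):
#   from mp_api.client import MPRester
#   with MPRester("YOUR_API_KEY") as mpr:
#       entries = fetch_gga_only_entries(mpr, "Na-Mn-O-N-H")
```

The client is passed in as a parameter rather than created inside the function, so the same call can be reused across several chemical systems within one session.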

– Jason

Hi @munrojm,

Thanks for your advice! The thermo_types filter does come in pretty handy for filtering the docs.

Best, Jiaxin