Build GGA/GGA+U/R2SCAN (Mixed) phase diagrams via new API

Hi @peikai, the phase diagrams shown on the website are constructed on the fly as well, and we are aware of the issue causing some of them to not appear correct. This will be fixed soon. The pre-built phase diagrams available from the API are different and should be correct.

That being said, you are doing everything right to get corrected entries locally. That issue is from a numpy update that happened recently. If you upgrade your pymatgen install, I believe it should work.

– Jason


Yes, the up-to-date version works. Thanks! @munrojm

When I tried to plot a pre-computed phase diagram to compare it with a self-built one, an error was raised about the elemental references in the pre-computed data. I’ve opened a PR with my updates to the PhaseDiagram class to resolve it. However, it would be better to fix this on the cloud-data side as well, once there is an opportunity to update the pre-computed phase diagrams.


Hi, @munrojm

I found some conflicts between the pre-computed phase diagrams and the self-built ones.

Na-Cl-O chemical system

pre-computed phase diagram:

self-built phase diagram:

  1. Generated by entries of GGA/GGA+U/R2SCAN thermo-type:

  2. Generated via local mixing processing:

Webapp phase diagram GGA/GGA+U/R2SCAN (Mixed):


Li-Sb-Se chemical system

pre-computed phase diagram:

self-built phase diagram:

  1. Generated by entries of GGA/GGA+U/R2SCAN thermo-type:

  2. Generated via local mixing processing:

Webapp phase diagram GGA/GGA+U/R2SCAN (Mixed):


Methods:

Pre-computed phase diagrams are constructed from entries retrieved directly from the thermo endpoint.

phase_diagram = mpr.get_phase_diagram_from_chemsys("Li-Sb-Se", thermo_type=ThermoType.GGA_GGA_U_R2SCAN)

Self-built phase diagrams are generated as below:

  1. Generated by entries of GGA/GGA+U/R2SCAN thermo-type

    entryList = mpr.get_entries_in_chemsys("Li-Be-O", additional_criteria={"thermo_types": [ThermoType.GGA_GGA_U_R2SCAN]})
    phase_diagram = PhaseDiagram(entryList)

  2. Generated via local mixing processing

    entryList1 = mpr.get_entries_in_chemsys("Na-Cl-O", additional_criteria={"thermo_types": [ThermoType.R2SCAN]})
    entryList2 = mpr.get_entries_in_chemsys("Na-Cl-O", additional_criteria={"thermo_types": [ThermoType.GGA_GGA_U]})
    entryList = MaterialsProjectDFTMixingScheme().process_entries(entryList1 + entryList2)
    phase_diagram = PhaseDiagram(entryList)


I found that self-built phase diagrams via local mixing processing are the most reliable and consistent with the webapp. The pre-computed phase diagrams and those built without local mixing processing are different.
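One way to make a comparison like this concrete is to diff the sets of stable phases reported by two phase diagrams. The sketch below is a minimal illustration in plain Python: the function name `diff_stable_phases` is hypothetical, and the labels "A", "AB", "AB2" and "B" are placeholders, not real results. With pymatgen, the input lists could presumably be built from `PhaseDiagram.stable_entries`, for example by collecting `entry.composition.reduced_formula` for each stable entry.

```python
def diff_stable_phases(reference, candidate):
    """Compare two collections of stable-phase labels (e.g. reduced formulas).

    Returns the labels found only in one collection and those shared by both.
    """
    reference, candidate = set(reference), set(candidate)
    return {
        "only_in_reference": sorted(reference - candidate),
        "only_in_candidate": sorted(candidate - reference),
        "common": sorted(reference & candidate),
    }


# Placeholder labels for illustration only; not taken from any real system.
precomputed = ["A", "AB", "B"]
self_built = ["A", "AB2", "B"]

for key, labels in diff_stable_phases(precomputed, self_built).items():
    print(key, labels)
```

An empty `only_in_reference` and `only_in_candidate` would indicate that the two diagrams agree on which phases are stable, although their energies could still differ.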

Kai Pei

I submitted another PR to raise an exception when the PhaseDiagram class gets an empty entryList as input, which can occur in some mixing processes.
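Until such a check is available in the installed pymatgen version, a local guard can catch the empty case before the phase diagram is built. This is only a sketch of the idea, not the implementation in the PR; the helper name `require_nonempty` and its error message are hypothetical.

```python
def require_nonempty(entries, context="phase diagram construction"):
    """Return the entries as a list, or raise if there are none.

    Intended to be called on the output of a mixing step before the
    result is passed on to build a phase diagram.
    """
    entries = list(entries)
    if not entries:
        raise ValueError(
            f"No entries available for {context}; "
            "the mixing step may have discarded all of them."
        )
    return entries


# Usage with a stand-in list; real code would pass the processed entries.
checked = require_nonempty(["entry-1", "entry-2"])
print(len(checked))
```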

@peikai, it looks like your locally built phase diagrams are missing entries. The thermo API data for each material contains the canonical data used in the phase diagram for a given thermo_type (mixing scheme). In other words, a material might have a blessed GGA, GGA+U and R2SCAN calculation associated with it, but only one is chosen to be on the mixed phase diagram and included in entries in the thermo data document.

To compute the diagrams as our builders do, you should pull all of the uncorrected GGA, GGA+U, and R2SCAN ComputedStructureEntry data from the materials endpoint for every material within a specific chemical system and its subsystems (you can request the same entries field from mpr.materials.search), decorate it with oxidation state data from mpr.oxidation_states.search (we do that here: emmet/thermo.py at 1a185027d017475e6112164df50428a0b06406c8 in materialsproject/emmet on GitHub), pass everything to the mixing scheme class as you have done, and then construct the phase diagram with the corrected entries.
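The "chemical system and its subsystems" in the first step means every non-empty subset of the elements: for Na-Cl-O, that is the three elements, the three binaries, and the ternary itself. A minimal sketch of enumerating them in plain Python follows. The helper name `chemsys_with_subsystems` is hypothetical, and sorting the elements alphabetically within each string is an assumption about the chemsys convention rather than something stated in this thread.

```python
from itertools import combinations


def chemsys_with_subsystems(chemsys: str) -> list[str]:
    """Return a chemical system together with all of its subsystems.

    Elements in each returned string are sorted alphabetically and
    joined with "-" (assumed convention, see the note above).
    """
    elements = sorted(set(chemsys.split("-")))
    systems = []
    for size in range(1, len(elements) + 1):
        for combo in combinations(elements, size):
            systems.append("-".join(combo))
    return systems


print(chemsys_with_subsystems("Na-Cl-O"))
# ['Cl', 'Na', 'O', 'Cl-Na', 'Cl-O', 'Na-O', 'Cl-Na-O']
```

The resulting list could then be used to query entries for each subsystem before they are merged and passed to the mixing scheme.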

– Jason

Hi @munrojm,

I have shown three routes to build phase diagrams. Route 1 retrieves the pre-computed phase diagram directly. Route 2 retrieves entries with a specific thermo_type and builds the diagram locally. Route 3 retrieves all entries, runs the mixing scheme locally, and then builds the diagram. I think routes 1 and 2 can show wrong phase diagrams, while route 3 is reliable.

What I have done in route 3 is exactly to merge all GGA/GGA_U/R2SCAN entries and then process them with the mixing scheme locally, as in the code I posted above. The oxidation state data should already be contained in the entries, and the graphs look reliable. (They appear under the title 2. Generated via local mixing processing.) Do you think this is the right way to reproduce the phase diagrams? Thanks!

However, if we agree that route 3 is the most reliable and correct one, then the pre-computed phase diagrams (route 1) should be regarded with suspicion. See the conflicts in the Na-Cl-O chemical system phase diagrams (under the titles pre-computed phase diagram and 2. Generated via local mixing processing).

Moreover, as you said, entries with a specific thermo_type (GGA_GGA_U_R2SCAN) have already been processed and screened by the mixing scheme. Hence, it should be possible to use them directly to build the GGA/GGA+U/R2SCAN (mixed) phase diagram (route 2). But that is not the case. The graph (under the title 1. Generated by entries of GGA/GGA+U/R2SCAN thermo-type) still differs from the pre-computed phase diagram (under the title pre-computed phase diagram).

Ah, okay I see what you mean. I didn’t pay close enough attention to the way you were pulling entry data. I am going to take a closer look at this ASAP. Thank you for taking the time to post all of this information.

– Jason

@peikai, there is definitely an issue with the phase diagram part of our build pipeline. I am addressing this now, and the data fix should be in soon.

– Jason

@munrojm, thanks a lot!

A correction:
The webapp phase diagram of the Na-Cl-O chemical system should be the graph below, not the one I attached earlier. My conclusions are not affected: it is still different from the pre-computed phase diagram, and the same as the phase diagram built in route 3.

Kai Pei

The Materials Project papers that discuss the GGA/GGA+U phase diagrams are well known (listed below). Is there any paper discussing the accuracy of the GGA/GGA+U/R2SCAN mixed phase diagrams? Does the mixing operation make them superior to the GGA/GGA+U phase diagrams?

Thanks!

  1. Formation enthalpies by mixing GGA and GGA+U calculations, Phys. Rev. B 84, 045115
  2. Li−Fe−P−O2 Phase Diagram from First Principles Calculations, Chem. Mater. 2008, 20, 5, 1798–1807

@peikai, I have a fix for the data which should be up in the next couple days. Also, here is the publication associated with the GGA/GGA+U/R2SCAN mixing scheme: A flexible and scalable scheme for mixing computed formation energies from different levels of theory | npj Computational Materials

– Jason


@munrojm Thanks for the timely updates!

By the way, I’m wondering about the difference between thermo_id and entry_id in the new API. They look similar, i.e., [MPID]_[thermo_type] and [MPID]-[run_type], respectively. Is thermo_id an alias for entry_id in the new API?

I’m going to set up an index for entries locally. The index should be unique and in one-to-one correspondence with the entries. However, before your fix I found that one entry_id could correspond to multiple entries.

I’m confused about which ID can serve as a unique index for entries:

  1. Does an entry_id always correspond to a unique entry?
  2. Does a thermo_id always correspond to a unique entry? I noticed that thermo_id is unique per ThermoDoc, but I’m not sure whether a ThermoDoc always contains a single entry.

Thanks!
Kai Pei

@peikai no problem! I’ll ping you when the data is live.

This is actually a timely question. I believe we just merged in changes to the __eq__ method for Entry objects in pymatgen in response to a discussion around unique identifiers for them. The change alters equality evaluation to not just use the entry_id (MPID + run type), but also include the correction. entry_id + correction type and amount always identifies a unique entry.

That being said, the ThermoDoc data from the API only contains a single “blessed” entry for a specific material. This corresponds to the one ComputedStructureEntry object chosen by the mixing scheme that is reflected in the thermo_type. In principle, the thermo_type is probably enough to index the entry data, unless you are going to store multiple sets of entries processed by the same mixing scheme class.
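For local indexing, the point about entry_id plus correction can be sketched as a composite dictionary key. The example below uses a stand-in dataclass, `EntryStub`, rather than pymatgen's entry classes, and the IDs and numbers are placeholders. It also keys only on the correction amount (rounded to avoid float noise), which is a simplifying assumption; a fuller key would include the correction type as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryStub:
    """Stand-in for a computed entry; NOT pymatgen's class."""
    entry_id: str      # e.g. "[MPID]-[run_type]"
    correction: float  # total energy correction applied to the entry
    energy: float


def entry_key(entry, ndigits=6):
    """Composite key: entry_id plus the rounded correction amount."""
    return (entry.entry_id, round(entry.correction, ndigits))


def index_entries(entries):
    """Build a dict keyed by entry_key, raising on any duplicate key."""
    index = {}
    for entry in entries:
        key = entry_key(entry)
        if key in index:
            raise ValueError(f"Duplicate entry key: {key}")
        index[key] = entry
    return index


# Placeholder values: same entry_id, different corrections.
uncorrected = EntryStub("mp-0-GGA", 0.0, -1.0)
corrected = EntryStub("mp-0-GGA", -0.5, -1.0)
index = index_entries([uncorrected, corrected])
print(sorted(index))
```

Keying on entry_id alone would make these two stubs collide, which is the situation described earlier in the thread.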

– Jason


@peikai, the data should be updated. Sorry for the late reply. Took longer than expected to make the fix and propagate it through our pipelines.

– Jason


@munrojm,
I’ve checked the phase diagrams above; all of the aforementioned conflicts have been fixed. Thanks a lot.

I’ve also been experimenting with ways of indexing entries, and those updates give me more options.


Hi @munrojm, I tried to get only GGA and GGA_U data using the method discussed above:

entries = mpr.get_entries_in_chemsys(elements, additional_criteria={"thermo_types": [ThermoType.GGA_GGA_U]})

The error is:

name 'ThermoType' is not defined

I haven’t found any information about ThermoType. Should I import it? Thank you so much!

See the code snippet on this page: https://docs.materialsproject.org/methodology/materials-methodology/thermodynamic-stability/phase-diagrams-pds

– Jason

Thank you! This is really helpful!

In my code, the ThermoType class is imported with

from emmet.core.thermo import ThermoType

Thread closed due to inactivity, please open a new thread to address related issues.