Hi @peikai, the phase diagrams shown on the website are constructed on the fly as well, and we are aware of the issue causing some of them to not appear correct. This will be fixed soon. The pre-built phase diagrams available from the API are different and should be correct.
That being said, you are doing everything right to get corrected entries locally. That issue is from a numpy update that happened recently. If you upgrade your pymatgen install, I believe it should work.
Yes, the up-to-date version works. Thanks! @munrojm
When I tried to plot a pre-computed phase diagram to compare it with a self-built one, an error was raised about the elemental references in the pre-computed data. I’ve opened a PR with my changes to the PhaseDiagram class to resolve it. However, it would be better to fix this on the cloud data side as well, once there is an opportunity to update the pre-computed phase diagrams.
I found that self-built phase diagrams via local mixing processing are the most reliable and consistent with the webapp. The pre-computed phase diagrams and those built without local mixing processing are different.
@peikai, it looks like your locally built phase diagrams are missing entries. The thermo API data for each material contains the canonical data used in the phase diagram for a given thermo_type (mixing scheme). In other words, a material might have a blessed GGA, GGA+U and R2SCAN calculation associated with it, but only one is chosen to be on the mixed phase diagram and included in entries in the thermo data document.
To compute the diagrams as our builders do, pull all of the uncorrected GGA, GGA+U, and R2SCAN ComputedStructureEntry data from the materials endpoint for every material within a specific chemical system and its subsystems (you can request the same entries field from mpr.materials.search). Then decorate the entries with oxidation state data from mpr.oxidation_states.search (we do that here: emmet/thermo.py at 1a185027d017475e6112164df50428a0b06406c8 · materialsproject/emmet · GitHub), pass everything to the mixing scheme class as you have done, and construct the phase diagram with the corrected entries.
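A rough sketch of those steps, in case it helps. This is not the builder code itself. `chemsys_with_subsystems` and `build_mixed_phase_diagram` are names made up for illustration, `mpr` is assumed to be an open `MPRester` from mp_api, and the `chemsys`/`fields` arguments, the `average_oxidation_states` field, and the shape of the returned `entries` field (a mapping of run type to entry object) are assumptions that should be checked against your installed versions.

```python
from itertools import combinations


def chemsys_with_subsystems(chemsys: str) -> list[str]:
    """Return a chemical system plus all of its subsystems.

    Elements are sorted alphabetically inside each system string.
    """
    elements = sorted(set(chemsys.split("-")))
    return [
        "-".join(combo)
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


def build_mixed_phase_diagram(mpr, chemsys: str):
    """Hypothetical helper: fetch, decorate, mix, then build the diagram."""
    # Imported lazily so the helper above works without pymatgen installed.
    from pymatgen.analysis.phase_diagram import PhaseDiagram
    from pymatgen.entries.mixing_scheme import MaterialsProjectDFTMixingScheme

    systems = chemsys_with_subsystems(chemsys)

    # 1. Uncorrected entries for every material in the system and subsystems.
    docs = mpr.materials.search(chemsys=systems, fields=["material_id", "entries"])

    # 2. Oxidation state data for the same systems (field name is an assumption).
    oxi_docs = mpr.oxidation_states.search(
        chemsys=systems, fields=["material_id", "average_oxidation_states"]
    )
    oxi_states = {str(d.material_id): d.average_oxidation_states for d in oxi_docs}

    # 3. Collect the entries of all run types and decorate them.
    entries = []
    for doc in docs:
        # Assumption: `entries` maps run type -> ComputedStructureEntry (or None).
        for entry in dict(doc.entries).values():
            if entry is None:
                continue
            entry.data["oxidation_states"] = oxi_states.get(str(doc.material_id), {})
            entries.append(entry)

    # 4. Apply the mixing scheme, then build the diagram from corrected entries.
    corrected = MaterialsProjectDFTMixingScheme().process_entries(entries)
    return PhaseDiagram(corrected)
```

For Na-Cl-O, `chemsys_with_subsystems` returns the three elements, the three binaries, and the ternary, so nothing in a subsystem is left out of the query.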
I have shown three routes to build phase diagrams: 1. retrieve the pre-computed phase diagrams directly; 2. retrieve entries with a specific thermo_type and build the diagram locally; 3. retrieve all entries, apply the mixing scheme locally, and build the diagram. I think routes 1 and 2 can produce wrong phase diagrams, while route 3 is reliable.
What I have done in route 3 is exactly that: merge all GGA/GGA_U/R2SCAN entries, then process them with the mixing scheme locally (see the code below). The oxidation state data should already be contained in the entries, and the graphs look reliable (they are under the title 2. Generated via local mixing processing). Do you think this is the right way to reproduce the phase diagrams? Thanks!
However, if we agree that route 3 is the most reliable and correct one, then the pre-computed phase diagrams (route 1) should be regarded with suspicion. See the conflicts between the Na-Cl-O phase diagrams (under the titles pre-computed phase diagram and 2. Generated via local mixing processing).
Moreover, as you said, entries with a specific thermo_type (GGA_GGA_U_R2SCAN) have already been processed and screened by the mixing scheme. Hence, it should be possible to use them directly to build the mixed GGA/GGA+U/R2SCAN phase diagram (route 2). But that is not the case: the graph (under the title 1. Generated by entries of GGA/GGA+U/R2SCAN thermo-type) still differs from the pre-computed phase diagram (under the title pre-computed phase diagram).
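To make this kind of comparison less dependent on eyeballing plots, the stable phases of two diagrams can be diffed directly. A minimal sketch, assuming both objects expose `stable_entries` whose items have `composition.reduced_formula`, as pymatgen's PhaseDiagram does. The helper names are made up, and the stand-in diagrams use placeholder formulas (A, B, AB, AB2), not results from this thread.

```python
from types import SimpleNamespace


def stable_formulas(phase_diagram) -> set[str]:
    """Reduced formulas of the stable entries of a phase diagram."""
    return {e.composition.reduced_formula for e in phase_diagram.stable_entries}


def compare_stable_phases(pd_a, pd_b) -> dict[str, list[str]]:
    """Report which stable phases appear in only one diagram or in both."""
    a, b = stable_formulas(pd_a), stable_formulas(pd_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "in_both": sorted(a & b),
    }


def _fake_diagram(formulas):
    """Stand-in exposing only the attributes used above (not a real PhaseDiagram)."""
    return SimpleNamespace(
        stable_entries=[
            SimpleNamespace(composition=SimpleNamespace(reduced_formula=f))
            for f in formulas
        ]
    )


# Placeholder data only; swap in the pre-computed and self-built diagrams.
precomputed = _fake_diagram(["A", "B", "AB"])
self_built = _fake_diagram(["A", "B", "AB2"])
diff = compare_stable_phases(precomputed, self_built)
```

This only compares which formulas are stable; two diagrams can agree here and still differ in energies or in which polymorph was chosen.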
Ah, okay I see what you mean. I didn’t pay close enough attention to the way you were pulling entry data. I am going to take a closer look at this ASAP. Thank you for taking the time to post all of this information.
A correction:
The webapp phase diagram of the Na-Cl-O chemical system should be the graph below, not the one I attached. This does not affect what I said: it is still different from the pre-computed phase diagram, and the same as the phase diagram built via route 3.
The Materials Project papers that discuss the GGA/GGA+U phase diagrams are well known. Is there any paper discussing the accuracy of the mixed GGA/GGA+U/R2SCAN phase diagrams? Does the mixing make them superior to the GGA/GGA+U phase diagrams?
Thanks!
Formation enthalpies by mixing GGA and GGA+U calculations, Phys. Rev. B 84, 045115
Li−Fe−P−O2 Phase Diagram from First Principles Calculations, Chem. Mater. 2008, 20, 5, 1798–1807
By the way, I’m wondering about the difference between thermo_id and entry_id in the new API. They look similar, i.e., [MPID]_[thermo_type] and [MPID]-[run_type], respectively. Is thermo_id an alias for entry_id in the new API?
I’m going to index entries locally. The index should be unique and in one-to-one correspondence with the entries. However, I previously found that an entry_id could correspond to multiple entries, before you fixed it.
I’m confused about which ID can serve as a unique index for entries:
Does an entry_id always correspond to a unique entry?
Does a thermo_id always correspond to a unique entry? I noticed that thermo_id is unique for a ThermoDoc, but I’m not sure whether a ThermoDoc always contains a single entry.
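For reference, the two ID formats mentioned above can be taken apart as follows. This is only a sketch: it assumes an MPID never contains an underscore and a run type never contains a hyphen, the function names are made up, and the IDs in the usage lines are invented examples rather than real materials.

```python
def split_thermo_id(thermo_id: str) -> tuple[str, str]:
    """Split '[MPID]_[thermo_type]' at the first underscore."""
    material_id, thermo_type = thermo_id.split("_", 1)
    return material_id, thermo_type


def split_entry_id(entry_id: str) -> tuple[str, str]:
    """Split '[MPID]-[run_type]' at the last hyphen (the MPID contains one too)."""
    material_id, run_type = entry_id.rsplit("-", 1)
    return material_id, run_type


# Invented IDs, used only to show the two formats side by side.
thermo_parts = split_thermo_id("mp-0000_GGA_GGA+U_R2SCAN")
entry_parts = split_entry_id("mp-0000-GGA+U")
```

The thermo_id carries the mixing scheme of the whole document, while the entry_id carries the run type of one calculation, so the two are not interchangeable even when the MPID matches.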
@peikai no problem! I’ll ping you when the data is live.
This is actually a timely question. I believe we just merged in changes to the __eq__ method for Entry objects in pymatgen in response to a discussion around unique identifiers for them. The change alters equality evaluation to not just use the entry_id (MPID + run type), but also include the correction. entry_id + correction type and amount always identifies a unique entry.
That being said, the ThermoDoc data from the API only contains a single “blessed” entry for a specific material. This corresponds to the one ComputedStructureEntry object chosen by the mixing scheme that is reflected in the thermo_type. In principle, the thermo_type is probably enough to index the entry data, unless you are going to store multiple sets of entries processed by the same mixing scheme class.
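If you do end up storing several processed sets locally, one possible key is sketched below. It assumes entries expose `entry_id` and `energy_adjustments` (each with `name` and `value`), as pymatgen's ComputedEntry does. `entry_key` and `index_entries` are hypothetical helpers, and the stand-in entries carry invented IDs and an invented correction.

```python
from types import SimpleNamespace


def entry_key(entry, thermo_type=None):
    """Hashable key: entry_id, each correction's name and value, optional thermo_type."""
    adjustments = tuple(
        sorted((adj.name, round(adj.value, 8)) for adj in entry.energy_adjustments)
    )
    return (entry.entry_id, adjustments, thermo_type)


def index_entries(entries, thermo_type=None):
    """Build a dict keyed by entry_key, failing loudly on any collision."""
    index = {}
    for entry in entries:
        key = entry_key(entry, thermo_type)
        if key in index:
            raise ValueError(f"duplicate key {key!r}")
        index[key] = entry
    return index


# Stand-ins with the same entry_id but different corrections (invented values).
uncorrected = SimpleNamespace(entry_id="mp-0000-GGA", energy_adjustments=[])
corrected = SimpleNamespace(
    entry_id="mp-0000-GGA",
    energy_adjustments=[SimpleNamespace(name="hypothetical correction", value=-0.5)],
)
index = index_entries([uncorrected, corrected], thermo_type="GGA_GGA+U")
```

Raising on a collision rather than overwriting means a non-unique ID shows up immediately instead of silently dropping an entry.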