I found that the pre-computed phase diagram for the Au-Li-Si chemical system with the R2SCAN thermo type is missing. I tried to retrieve it via mpr.materials.thermo.get_phase_diagram_from_chemsys().
This phase diagram can be generated in the web app.
The error information:
smart_open\s3.py", line 442, in _get
raise wrapped_error from error
OSError: unable to access bucket: 'materialsproject-build' key: 'objects/2024-12-18/phase-diagrams/thermo_type=R2SCAN/chemsys=Au-Li-Si.jsonl.gz' version: None error: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
In addition, will the pre-computed phase diagrams be updated with each new database release?
I would also like to mention that the MaterialsProjectDFTMixingScheme() class in pymatgen misses entries whose run_type is ‘r2SCAN’ when applying GGA(+U)/R2SCAN mixing scheme corrections. This affects the construction of mixed GGA(+U)/R2SCAN phase diagrams, and may need to be taken into account when updating the pre-computed phase diagrams. I’ve debugged the issue and opened a PR.
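To check which run_type spellings actually occur in a set of entries before applying the mixing scheme, a small diagnostic along these lines may help. This is only a sketch: count_run_types and find_case_variants are hypothetical helper names, and the demo entries are stand-in objects rather than real API results. It assumes each entry exposes a parameters dict with a "run_type" key, as pymatgen ComputedEntry objects do.

```python
from collections import Counter
from types import SimpleNamespace


def count_run_types(entries):
    """Count the run_type label of each entry, exactly as spelled."""
    return Counter(
        (getattr(entry, "parameters", None) or {}).get("run_type", "unknown")
        for entry in entries
    )


def find_case_variants(run_type_counts):
    """Group labels that differ only by letter case, e.g. 'r2SCAN' vs 'R2SCAN'."""
    groups = {}
    for label in run_type_counts:
        groups.setdefault(label.lower(), set()).add(label)
    return {key: labels for key, labels in groups.items() if len(labels) > 1}


# Stand-in entries for illustration only; real ones would come from the API.
demo_entries = [
    SimpleNamespace(parameters={"run_type": "GGA"}),
    SimpleNamespace(parameters={"run_type": "r2SCAN"}),
    SimpleNamespace(parameters={"run_type": "R2SCAN"}),
]
counts = count_run_types(demo_entries)
print(counts)
print(find_case_variants(counts))
```

If find_case_variants returns a non-empty dict, the entries carry more than one spelling of the same run type, which is the situation where a case-sensitive comparison could silently drop some of them.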
There is not one-to-one coverage for the pre-computed phase diagrams for every chemical system for each thermo type. We generally strive to increase coverage with each data release, but the build process can fail for a variety of reasons. Based on the phase diagram app on the website this system should have been able to generate a pre-computed entry, so I’ll be sure to check this system for the next build.
In the cases where you can’t pull the pre-computed phase diagram for a given thermo type you can try to construct the phase diagram using the available entries for the thermo type you are interested in. Example snippet:
from typing import List
import itertools

from mp_api.client import MPRester
from emmet.core.thermo import ThermoDoc, ThermoType


def get_all_chemsyses(chemsys: str) -> List[str]:
    all_chemsyses = []
    elements = chemsys.split("-")
    for i in range(len(elements)):
        for els in itertools.combinations(elements, i + 1):
            all_chemsyses.append("-".join(sorted(els)))
    return all_chemsyses


chemsys = "Au-Li-Si"
thermo_type = "R2SCAN"

with MPRester("your-api-key") as mpr:
    all_chemsyses = get_all_chemsyses(chemsys)
    entry_list = mpr.materials.thermo.search(
        chemsys=all_chemsyses,
        thermo_types=[thermo_type],
        fields=["entries"],
    )

flat_entries = [
    entry_item
    for e in entry_list
    for thermo_type, entry_item in e.entries.items()
    if entry_item
]

phase_diagram = ThermoDoc.construct_phase_diagram(flat_entries)
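One way to combine the two routes is to try the pre-computed object first and only build locally when it is missing. The sketch below is an assumption-laden illustration, not part of mp_api: phase_diagram_with_fallback is a hypothetical helper, and both callables are passed in so that the demo runs without contacting the API. It catches OSError because that is how the missing key surfaced in the traceback in this thread.

```python
def phase_diagram_with_fallback(fetch_precomputed, build_locally, chemsys, thermo_type):
    """Return (phase_diagram, source), preferring the pre-computed object."""
    try:
        return fetch_precomputed(chemsys, thermo_type), "precomputed"
    except OSError:
        # The missing-key failure reported in this thread surfaced as an OSError.
        return build_locally(chemsys, thermo_type), "local"


# Stand-in callables for illustration; they do not contact the API.
def fake_fetch(chemsys, thermo_type):
    if (chemsys, thermo_type) == ("Au-Li-Si", "R2SCAN"):
        raise OSError("NoSuchKey")
    return f"precomputed:{chemsys}:{thermo_type}"


def fake_build(chemsys, thermo_type):
    return f"local:{chemsys}:{thermo_type}"


result = phase_diagram_with_fallback(fake_fetch, fake_build, "Au-Li-Si", "R2SCAN")
print(result)
```

In real use, fetch_precomputed could wrap mpr.materials.thermo.get_phase_diagram_from_chemsys and build_locally could wrap the entry-based construction. Keeping track of the source makes it easier to compare results from the two routes later.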
As for the mixing scheme issue you have in pymatgen, you can override the run types expected in the entries, e.g., MaterialsProjectDFTMixingScheme(run_type_2="r2SCAN").
I’m not sure if the change you proposed is strictly needed in pymatgen, but we can continue that discussion in the PR.
I want to build phase diagrams for each thermo type in high throughput, and local processing has not been as reliable as the pre-computed phase diagrams in terms of result consistency. I used to think the pre-computed phase diagrams were comprehensive and therefore the best choice, but I found that they have full coverage for the GGA(+U) thermo type and not for the R2SCAN thermo type.
I would expect pre-computed phase diagrams to be accessible for every chemical system that can be retrieved from the three thermo type endpoints. Did the build process fail because of bugs in the pymatgen mixing scheme class? This could be a chance to find bugs.
Oh, I found the reason: certain entries inside the pre-computed phase diagrams, such as mp-3203471, have been deprecated in the new database release. As a result, even their GGA-calculated properties are missing from the endpoints.
This also indicates that the pre-computed phase diagrams are already out of date, since mp-3203471 is found in a pre-computed phase diagram.
So the only way to get up-to-date phase diagrams now is to build them locally.
I would expect at least one pre-computed phase diagram for every chemical system in the database, not one phase diagram per thermo type per chemical system.
I can’t say offhand if this is the case. We build the data releases and pre-computed objects in high throughput, so it is difficult to identify in the moment whether an entry, chemical system, or thermo type should or should not fail to build. I have noted the chemical system you brought up though (Au-Li-Si, R2SCAN), because it may be an indication of an error that we can correct in the next release.
mp-3203471 is not deprecated. This material is part of the GNoME dataset. Since you can view this material via the web interface you must have accepted the terms of use for that dataset. Are you not able to pull that material via the api?
from mp_api.client import MPRester

with MPRester("your_api_key_here") as mpr:
    docs = mpr.materials.summary.search(material_ids=["mp-3203471"])
My wording was unclear. I meant that I expect three pre-computed phase diagrams for every chemical system, one per thermo type, as the web app provides. Since some are still missing, a coverage check could be carried out and the missing ones filled in.
It’s weird: the web page said the material was deprecated at the time I posted. Now the page shows its content normally: https://next-gen.materialsproject.org/materials/mp-3203471
The API also used to return 0 docs (I checked several times over several days), but now it returns 1 doc normally. Something changed.
Anyway, it’s good news. Thanks for the reply. I will go on to test the coverage of the pre-computed phase diagrams, and consider falling back to local processing when I run into missing ones.
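A coverage test like the one mentioned here could be sketched as a loop over chemical systems and thermo types that records every combination whose pre-computed object cannot be fetched. Everything below is illustrative: find_missing_phase_diagrams is a hypothetical helper, the fetch callable is injected so the demo needs no API key, and the thermo type labels other than "R2SCAN" are assumed spellings that should be checked against the ThermoType enum.

```python
from itertools import product


def find_missing_phase_diagrams(fetch, chemsyses, thermo_types):
    """Return the (chemsys, thermo_type) pairs for which fetch raises OSError."""
    missing = []
    for chemsys, thermo_type in product(chemsyses, thermo_types):
        try:
            fetch(chemsys, thermo_type)
        except OSError:
            missing.append((chemsys, thermo_type))
    return missing


# Assumed label spellings; only "R2SCAN" appears in this thread.
THERMO_TYPES = ["GGA_GGA+U", "GGA_GGA+U_R2SCAN", "R2SCAN"]

# Stand-in for the set of objects that exist remotely (illustration only).
available = {
    ("Au-Li-Si", "GGA_GGA+U"),
    ("Au-Li-Si", "GGA_GGA+U_R2SCAN"),
    ("Li-Si", "GGA_GGA+U"),
    ("Li-Si", "GGA_GGA+U_R2SCAN"),
    ("Li-Si", "R2SCAN"),
}


def fake_fetch(chemsys, thermo_type):
    if (chemsys, thermo_type) not in available:
        raise OSError("NoSuchKey")
    return object()


missing = find_missing_phase_diagrams(fake_fetch, ["Au-Li-Si", "Li-Si"], THERMO_TYPES)
print(missing)
```

The resulting list can then be used to decide which systems need a locally built phase diagram, or reported back so the missing objects can be checked in the next build.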
@peikai Since mp-3203471 is part of the GNoME dataset, you have to be logged in to see its detail page. The message hinting at deprecation is misleading, unfortunately. In this case it simply meant that the material is not accessible to an anonymous / un-authenticated user.
Hi @peikai, thanks again for pointing out the missing pre-computed phase diagram for Au-Li-Si (r2SCAN). With today’s build (v2025.02.12) the missing pre-computed entries are now present.
Going forward though, as we add more data using MP’s r2SCAN workflow, the share of chemical systems that have pre-computed phase diagrams for all three thermo types (GGA/GGA+U, GGA/GGA+U/r2SCAN, r2SCAN) will begin to decrease, in favor of an increasing number of systems with only r2SCAN entries.
What is the principle behind this reduction in chemical systems? Does it mean that if a pure r2SCAN phase diagram exists for a particular chemical system, the phase diagrams for the other thermo types (e.g., GGA/GGA+U or GGA/GGA+U/r2SCAN) will be removed?