How to get icsd_ids via mp-api

I am trying to use mp-api and MPR.query(formula="Fe2O3", fields=[…]) to query Fe2O3-related information from the Materials Project. How can I retrieve the icsd_ids and the corresponding BibTeX citations? They are displayed in the web app as follows, but I didn’t find a relevant keyword to put in "fields". Thanks!

Hi, I think the screenshot that you have shared is from the legacy site (legacy.materialsproject.org), while the mp-api is primarily designed to interact with the new Materials Project site (next-gen.materialsproject.org). Most of the information will likely be the same, but the new site is constantly being updated while the legacy site is now frozen.

Link to documentation for the API: Materials Project API - Swagger UI

For using the mp-api:

If you know the Materials Project ID (MP ID) of the Fe2O3 material of interest (for example, mp-19770), you can query the provenance doc directly. The two key fields are “database_IDs”, which holds the ICSD IDs, and “references”, which holds the BibTeX citations.

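# Import the client (newer mp-api releases use: from mp_api.client import MPRester)
from mp_api import MPRester

# API_KEY is your Materials Project API key, as a string
with MPRester(API_KEY) as mpr:
    doc = mpr.provenance.get_data_by_id("mp-19770", fields=["database_IDs", "references"])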

# To get the ICSD ids
doc.database_IDs

# To get the BibTeX citations
doc.references

If you don’t know the MP ID of the Fe2O3 material of interest, you can also use the mp-api to get all the MP IDs associated with Fe2O3, as shown below:

from mp_api import MPRester

with MPRester(API_KEY) as mpr:

    # Query the summary doc to get the MP IDs of all Fe2O3 materials in the MP
    # You could filter by additional field arguments to reduce the number of entries returned
    # Fe2O3_docs is a list of SummaryDoc objects
    Fe2O3_docs = mpr.summary.search(formula="Fe2O3", fields = ["material_id"])

# Create a list of the ids
mp_ids = [i.material_id for i in Fe2O3_docs]

# To then get the icsd_ids and references, we would loop over the list of MP IDs, and get the provenance doc for each material
# provenance_docs will be a list of ProvenanceDoc objects.
with MPRester(API_KEY) as mpr:
    provenance_docs = [mpr.provenance.get_data_by_id(mp_id, fields=["material_id", "database_IDs", "references"]) for mp_id in mp_ids]

# We could then look at one example from our query and print the MP ID, database IDs (if any) and references
# Let's print out the first results from provenance_docs

print(provenance_docs[0].material_id)
print(provenance_docs[0].database_IDs)
print(provenance_docs[0].references)
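
To collect only the ICSD IDs for every material at once, the list of provenance docs can be flattened into a dictionary. The following is a minimal sketch, not part of mp-api: icsd_ids_by_material is a hypothetical helper, and it assumes database_IDs behaves like a mapping with an "icsd" key (string or enum member) holding a list of IDs, which may differ between mp-api versions. SimpleNamespace objects with made-up values stand in for real ProvenanceDoc objects so that the snippet runs offline.

```python
from types import SimpleNamespace


def icsd_ids_by_material(provenance_docs):
    """Map each material ID to its list of ICSD IDs (empty list if none).

    Hypothetical helper; assumes each doc has `material_id` and `database_IDs`.
    """
    result = {}
    for doc in provenance_docs:
        # database_IDs may be None or empty, e.g. for purely computational structures
        db_ids = doc.database_IDs or {}
        icsd = []
        for key, ids in db_ids.items():
            # Keys may be plain strings or enum members; compare on the string value
            name = getattr(key, "value", key)
            if str(name).lower() == "icsd":
                icsd.extend(ids)
        result[str(doc.material_id)] = icsd
    return result


# Stand-in documents with made-up values, for illustration only
example_docs = [
    SimpleNamespace(material_id="mp-0001", database_IDs={"icsd": ["icsd-1", "icsd-2"]}),
    SimpleNamespace(material_id="mp-0002", database_IDs=None),
]
print(icsd_ids_by_material(example_docs))
```

With real data, you would pass provenance_docs from the query above instead of example_docs.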

Hope my responses have helped!


Thank you very much! It is very clear!


Thanks @AntObi, great post!


Hi @munrojm ,

I would like to know if it is possible to return icsd_ids when requesting the ComputedEntry in the new api. In the legacy api, it works if I simply run:

entries = mpr.get_entries_in_chemsys(["Mn","O"], property_data=["icsd_ids"])
for entry in entries:
    print(entry.data)

which returns:

{'oxide_type': 'None', 'icsd_ids': [163245]}
{'oxide_type': 'None', 'icsd_ids': [41775, 44762, 163411, 642937, 56134, 163412, 163413, 163414, 655052, 642934]}
{'oxide_type': 'None', 'icsd_ids': [655106, 44761, 642935, 246889, 174037, 44932, 618255, 426954, 42774, 56133, 164349, 246890, 642933, 42743, 642938, 43058]}
{'oxide_type': 'None', 'icsd_ids': [5392, 5393, 5248]}
...

However, the same method in the new API has no input parameter similar to property_data:

    def get_entries_in_chemsys(
        self,
        elements: Union[str, List[str]],
        use_gibbs: Optional[int] = None,
    ):

So based on @AntObi 's answer, I have tried to specify it in fields:

docs = mpr.thermo.search(chemsys="Mn-O", fields=["entries", "database_IDs"])

But no ID values are returned in either docs[0].entries['GGA+U'].data or docs[0].database_IDs.

Does that mean I always have to make an additional call to mpr.provenance.get_data_by_id to get the ICSD IDs, and then attach them back to ComputedEntry.data, if I want to keep the same data structure?

Hi @jx-fan, unfortunately the only place to get ICSD IDs with the new API is the provenance endpoint. So currently, yes: either do that, or pre-process any set of entries you like by pulling all the provenance data with the search method, filtering it for specific MPIDs, and then pulling the relevant entries and attaching the data. This process will be easier once the ability to search provenance data with a list of MPIDs is released. I am hoping to have that available in the next week or so.
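
The "attach the data" step could look roughly like the sketch below. It is only an illustration under stated assumptions: attach_icsd_ids and the material_id_of argument are hypothetical names, not part of mp-api or pymatgen, and the default assumes each entry's entry_id is its MP ID, which may not hold for entries returned by the new API. The lookup dictionary is assumed to have been built beforehand from provenance data; SimpleNamespace objects with made-up values stand in for ComputedEntry objects so the snippet runs offline.

```python
from types import SimpleNamespace


def attach_icsd_ids(entries, icsd_lookup, material_id_of=lambda e: e.entry_id):
    """Write an 'icsd_ids' list into each entry's data dict, in place.

    entries        : objects with a mutable `data` dict (e.g. ComputedEntry)
    icsd_lookup    : dict mapping MP ID (str) -> list of ICSD IDs
    material_id_of : hypothetical hook returning the MP ID for an entry
    """
    for entry in entries:
        mp_id = str(material_id_of(entry))
        # Materials without provenance data get an empty list
        entry.data["icsd_ids"] = list(icsd_lookup.get(mp_id, []))
    return entries


# Made-up lookup and stand-in entries, for illustration only
icsd_lookup = {"mp-0001": [111, 222]}
entries = [
    SimpleNamespace(entry_id="mp-0001", data={"oxide_type": "None"}),
    SimpleNamespace(entry_id="mp-0002", data={}),
]

attach_icsd_ids(entries, icsd_lookup)
for entry in entries:
    print(entry.data)
```

If the MP ID is stored somewhere else on the entry, pass a different material_id_of function rather than changing the helper.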

– Jason

Thanks Jason for your quick response! Really looking forward to the new feature. Cheers, Jiaxin