Saving MPRester/MPDataDoc objects

Is there a recommended way of saving MPDataDoc objects returned by MPRester offline?

I tried pickling it, but that raises an error:
PicklingError: Can't pickle <class 'pydantic.main.MPDataDoc'>: attribute lookup MPDataDoc on pydantic.main failed

The pydantic documentation suggests using the model.json() method, but that raises:
TypeError: Object of type 'Structure' is not JSON serializable

Is there a convenience function to store a query result locally as is?
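
For context on the pickling error: the message is consistent with MPDataDoc being a class that is built at runtime rather than defined at module level, so pickle cannot find it again by name. The sketch below reproduces that failure mode with the standard library only. The names make_doc_class and DynamicDoc are hypothetical stand-ins, and this is an illustration of the mechanism, not the actual mp_api code.

```python
import pickle


def make_doc_class(name):
    """Build a class at runtime, roughly as a dynamic model factory might."""
    return type(name, (object,), {"value": 1})


# The class is named "DynamicDoc", but no module-level variable with that
# name exists, so pickle cannot locate the class by attribute lookup.
doc_cls = make_doc_class("DynamicDoc")
doc = doc_cls()

try:
    pickle.dumps(doc)
    pickle_failed = False
except (pickle.PicklingError, AttributeError) as exc:
    pickle_failed = True
    print(f"Pickling failed: {exc}")
```

Pickle stores instances by reference to their class (module plus name), which is why a class that cannot be looked up that way fails to pickle.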

Hi @R_Walser, sorry for the late reply – hope you were able to figure this out. In case this helps anyone else in the future, I prefer to use the saving functionality from the monty package, which we use in pymatgen to serialize/deserialize objects and dump objects to JSON files. For example:

from mp_api.client import MPRester
from monty.serialization import dumpfn, loadfn

with MPRester() as mpr:
    example = mpr.summary.search(material_ids=["mp-49"], fields=["structure", "material_id"])[0]

dumpfn(example, "example_mp_doc.json.gz")

You can then reload this object later by using the corresponding loading function:

example = loadfn("example_mp_doc.json.gz")
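
To illustrate the general idea behind this kind of serialization, here is a minimal standard-library sketch: objects expose an as_dict() method, a custom JSON encoder calls it, and a decoding hook rebuilds the object on load. ToyStructure, AsDictEncoder, and decode_hook are hypothetical names with made-up values. The "@class" key only loosely mirrors the convention monty uses, and the real implementation is more elaborate.

```python
import gzip
import json


class ToyStructure:
    """Hypothetical stand-in for a rich object such as a Structure."""

    def __init__(self, a, species):
        self.a = a
        self.species = species

    def as_dict(self):
        # Tag the dict so the decoder knows which class to rebuild.
        return {"@class": "ToyStructure", "a": self.a, "species": self.species}

    @classmethod
    def from_dict(cls, d):
        return cls(d["a"], d["species"])


class AsDictEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects json cannot serialize natively.
        if hasattr(o, "as_dict"):
            return o.as_dict()
        return super().default(o)


def decode_hook(d):
    # Called for every decoded JSON object, innermost first.
    if d.get("@class") == "ToyStructure":
        return ToyStructure.from_dict(d)
    return d


original = ToyStructure(5.43, ["Si", "Si"])

# Serialize to gzip-compressed JSON bytes (in memory, no file needed).
text = json.dumps({"structure": original}, cls=AsDictEncoder)
payload = gzip.compress(text.encode("utf-8"))

# Decompress and rebuild the object.
restored = json.loads(gzip.decompress(payload).decode("utf-8"), object_hook=decode_hook)
```

After the round trip, restored["structure"] is a ToyStructure instance again rather than a plain dict.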

Hi Matt,

I have successfully executed the following code to retrieve all the XAS docs (and other properties) in the MP database:

from mp_api.client import MPRester
from emmet.core.summary import HasProps

req_fields = [
    "composition", "formula_pretty", "density_atomic", "symmetry", "structure",
    "is_stable", "xas", "band_gap", "efermi", "bandstructure", "dos_energy_up",
]

with MPRester("MP_API_KEY") as mpr:
    docs = mpr.summary.search(has_props=[HasProps.xas], fields=req_fields)

XAS_doc = [doc.xas for doc in docs]
Struct_doc = [doc.structure for doc in docs]
ch_formula = [doc.formula_pretty for doc in docs]

I have tried saving the XAS_doc into a JSON file as follows:

from monty.serialization import dumpfn, loadfn

dumpfn(XAS_doc, "master_query_doc.json.gz")

However, I obtain the following error message:

TypeError: 'str' object does not support item assignment

Would you be able to tell me what’s wrong? Is it a syntax issue?

@rmngunji Not sure what changed in the past few months, but using monty to serialize docs from MPRester is not working at the moment.

For now, try changing your MPRester to include the argument:

MPRester(use_document_model=False)

You might have to change some of your other lines, but this should allow you to dump the file now.
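
As a rough idea of what "changing other lines" could look like: assuming results come back as plain dictionaries when use_document_model=False (an assumption worth checking against your installed mp_api version), attribute access like doc.xas would become key access. The docs list below is a hypothetical stand-in with made-up values, not real query output.

```python
# Stand-in for search results with use_document_model=False.
# Assumption: each result is a plain dict; values here are made up.
docs = [
    {"formula_pretty": "Fe2O3", "xas": [{"edge": "K"}], "structure": None},
    {"formula_pretty": "TiO2", "xas": [{"edge": "K"}], "structure": None},
]

# Attribute access (doc.xas) does not work on dicts, so use keys instead.
XAS_doc = [doc["xas"] for doc in docs]
Struct_doc = [doc["structure"] for doc in docs]
ch_formula = [doc["formula_pretty"] for doc in docs]
```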

This has to do with the types of dictionary keys that MontyEncoder supports. One workaround for now is to run the docs through emmet.core.utils.jsanitize before dumping:

from emmet.core.utils import jsanitize
from monty.serialization import dumpfn

# XAS_doc is the list built from the query results above
sanitized_docs = jsanitize(XAS_doc)
dumpfn(sanitized_docs, "master_query_doc.json.gz")
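
To show the kind of problem that unsupported key types cause, here is a minimal standard-library sketch. JSON object keys must be strings (or a few primitive types), so a dict keyed by tuples fails to encode until the keys are converted. stringify_keys is a hypothetical helper and the data is made up. This is not how jsanitize is implemented, and whether it is exactly what goes wrong for the XAS docs is not confirmed here.

```python
import json


def stringify_keys(obj):
    """Recursively convert dict keys to strings and tuples to lists."""
    if isinstance(obj, dict):
        return {str(k): stringify_keys(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(v) for v in obj]
    return obj


# Made-up data with a tuple key, which json cannot encode directly.
raw = {("Fe", "K"): {"energies": (7112.0, 7113.5)}}

try:
    json.dumps(raw)
    raw_ok = True
except TypeError:
    raw_ok = False

# After sanitizing, every key is a string and encoding succeeds.
clean = stringify_keys(raw)
encoded = json.dumps(clean)
```

Note that stringifying keys is lossy: on reload the key is the string "('Fe', 'K')", not the original tuple.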

A better solution would be for the API document models to override the dict() method so that it applies jsanitize internally. I will look at making that change.

– Jason

Thank you Jason! It works