Hello,
I am trying to get a list of all the material ids and their structures using
with MPRester(api_key) as mpr:
    docs = mpr.summary.search(fields=["material_id", "structure"])
I end up getting 210579 materials instead of the 154879 listed in the MP database! Can someone explain how I am querying the dataset incorrectly? Thank you!
When running search() without any query parameters, the data is downloaded directly from our AWS Open Data repositories, without the default filtering that would happen automatically through our API. You should also get a warning that the fields parameter is ignored. The response will therefore include deprecated materials and licensed GNoME materials. Please filter your docs by excluding documents where doc.deprecated is True and, if you don't want to accept the GNoME license, documents where builder_meta.batch_id is gnome_r2scan_statics. Also see this issue. HTH
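For anyone landing here later, the filtering described above could look roughly like the sketch below. It only assumes that each returned document exposes deprecated and builder_meta.batch_id as attributes, as the reply suggests. The helper name keep_doc, the accept_gnome_license flag, and the stand-in documents with placeholder ids are made up for illustration and are not part of mp-api.

```python
from types import SimpleNamespace

# Batch id quoted in the reply above for the licensed GNoME materials.
GNOME_BATCH_ID = "gnome_r2scan_statics"


def keep_doc(doc, accept_gnome_license=False):
    """Hypothetical helper: return True if a summary doc should be kept."""
    # Drop deprecated materials.
    if getattr(doc, "deprecated", False):
        return False
    # Drop GNoME materials unless the license has been accepted.
    builder_meta = getattr(doc, "builder_meta", None)
    batch_id = getattr(builder_meta, "batch_id", None)
    if not accept_gnome_license and batch_id == GNOME_BATCH_ID:
        return False
    return True


# Stand-in documents with placeholder ids, used here instead of a real
# API response so the sketch runs without an API key.
docs = [
    SimpleNamespace(material_id="mp-1", deprecated=False,
                    builder_meta=SimpleNamespace(batch_id=None)),
    SimpleNamespace(material_id="mp-2", deprecated=True,
                    builder_meta=SimpleNamespace(batch_id=None)),
    SimpleNamespace(material_id="mp-3", deprecated=False,
                    builder_meta=SimpleNamespace(batch_id=GNOME_BATCH_ID)),
]

filtered = [doc for doc in docs if keep_doc(doc)]
```

With real data, docs would be the list returned by mpr.summary.search(), and the same list comprehension would apply. Passing accept_gnome_license=True keeps the GNoME entries and drops only the deprecated ones.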