I’m trying to calculate the electron and hole effective masses of several thousand materials from the MP database using sumo, which requires the band structure of each material. I start with a MongoDB-style query to find materials that have band structures and meet my other criteria, but putting “bandstructure” in the “properties” argument to MPRester.query() returns None, probably because “bandstructure” isn’t a supported property. For a small number of materials I could simply call get_bandstructure_by_material_id() for each one, but I don’t want to hit the API thousands of times every time I run this query.
Is there a good way to do this, or is it beyond the current capabilities of the API?
from monty.serialization import dumpfn
from pymatgen.ext.matproj import MPRester

mpr = MPRester("your_MPkey")
for id_ in range(1, 1000):
    mp_id = "mp-" + str(id_)
    # One API call per material; this is the slow step.
    bs = mpr.get_bandstructure_by_material_id(material_id=mp_id)
    has_bs = bs is not None
    if has_bs:
        # Do something with bs, e.g., save it to a JSON file.
        dumpfn(bs, mp_id + "_bandstructure.json")
That method works well for a small number of materials, but each call to get_bandstructure_by_material_id takes about 5 seconds for me, so it isn’t practical at large scale. I think the large number of API calls would also degrade the performance of the MP database for other users.
Does anyone know a solution for this? Alternatively, I could call get_bandstructure_by_material_id for each material, and let the script run overnight or something, but I don’t want to strain the Materials Project server.
Unfortunately, there is no good solution for this. The MP API, originally developed in 2012, was never designed for bulk downloads of this kind. We are developing a new API that will be able to handle distributing all band structures to a user.
For now, @Steven_Hartman’s solution is the best option, provided you rate-limit yourself. We will try to build a Google Drive package of band structures for everyone, but there are numerous backend restructuring tasks that are higher priority as our use and load continue to increase. The new API might be done before this.
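For anyone wondering what "rate-limit yourself" could look like in practice, here is a minimal sketch, not an official recipe. It pauses between calls and caches each result on disk, so rerunning the script never asks the API twice for the same material. The names download_all, fetch, out_dir, and min_interval are hypothetical, and the 1-second default pause is an arbitrary assumption, not an MP recommendation.

```python
import json
import time
from pathlib import Path


def download_all(material_ids, fetch, out_dir, min_interval=1.0, sleep=time.sleep):
    """Fetch each material id at most once, pausing between API calls.

    fetch is a hypothetical user-supplied callable: it takes a material id
    and returns a JSON-serializable object, or None when the material has
    no band structure. Returns the number of calls actually made.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    n_calls = 0
    for mp_id in material_ids:
        target = out_dir / f"{mp_id}.json"
        if target.exists():
            continue  # cached by an earlier run, so no API call is needed
        if n_calls > 0:
            sleep(min_interval)  # self-imposed pause between consecutive calls
        data = fetch(mp_id)
        n_calls += 1
        # A None result is stored as JSON null, so materials without a
        # band structure are not queried again on the next run.
        target.write_text(json.dumps(data))
    return n_calls
```

With pymatgen, fetch could be a small wrapper that calls mpr.get_bandstructure_by_material_id(mp_id) and returns bs.as_dict() when the result is not None. That assumes the dict serializes cleanly with the standard json module; if it doesn't, monty's dumpfn may be the safer way to write the file.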