The Materials Project is an excellent project, and I have conveniently obtained a lot of useful data from it.
However, I think I have found one small point that could be improved.
The situation is as follows:
I want to get the crystal_system for each mp-id in my DataFrame.
So I used the Python API to get crystal_system like this:
docs = mpr.materials.summary.search(material_ids=df['mp-id'].tolist(), fields='symmetry')
crystal_system = [doc.symmetry.crystal_system for doc in docs]
df['crystal_system'] = crystal_system
But I got an error: the length of crystal_system did not equal the length of the DataFrame df. My guess is that when the API cannot find a crystal_system for an mp-id, it simply skips that id and returns nothing for it.
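One way to check this guess is to compare the ids that were requested with the ids that came back. The following is only a minimal sketch: the helper name compare_ids and the sample ids are made up for illustration, and it assumes the returned ids are available (for example by also requesting the material_id field). It also lists ids that appear more than once in the request, since I have not confirmed whether repeated ids come back once or several times.

```python
from collections import Counter


def compare_ids(requested_ids, returned_ids):
    """Report ids with no returned document and ids requested more than once."""
    requested = [str(i) for i in requested_ids]
    returned = {str(i) for i in returned_ids}
    # dict.fromkeys keeps the first occurrence of each id, in request order
    missing = [i for i in dict.fromkeys(requested) if i not in returned]
    duplicated = [i for i, n in Counter(requested).items() if n > 1]
    return missing, duplicated


# Made-up ids standing in for df['mp-id'] and for the ids of the returned documents
requested = ["mp-149", "mp-13", "mp-0000000", "mp-13"]
returned = ["mp-13", "mp-149"]

missing, duplicated = compare_ids(requested, returned)
print(missing)     # ['mp-0000000']
print(duplicated)  # ['mp-13']
```

If missing is non-empty, those are the ids that would explain the length mismatch.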
So I changed my code to this:
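from mp_api.client import MPRester

# add crystal_system for every mp-id in df
# (df.loc[i, ...] assumes df has the default 0..n-1 index)
with MPRester(api_key) as mpr:
    for i, mpid in enumerate(df['mp-id']):
        try:
            # one request per mp-id; [0] fails when no document comes back
            crystal_system = mpr.materials.summary.search(material_ids=mpid, fields='symmetry')[0].symmetry.crystal_system
            df.loc[i, 'crystal_system'] = crystal_system
        except Exception:
            # nothing was returned (or the request failed) for this mp-id
            df.loc[i, 'crystal_system'] = None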
But this has a problem too: the second approach is much slower. The first approach took only about 30 seconds, but the second took about 13 minutes, which is terrible.
So I suggest that when no corresponding crystal_system is found, the API should still return something (maybe None).
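To make the request concrete, here is a minimal sketch of the output I am hoping for: one value per requested mp-id, in the same order, with None where nothing came back. The helper name align_crystal_systems and the stand-in documents are made up for illustration. I am assuming that each returned document exposes material_id and symmetry.crystal_system, which means material_id would have to be included in fields.

```python
from types import SimpleNamespace


def align_crystal_systems(mp_ids, docs):
    """Return one crystal system per entry of mp_ids, or None if no document matched."""
    lookup = {}
    for doc in docs:
        symmetry = getattr(doc, "symmetry", None)
        # str() so that id objects and plain strings compare equal
        lookup[str(doc.material_id)] = getattr(symmetry, "crystal_system", None)
    # dict.get returns None for ids that never came back
    return [lookup.get(str(mp_id)) for mp_id in mp_ids]


# Stand-in documents so the sketch runs without an API key (placeholder values)
fake_docs = [
    SimpleNamespace(material_id="mp-149", symmetry=SimpleNamespace(crystal_system="Cubic")),
    SimpleNamespace(material_id="mp-13", symmetry=SimpleNamespace(crystal_system="Cubic")),
]

aligned = align_crystal_systems(["mp-13", "mp-0000000", "mp-149"], fake_docs)
print(aligned)  # ['Cubic', None, 'Cubic']
```

With real data, docs would presumably be the list from a single batched search with fields=['material_id', 'symmetry'], and the result could be assigned with df['crystal_system'] = align_crystal_systems(df['mp-id'], docs). I have not timed this, but it needs only one query instead of one per mp-id.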
Or, if you have a better solution for this, please let me know, because I really can't think of a better way.
Thank you very much!!!