Inconsistency between Materials Project Database bandgap and pymatgen calculated bandgap

This has to do with a bug we’re aware of, and it should be fixed in a forthcoming data release.

Some tasks (a task is a single DFT calculation) are labelled with the wrong task type, which is the field that tells you what kind of calculation was performed. In this case, a line-mode scan of the band structure is mislabeled, so the corresponding band structure doesn’t get pulled into the summary document.

You can see all the tasks that build a materials document like this:

from mp_api.client import MPRester

with MPRester("your_api_key") as mpr:  # placeholder: use your own API key
    mat_doc = mpr.materials.search(material_ids=["mp-753512"])[0]
    active_ids = [t for t in mat_doc.task_ids if t not in mat_doc.deprecated_tasks]
    tasks = mpr.materials.tasks.search(task_ids=active_ids)
    for task in tasks:
        print(task.task_id, task.task_type, task.output.bandgap)

This should print something like the following (the order doesn’t matter):

mp-809180 NSCF Uniform 0.5586000000000002
mp-1785566 Static 0.6380999999999997
mp-1333251 Static 0.0
mp-801798 Static 0.7089999999999996
mp-765868 Structure Optimization 0.7275999999999998
mp-1685581 NSCF Uniform 0.0

Clearly the first task, mp-809180, corresponds to the band structure you obtained with get_bandstructure_by_material_id, and checking it by hand confirms that it is a line-mode band structure calculation. However, it is mislabeled as uniform (“NSCF Uniform”), so it does not correctly populate the band_gap field in the summary doc.
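
If it helps to see the mismatch at a glance, here is a minimal sketch that groups the task-level band gaps by their task type label. It runs offline on the values copied from the listing above rather than querying the API, and the names TASKS and gaps_by_task_type are hypothetical helpers made up for illustration, not part of mp-api or pymatgen.

```python
from collections import defaultdict

# (task_id, task_type, band gap in eV), copied by hand from the printed listing.
TASKS = [
    ("mp-809180", "NSCF Uniform", 0.5586000000000002),
    ("mp-1785566", "Static", 0.6380999999999997),
    ("mp-1333251", "Static", 0.0),
    ("mp-801798", "Static", 0.7089999999999996),
    ("mp-765868", "Structure Optimization", 0.7275999999999998),
    ("mp-1685581", "NSCF Uniform", 0.0),
]


def gaps_by_task_type(tasks):
    """Group (task_id, rounded gap) pairs under their task type label."""
    grouped = defaultdict(list)
    for task_id, task_type, gap in tasks:
        grouped[task_type].append((task_id, round(gap, 4)))
    return dict(grouped)


if __name__ == "__main__":
    for task_type, entries in gaps_by_task_type(TASKS).items():
        print(task_type, entries)
```

Note that nothing here carries an "NSCF Line" label, which is consistent with the line-mode calculation having been filed under "NSCF Uniform". The label alone cannot tell you which task is mislabeled; that still requires inspecting the task's k-points by hand.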
