Download grain boundary dataset

Hello,

I cannot download a dataset for grain boundaries plus all of MP. Is there a way to access it via Python in a workflow, so that the data can be used to train a SevenNet (GNN + SO3) model?

There is an MVRL website that links to MP, but it does not say where the dataset is located.

Thanks for any information on this issue.

Regards,
A

The easiest way is to install our Python client, mp_api. See the documentation for it here. For example, if you want all of the grain boundary data in MP, it’s a simple Python script:

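from mp_api.client import MPRester

# MPRester() reads the API key from the MP_API_KEY environment variable;
# alternatively, pass it explicitly: MPRester("your_api_key")
with MPRester() as mpr:
    grain_bdy = mpr.materials.grain_boundaries.search()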

Also note that for large data downloads, you should cache the results locally (our data schemas are currently defined around JSON).
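As a rough illustration of the local caching suggested above, here is a minimal sketch. The helper name `load_or_fetch`, the gzip-compressed JSON file layout, and the `fetch` callable are assumptions made for this example and are not part of mp_api. It also assumes the fetched records are plain, JSON-serializable dicts.

```python
import gzip
import json
from pathlib import Path
from typing import Callable


def load_or_fetch(cache_path: Path, fetch: Callable[[], list]) -> list:
    """Return cached records if the cache file exists; otherwise fetch and cache them.

    `fetch` is a caller-supplied (hypothetical) function that returns a list of
    JSON-serializable records, e.g. a wrapper around the grain boundary search.
    """
    if cache_path.exists():
        # Cache hit: read the compressed JSON instead of hitting the API again.
        with gzip.open(cache_path, "rt", encoding="utf-8") as fh:
            return json.load(fh)

    # Cache miss: download once, then write the result to disk.
    records = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(cache_path, "wt", encoding="utf-8") as fh:
        json.dump(records, fh)
    return records
```

In a workflow, `fetch` would wrap the `MPRester` call, so the download happens only on the first run and later runs read from the file at `cache_path`.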

Thanks for the information.

So with the new pymatgen, when I run the above code it gives this error: "ModuleNotFoundError: No module named 'pymatgen.analysis.gb'"

Is it compatible with an older version?

Probably not - can you pip install --upgrade pymatgen?

I upgraded to the latest version, but it still says the module is not found. On GitHub I saw that there is no gb folder in src/pymatgen/analysis.
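For anyone hitting the same error, a quick way to confirm what is installed is sketched below. It uses only the standard library; the helper names `module_available` and `installed_version` are made up for this example. It reports the installed pymatgen version and whether the `pymatgen.analysis.gb` module named in the error can be located.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located.

    Note: for a dotted name, find_spec imports the parent packages
    in order to search them, but not the target module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. pymatgen itself) is missing.
        return False


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("pymatgen version:", installed_version("pymatgen"))
print("pymatgen.analysis.gb present:", module_available("pymatgen.analysis.gb"))
```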

Sorry for the slow response. There’s a small problem with our data because of the change in pymatgen packages. You can still access this data by doing this:

with MPRester(monty_decode=False, use_document_model=False) as mpr:
    grain_bdy = mpr.materials.grain_boundaries.search()

In the meantime, I’ll look into fixes for this.

Hey @Asif-Iqbal-Bhatti, the monty issues should be fixed now thanks to @tsmathis! You can run:

with MPRester() as mpr:
    grain_bdy = mpr.materials.grain_boundaries.search()

to download all entries in the grain boundaries database. No client updates needed.

Indeed, thank you for the effort.