List of mandatory attributes such as 'elements' and 'lattice_vectors'

Hi everybody
First of all, a huge thank you to all the developers contributing to this project!

I’ve read the paper and now I’m trying to get along with pymatgen’s OptimadeRester. So, I was wondering:

Is there an official list with the mandatory attributes for all optimade providers?

I ask because it seems that MP has the ‘species’ attribute while OQMD doesn’t, and this breaks the get_snls_with_filter method. I’ll work around this, but I was curious about it.

Kind regards,
Elton.

Dear Elton,

You can find the full description of Optimade here: https://github.com/Materials-Consortia/OPTIMADE/blob/aaafe44842cdf1bf53c0f542111ec10f085c9799/optimade.rst

The species field is listed as SHOULD be present, meaning that it is highly recommended but not mandatory. Personally, I do think that OQMD should have implemented it, though.

I noticed that they do list the “species” field at “/info/structures/”, although in the description they say they have not implemented it. I can imagine that this makes the OptimadeRester think that the field is present and thus causes the error.
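
For anyone who wants to check this themselves, here is a minimal Python sketch of how one might see which properties a provider advertises. It assumes the /info/structures response follows the layout in the specification (a "data" object containing a "properties" mapping). The example_info dictionary is a made-up, trimmed response, not real OQMD output, and listed_properties is a hypothetical helper name.

```python
def listed_properties(info_response):
    """Return the property names advertised in an /info/structures response.

    Assumes the parsed JSON has the shape {"data": {"properties": {...}}}.
    """
    return set(info_response.get("data", {}).get("properties", {}))


# Hypothetical, trimmed-down response for illustration only.
example_info = {
    "data": {
        "properties": {
            "elements": {"description": "Names of the elements", "type": "list"},
            "species": {"description": "Not implemented", "type": "list"},
        }
    }
}

# A property can be advertised even if entries never return a value for it.
print("species" in listed_properties(example_info))
```

Note that a property being advertised here says nothing about whether entries actually return a value for it, so a client probably should not rely on this listing alone.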


I see, JPBergsma.

From reading the paper, I understood that the only things varying across different databases’ APIs were the properties that begin with _ (such as ‘_aflow_crystal_system_orig’, for example). But comparing the “info/structures” endpoints of MP, OQMD, and Aflow showed me that there is still freedom due to this SHOULD vs MUST situation.

Thank you for your reply and the resource! It helped me a lot!

Just to add, you can track how closely databases are following the specification at https://www.optimade.org/providers-dashboard/ (with validation performed on every child database listed by a provider). Hopefully implementations will continue to improve over time (I will be bringing up some issues like the one you raised with specific databases at our OPTIMADE meeting this week).

The SHOULD/MUST distinction was a necessary evil to ensure compatibility with as many use cases as possible (e.g. experimental databases that do not have coordinates for all sites), but I hope that client code will also improve to handle/infer missing data more cleanly as we go forward.
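
As a rough illustration of what more tolerant client code could look like, here is a small Python sketch that separates entries that have the fields needed to build a structure from those that do not, instead of failing on the first incomplete entry. The field names are standard OPTIMADE structure properties, but treating exactly these four as required is my assumption, and missing_fields and split_usable are hypothetical helpers, not part of pymatgen or any OPTIMADE library.

```python
# Assumption: these four properties are what a client needs to build a structure.
REQUIRED_FOR_STRUCTURE = (
    "species",
    "species_at_sites",
    "lattice_vectors",
    "cartesian_site_positions",
)


def missing_fields(entry, required=REQUIRED_FOR_STRUCTURE):
    """List the required attributes that are absent or null in one entry."""
    attributes = entry.get("attributes") or {}
    return [name for name in required if attributes.get(name) is None]


def split_usable(entries):
    """Split entries into usable ones and (id, missing fields) pairs to skip."""
    usable, skipped = [], []
    for entry in entries:
        missing = missing_fields(entry)
        if missing:
            skipped.append((entry.get("id"), missing))
        else:
            usable.append(entry)
    return usable, skipped


# Made-up entries for illustration only.
example_entries = [
    {
        "id": "complete",
        "attributes": {
            "species": [{"name": "Si", "chemical_symbols": ["Si"], "concentration": [1.0]}],
            "species_at_sites": ["Si"],
            "lattice_vectors": [[5.4, 0, 0], [0, 5.4, 0], [0, 0, 5.4]],
            "cartesian_site_positions": [[0, 0, 0]],
        },
    },
    {
        "id": "no-species",
        "attributes": {
            "species": None,
            "species_at_sites": ["Si"],
            "lattice_vectors": [[5.4, 0, 0], [0, 5.4, 0], [0, 0, 5.4]],
            "cartesian_site_positions": [[0, 0, 0]],
        },
    },
]

usable, skipped = split_usable(example_entries)
print(len(usable), skipped)
```

Reporting which fields were missing, rather than silently dropping entries, makes it easier to raise the gap with the provider afterwards.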


Curious, have you been able to work around this?

Sorry for the late response! I’m not sure if OQMD has updated their OPTIMADE API with the ‘species’ field, but I’ve found it easier to just download the OQMD dump file and query their database for the data I needed.

They (well, I!) added a species response to their API (Fixes for OPTIMADE structures and info by ml-evs, Pull Request #129, wolverton-research-group/qmpy), but I do not think it is returned by default (i.e., it needs to be explicitly listed in response_fields). The pymatgen OptimadeRester now hard-codes the list of response fields required to make SNLs (since Add ability to request additional OPTIMADE fields by ml-evs, Pull Request #2315, materialsproject/pymatgen), so it should also work there now.
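
If you want to request the fields by hand rather than through pymatgen, a query along these lines should work. This is only a sketch using the Python standard library: the base URL is a placeholder (substitute the provider's real OPTIMADE base URL), structures_url is a hypothetical helper, and the filter is just an example.

```python
from urllib.parse import urlencode


def structures_url(base_url, optimade_filter, response_fields):
    """Build a /v1/structures query URL that explicitly lists response_fields."""
    query = urlencode(
        {
            "filter": optimade_filter,
            # response_fields is a comma-separated list of property names.
            "response_fields": ",".join(response_fields),
        }
    )
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Placeholder base URL for illustration; not a real provider.
url = structures_url(
    "https://example.org/optimade",
    'elements HAS ALL "Si","O"',
    ["species", "species_at_sites", "lattice_vectors", "cartesian_site_positions"],
)
print(url)
```

Listing the fields explicitly means the request does not depend on what each provider chooses to return by default.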


Thank you very much, @ml-evs! I’ll surely use it in future projects!