How is the 'theoretical' tag determined?

Hi,

I noticed that a new tag, ‘theoretical’, is attached to every entry to indicate whether the structure originated from experimental work or is purely theoretical. How is this tag determined?

For example, if an experimental ICSD entry has partial occupancies and the MP entry does not because it has been ordered, would this MP entry get True or False?

When the MP structure is compared to ICSD entries, is this done with the structure matcher of pymatgen, or is only the composition compared? I am curious what kind of properties get compared to determine this.

Lastly, I also see that the query has a tag called “icsd_ids”. Would a structure that has nothing listed under ‘icsd_ids’ always be a hypothetical structure? I ask because, for some Materials Project entries, I would like to systematically determine whether the material has been synthesized (has an ICSD ID) or is a hypothetical structure derived from other structures (for example, by replacing Na with Li in Na-oxides). Among the ~1,200 MP structures that I queried, only 61 have the “icsd_ids” tag, so I wonder whether many entries are in fact derived from ICSD structures but have no ‘icsd_ids’ recorded in the database.

Thank you for helping me out!

Best,
KyuJung

Hi @KyuJung_Jun

The theoretical tag comes from the ICSD database. If any of the matching ICSD structures is tagged as experimental, then we tag our material as theoretical: False.

Right now, we don’t match to partial-occupancy ICSD entries, although that is in the works. We also plan on adding more databases such as TCOD and PCOD, as well as the Pauling Files.

For ordered structures, we use the structure matcher on the final MP structure as well as the initial structures to determine if they match to an ICSD entry. This way, we capture any structures that might change significantly in DFT relaxation.
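To make the matching step above a bit more concrete, here is a minimal sketch, not the actual MP pipeline. The function name `matches_any_icsd` and the pluggable `fit` comparator are hypothetical; in practice one could pass pymatgen's `StructureMatcher().fit` as the comparator, while the toy usage below just compares strings.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def matches_any_icsd(
    mp_structures: Iterable[S],
    icsd_structures: Iterable[S],
    fit: Callable[[S, S], bool],
) -> bool:
    """Return True if any MP structure (final or initial) matches any ICSD structure.

    `fit` is an assumed comparator that returns True when two structures
    are considered equivalent.
    """
    icsd_list = list(icsd_structures)  # materialize so it can be re-iterated
    return any(fit(mp, ref) for mp in mp_structures for ref in icsd_list)


# Toy usage: strings stand in for structures, equality stands in for matching.
final_structure = "rocksalt"
initial_structures = ["rocksalt-distorted", "wurtzite"]
icsd_entries = ["wurtzite"]

matched = matches_any_icsd(
    [final_structure, *initial_structures],
    icsd_entries,
    lambda a, b: a == b,
)
print(matched)
```

Checking the initial structures as well as the final one is what lets a match survive a relaxation that changes the structure significantly.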

Yes, a structure with no icsd_ids should always be theoretical: False for now. In the future, that will change as we add more databases that we match to.

Hi @shyamd

Thank you for your reply! Then, for MP entries with partial occupancy, would there be a way to check whether a given entry comes from an ordering of an ICSD CIF file (for instance, any tags that comment on the origin of the structure)? In principle, if an MP entry is just an ordering of a disordered ICSD entry, then in some sense it may represent the disordered ICSD entry.

Unfortunately not for now. A lot of the orderings come from external collaborators who didn’t keep that provenance. As I mentioned, we are working on adding that by using structure matching to the disordered structures.

I see. Thank you for helping me out. For now, perhaps I will do composition matching to be conservative.
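A rough sketch of what such a composition check could look like, assuming each composition is given as a non-empty dict of integer element counts (the helper names are hypothetical). With pymatgen available, comparing `Composition(...).reduced_formula` values would serve the same purpose.

```python
from functools import reduce
from math import gcd


def reduced_composition(counts: dict) -> dict:
    """Divide all element counts by their greatest common divisor."""
    divisor = reduce(gcd, counts.values())
    return {element: n // divisor for element, n in counts.items()}


def same_composition(a: dict, b: dict) -> bool:
    """True if two compositions are identical after reduction."""
    return reduced_composition(a) == reduced_composition(b)


# Li4O2 reduces to Li2O, so these two count as the same composition.
print(same_composition({"Li": 4, "O": 2}, {"Li": 2, "O": 1}))
# Na2O and Li2O differ in their elements.
print(same_composition({"Na": 2, "O": 1}, {"Li": 2, "O": 1}))
```

Note that matching on composition alone cannot tell polymorphs apart, so it only narrows down the candidates.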

Sorry to necro this but I just want to clarify. Did you mean

Yes, a structure with no icsd_ids should always be theoretical: True for now.

?

Thanks,
Kyle

Yes, that is correct. Basically, the only way we can determine if something has been synthesized is from the ICSD for now. If the ICSD tags a matching entry as experimental, we tag the material as theoretical = False; otherwise it defaults to True.
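For anyone landing here later, the rule as described in this thread can be sketched as follows. This is only an illustration: the `IcsdMatch` class and `is_theoretical` function are made-up names, not part of the MP codebase or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IcsdMatch:
    """Hypothetical record for an ICSD entry matched to an MP material."""
    icsd_id: int
    experimental: bool


def is_theoretical(matches: List[IcsdMatch]) -> bool:
    """Default to True; become False once any matched entry is experimental."""
    return not any(match.experimental for match in matches)


# No matching ICSD entries (empty icsd_ids): stays theoretical.
print(is_theoretical([]))
# At least one experimental match: no longer theoretical.
print(is_theoretical([IcsdMatch(1, False), IcsdMatch(2, True)]))
```

The ICSD IDs above are placeholders. Under this rule, a material with no icsd_ids always ends up with theoretical = True.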