It’s been ~4 years since "How to query date of entry addition to the Materials Project?" was posted, and this is pretty critical to an upcoming manuscript on generative models, so I’m hoping to get a bit more visibility. In particular, I’d like to unconditionally generate some number of candidate structures (e.g. 10 million) and see how many match the second half of a time-split of Materials Project entries, e.g. all entries that were added after YYYY/MM/DD (whatever date approximately splits the entries in half). Since this is more of a materials discovery context, I’m interested in the first date an entry ever appeared on Materials Project, so created_at looks promising (similar to what was mentioned in the linked matsci post above). It would be great to get confirmation of whether that’s the case.
Any recommendations on going about this? Higher-level feedback and alternative suggestions are welcome too.
Hi @sgbaird, one of the easiest ways to do this would be to look at the mp-id. We assign those sequentially, so higher mp-ids are newer materials. You could use that to figure out where to split the materials. I’m not as familiar with the specifics of how to determine when a material was first added though.
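As a rough illustration of the mp-id idea, here is a minimal sketch that orders IDs by their numeric suffix and cuts the list in half. It assumes IDs look like `mp-1234` (a prefix, a hyphen, then an integer); the helper names `mp_id_number` and `split_by_mp_id` and the sample IDs are made up for this example and are not part of any Materials Project client.

```python
import re


def mp_id_number(mp_id: str) -> int:
    """Return the integer suffix of an ID such as 'mp-1234'."""
    match = re.fullmatch(r"[a-z]+-(\d+)", mp_id)
    if match is None:
        raise ValueError(f"unexpected ID format: {mp_id!r}")
    return int(match.group(1))


def split_by_mp_id(mp_ids):
    """Split IDs into (older_half, newer_half) by numeric suffix.

    Assumes a larger number means a more recently added material.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique_ids = list(dict.fromkeys(mp_ids))
    ordered = sorted(unique_ids, key=mp_id_number)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


# Hypothetical IDs, only to show the call.
older, newer = split_by_mp_id(["mp-10", "mp-2", "mp-1000", "mp-33"])
print(older, newer)
```

Sorting on the integer rather than the string matters here, since a plain string sort would place `mp-1000` before `mp-2`.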
As an FYI, you’ll probably want to make sure to pull data from our new API rather than the legacy API in mapidoc. You might be able to look at the dates of the individual tasks listed in origins to see the first appearance of a material.
FYI, I frequently generate splits in time using the ICSD information. You can do this via the references attribute of the snl object that can be returned by the legacy MPRester.query. You can then parse the references attribute using pybtex and get the years. I have a script somewhere to do this - I’ll take a look and see if I can post it later today.
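This is not the script mentioned above, just a minimal sketch of the year-extraction step. It assumes each references value is a BibTeX string, and it swaps pybtex for a standard-library regular expression so the example has no dependencies; a real BibTeX parser will handle edge cases this pattern misses. The function names, the `cutoff_year` parameter, and the sample data are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches year = {1998}, year = "1998", or year = 1998 (case-insensitive).
YEAR_FIELD = re.compile(r'\byear\s*=\s*[{"]?\s*(\d{4})', re.IGNORECASE)


def earliest_year(bibtex: str) -> Optional[int]:
    """Return the earliest year found in a BibTeX string, or None."""
    years = [int(y) for y in YEAR_FIELD.findall(bibtex)]
    return min(years) if years else None


def split_by_year(
    references_by_id: Dict[str, str], cutoff_year: int
) -> Tuple[List[str], List[str], List[str]]:
    """Group IDs into (before cutoff, cutoff or later, no year found)."""
    older, newer, undated = [], [], []
    for material_id, bibtex in references_by_id.items():
        year = earliest_year(bibtex)
        if year is None:
            undated.append(material_id)
        elif year < cutoff_year:
            older.append(material_id)
        else:
            newer.append(material_id)
    return older, newer, undated


# Made-up reference strings, only to show the call.
sample = {
    "mp-1": '@article{a, title = {X}, year = {1998}}\n@article{b, year = "2005"}',
    "mp-2": "@article{c, year = 2012}",
    "mp-3": "@misc{d, title = {No year here}}",
}
print(split_by_year(sample, cutoff_year=2000))
```

Taking the earliest year per entry treats the first publication as the discovery date, and entries with no parseable year are kept in a separate group rather than silently dropped.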
@rkingsbury it makes a lot of sense that MPIDs would be assigned sequentially, and that would certainly be the most straightforward approach. Thanks! Great suggestion about converting over to the new API. The directions in the "Accessing Data" section seem pretty straightforward.