As per @Aaron_Kaplan's suggestion, I downloaded all the tasks with:
aws s3 cp s3://materialsproject-parsed/tasks/ mp_tasks --recursive --no-sign-request
However, mp-505260 (which is in mptrj) is not present. I can pass that material ID to mpr.materials.thermo.search, or the Mo-Te-I chemsys (or the web site), and it definitely exists as a material ID for an entry. But the string mp-505260 is not present in the tasks manifest.jsonl file, nor is it in any of the downloaded jsonl task data files.
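For anyone wanting to repeat this kind of check, here is a minimal Python sketch of a string search over the downloaded files. It assumes the data files end in .jsonl or .jsonl.gz (the compression is an assumption, not something confirmed in this thread), and find_id_in_jsonl is a hypothetical helper name, not part of any Materials Project tooling.

```python
import gzip
import re
from pathlib import Path


def find_id_in_jsonl(root, needle):
    """Return sorted paths of .jsonl / .jsonl.gz files under root that mention needle.

    Hypothetical helper: a plain text search, not a parse of the task documents.
    """
    # Reject matches that are part of a longer ID, e.g. mp-5052601 for mp-505260.
    pattern = re.compile(r"(?<![\w-])" + re.escape(needle) + r"(?!\d)")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.name.endswith(".jsonl.gz"):
            opener = gzip.open  # assumed: some files may be gzip-compressed
        elif path.name.endswith(".jsonl"):
            opener = open
        else:
            continue
        with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
            # Stop reading a file at the first line that mentions the ID.
            if any(pattern.search(line) for line in fh):
                hits.append(path)
    return hits
```

Calling find_id_in_jsonl("mp_tasks", "mp-505260") on the download directory from the command above returns an empty list when the ID appears in none of the files, manifest.jsonl included.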
mptrj includes 3 task IDs for this material ID:
task_id=mp-1364910
task_id=mp-505260
task_id=mp-705076
Is this just a situation of mptrj containing a task that’s been removed for some reason?
Hi @noam.bernstein, can you try using s3://materialsproject-parsed/tasks_atomate2/format=jsonl/ as your S3 URI?
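With the suggested URI, the download command from the start of this thread could be driven from Python roughly as follows. This is only a sketch: it reuses the aws s3 cp flags from the original command, the local directory name mp_tasks_atomate2 and both helper names are hypothetical, and it assumes the AWS CLI is installed and on PATH.

```python
import subprocess

# URI suggested in this thread for the updated tasks collection.
TASKS_URI = "s3://materialsproject-parsed/tasks_atomate2/format=jsonl/"


def build_download_command(uri=TASKS_URI, dest="mp_tasks_atomate2"):
    """Build the same `aws s3 cp` invocation as the original post, with a new URI."""
    return ["aws", "s3", "cp", uri, dest, "--recursive", "--no-sign-request"]


def download_tasks(uri=TASKS_URI, dest="mp_tasks_atomate2"):
    """Run the download; raises CalledProcessError if the AWS CLI exits non-zero."""
    subprocess.run(build_download_command(uri, dest), check=True)
```

Keeping the command builder separate from the call that runs it makes it easy to print and inspect the command before starting a large download.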
We recently updated the tasks endpoint; some related details are in the release notes for our latest database release, 2024.11.14.
Thanks - I’ll initiate another download and let you know what I find
Yes - the fresh download contains the mp ID that I couldn’t find before. Next I’ll run my real script and see if anything else is missing.
Great! Do let us know if you run into any other issues.
I do have some issues with mptrj, but they are unrelated to this issue. I haven’t checked every material yet, but so far so good in terms of missing data.
Glad to hear there are no further missing data issues. I'm going to mark this issue as resolved, then.
Re: your issues w/ MPTrj, would you mind opening a separate topic if you need further assistance?