I am trying to find a way to download a random subset of data rather than every material returned by a specific “field” search. The goal is to obtain random molecules with specific fields to train a machine learning model. Is there a command for this? Thanks in advance!
Hi @ipatel830, maybe I am misunderstanding, but is it not as simple as taking a random sample of entries from the full molecules dataset?
Regardless of how you choose to sample the dataset, I would recommend pulling the full molecules dataset and storing it locally, which makes it much easier to try different sampling methods. See list item 4 here: Tips for Large Downloads | Materials Project Documentation
Yes, that was an option I saw, but I wanted to check whether there was a command to download a random subset from the start. If that is not available, then I will just do that.
No, there is no random retrieval functionality in the Python client, so doing the sampling on your end will be your best option.
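For anyone landing here later, a minimal sketch of what sampling on your end could look like. It assumes the dataset has already been downloaded and loaded as a list of dictionaries; the helper `sample_entries` and the field names (`molecule_id`, `charge`, `dipole`) are made up for illustration and are not part of the Materials Project client. It uses only the standard library.

```python
import random


def sample_entries(entries, k, required_fields=(), seed=None):
    """Return up to k randomly chosen entries that have every required field set.

    Hypothetical helper for illustration, not part of any Materials Project API.
    """
    # Keep only entries where each required field is present and not None.
    eligible = [
        entry for entry in entries
        if all(entry.get(field) is not None for field in required_fields)
    ]
    # A seeded generator makes the sample reproducible between runs.
    rng = random.Random(seed)
    # random.sample raises ValueError if k exceeds the population size, so cap it.
    return rng.sample(eligible, min(k, len(eligible)))


# Toy stand-in for the locally stored molecules dataset (assumed structure).
molecules = [
    {
        "molecule_id": f"mol-{i}",
        "charge": i % 3 - 1,
        "dipole": None if i % 4 == 0 else 0.1 * i,
    }
    for i in range(100)
]

subset = sample_entries(molecules, k=10, required_fields=("charge", "dipole"), seed=0)
print(len(subset))  # 10
```

Fixing the seed means the same subset comes back every time, which helps when comparing models trained on the same sample. Changing or omitting the seed gives a fresh draw without downloading anything again.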