Download single property for all compounds in the database

The current API seems to be mostly tailored toward downloading all information about a single compound. Would it be possible to download a single piece of information about all the compounds?

In particular the request from this site

with an empty material parameter would return all materials in the entire database, but it fails because the request is too large. How would I go about creating this request? Is it possible to download the backend database in its entirety?

My application is machine learning, and I need a rather large data set to build my models.

For future reference, I did find a workable solution:

from pymatgen import MPRester
import urllib.request
import json

if __name__ == "__main__":
    MAPI_KEY = "XXXXX"  # You must change this to your Materials API key! (or set the MAPI_KEY env variable)
    # Fetch a list of all available material IDs
    with urllib.request.urlopen('') as myurl:
        data = json.loads(myurl.read().decode())
        material_ids = data['response']  # roughly 75,000 material IDs are returned
    with MPRester(MAPI_KEY) as m:  # object for connecting to the MP REST interface
        # To avoid straining the servers, this only uses the first 4 materials
        criteria = {'material_id': {'$in': material_ids[:4]}}
        properties = ['energy', 'pretty_formula']  # list a few quantities of interest
        data = m.query(criteria, properties)

Hi Vikingscientist,

You were hitting the size limit on returned results, which keeps the API from getting overloaded. The API is well-suited to return the information you’re looking for, but you have to break your query up into smaller batches to stay under this limit.

Whenever I need to do something similar to what you’re trying to do, I first query for all the mp-id’s using the MPRester and store them in a python list. After that, I iterate through the list of mp-id’s and query for the properties of interest about 1000 materials at a time, depending on the property.

r = MPRester()
# query() returns a list of dicts, so pull out the ID strings
mp_ids = [doc["material_id"] for doc in r.query({}, ["material_id"])]
chunk_size = 1000
sublists = [mp_ids[i:i+chunk_size] for i in range(0, len(mp_ids), chunk_size)]

Then you can query for each sublist:

results = []
for sublist in sublists:
    results += r.query({"material_id": {"$in": sublist}}, ["pretty_formula", "structure"])
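
If it helps, the batching pattern above can be wrapped in a small helper that does not depend on the API client itself. This is only a sketch: the names chunked, query_in_batches and fetch are made up for illustration and are not part of pymatgen. The fetch argument is assumed to be any function that takes a list of material IDs and returns a list of result dicts.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def query_in_batches(
    ids: Sequence[str],
    fetch: Callable[[List[str]], List[Dict]],
    chunk_size: int = 1000,
) -> List[Dict]:
    """Call `fetch` once per batch of IDs and concatenate the results."""
    results: List[Dict] = []
    for batch in chunked(ids, chunk_size):
        results.extend(fetch(list(batch)))
    return results


if __name__ == "__main__":
    # Stand-in for a real API call (hypothetical), so the sketch runs offline.
    def fake_fetch(batch: List[str]) -> List[Dict]:
        return [{"material_id": mp_id} for mp_id in batch]

    ids = [f"mp-{n}" for n in range(1, 2501)]
    rows = query_in_batches(ids, fake_fetch, chunk_size=1000)
    print(len(rows), "rows fetched")
```

With the MPRester object r from above, fetch could be something like lambda batch: r.query({"material_id": {"$in": batch}}, ["pretty_formula", "structure"]). Keeping the batching separate from the query makes it easy to change the chunk size per property, or to save each batch to disk before requesting the next one.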


@Vikingscientist Hello, sorry if this question is a bit late. How do I properly use the code you posted with the database? Any help would be greatly appreciated.