Just wanted to update this thread with the latest examples for retrieving MPContribs datasets. Pagination is now handled automatically by the client (make sure to update to the latest version of mpcontribs-client), and configuration parameters are easier to set using the client.query_contributions() function. The client can also be initialized with a specific project. The previous example becomes:
from mpcontribs.client import Client
client = Client(apikey="your-api-key-here", project="carrier_transport")
client.available_query_params() # print list of available query parameters
query = {"formula__contains": "Au", "data__PF__p__value__lt": 10}
fields = ["identifier", "formula", "data.metal", "data.S.n.value"]
client.query_contributions(
query=query, fields=fields, sort="-data.S.n.value", paginate=True
)
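Judging from the example above, query keys separate nested fields with double underscores and end in an operator (contains, lt), whereas fields and sort use dotted paths. If that is easy to mix up, a small helper can build the keys from dotted paths. Note that query_key is a hypothetical convenience function written for this post, not part of mpcontribs-client:

```python
def query_key(field, operator=None):
    """Turn a dotted field path into a double-underscore query key.

    Hypothetical helper (not part of mpcontribs-client).
    """
    key = field.replace(".", "__")
    return f"{key}__{operator}" if operator else key


# Rebuilds the query dictionary used in the example above.
query = {
    query_key("formula", "contains"): "Au",
    query_key("data.PF.p.value", "lt"): 10,
}
```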
By default, paginate is False, which retrieves only the first page of results. Use this default to test the query, fields, and sort parameters before paginating through all results.
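One way to follow that advice is to wrap the two calls in a small function: fetch the first page, check it, and only then request everything. This is just a sketch. preview_then_fetch and the looks_ok callback are hypothetical names, and the function only uses the query_contributions arguments shown above:

```python
def preview_then_fetch(client, query, fields, sort, looks_ok):
    """Check the first page of results before paginating through all of them.

    Hypothetical helper. `looks_ok` is a caller-supplied function that
    receives the first-page result and returns True to continue.
    """
    # paginate is left at its default (False): only the first page is retrieved.
    preview = client.query_contributions(query=query, fields=fields, sort=sort)
    if not looks_ok(preview):
        return None  # stop here so the query can be adjusted
    # The first page looked fine, so retrieve all pages.
    return client.query_contributions(
        query=query, fields=fields, sort=sort, paginate=True
    )


# Usage (needs a valid API key and network access):
# from mpcontribs.client import Client
# client = Client(apikey="your-api-key-here", project="carrier_transport")
# result = preview_then_fetch(client, query, fields, "-data.S.n.value", bool)
```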
If entire projects or large subsets of contributed data are downloaded for later use, it is often more efficient to use the client.download_contributions() function. It also takes a query as an argument and downloads all results as json.gz files behind the scenes. Each time download_contributions is run, only data that is missing locally is downloaded, and the contributions are then loaded from disk. This function always retrieves all fields included in the data component, so the fields argument is not available or needed. Additional components (i.e. structures, tables, and attachments) can be included in the downloads through the include argument:
client.download_contributions(query=query, include=["tables"])
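If the include list is built dynamically, it may be worth checking the component names before starting a large download. The sketch below assumes the three component names listed above are the only valid ones. download_with_components is a hypothetical wrapper, not part of mpcontribs-client:

```python
# Component names as listed in this post (assumed to be the complete set).
COMPONENTS = ("structures", "tables", "attachments")


def download_with_components(client, query, include=()):
    """Validate component names, then call client.download_contributions().

    Hypothetical wrapper. It passes through whatever the client returns.
    """
    unknown = [name for name in include if name not in COMPONENTS]
    if unknown:
        raise ValueError(f"unknown components: {unknown}")
    return client.download_contributions(query=query, include=list(include))


# Usage (needs a valid API key and network access):
# download_with_components(client, query, include=["tables", "structures"])
```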