What is an `initial_structure` when multiple tasks are carried out?

I am interested in getting the pre-relaxation structures for some materials in MP for a project idea. These appear to be available through the API as initial_structures, but there is some ambiguity: MP runs many different calculations for its materials, so this initial structure might be the input to a later-stage workflow (e.g. elastic constants), in which case it could actually be the relaxed structure from the earlier relaxation. Is this the case, and if so, is there any way to work out which task would give me the least relaxed initial structure?

If you want the unrelaxed structure, that would be the initial_structure of the relaxation calculation. The relaxed structure is then used as the starting point for static, NSCF, etc.

When I query original_task_id from the API, it returns the material_id for every trial material I tried. As far as I understand, the material_id is the most recent task_id for a material, so this behaviour would not let me get the initial structure of the structure optimization.

EDIT: actually, for the entirety of MP, original_task_id = material_id. Is this a bug, or have I got the pattern for material_id mixed up and it’s always the first task, i.e. the GGA structure optimization?

The material_id should be the first (i.e. smallest) task_id for a given material.
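
As a minimal sketch of that rule, assuming task_ids follow the usual prefix-number pattern (e.g. mp-1234) and that "smallest" means the smallest numeric suffix, the first task can be picked out of a list like this. The helper names and the example IDs are hypothetical, purely for illustration.

```python
import re


def task_number(task_id: str) -> int:
    """Return the integer suffix of an ID such as 'mp-1234' or 'mvc-12'."""
    match = re.fullmatch(r"[A-Za-z]+-(\d+)", task_id)
    if match is None:
        raise ValueError(f"unexpected task_id format: {task_id!r}")
    return int(match.group(1))


def smallest_task_id(task_ids):
    """Pick the task_id with the smallest numeric suffix."""
    return min(task_ids, key=task_number)


# Made-up task_ids for one material (assumption, not real MP data)
example_ids = ["mp-900001", "mp-1234", "mp-56789"]
print(smallest_task_id(example_ids))  # prints mp-1234
```

Comparing the numeric suffix rather than the raw strings matters here, since a plain string comparison would rank mp-56789 above mp-900001.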

So the initial_structure from the API should be the pre-relaxation input to the GGA structure optimization in most cases?

Edit: it appears that you need to query each task URL individually to get the true initial structure. I haven’t found a way to get this information from the material card on the query URL.

For this query, can you specify whether you’re using the new API or the legacy API?

The initial_structures field in the /materials endpoint is a list of all initial structures for relaxations that match that material_id.

For a specific calculation, once you have the task_id, you can use the /tasks endpoint to retrieve the input structure.

This information is for the new API.

Hope that helps,

Matt

I think I am probably using the old API via MPRester? This is the code snippet I would want to use, if not for the fact that it requires so many requests.

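# Assumed imports: the legacy client shipped with pymatgen, plus tqdm
from pymatgen.ext.matproj import MPRester
from tqdm import tqdm

# df is a pandas DataFrame with a "material_id" column, defined elsewhere
m = MPRester()
data = [m.get_task_data(
    chemsys_formula_id=idx,
    prop="initial_structure",
) for idx in tqdm(df["material_id"].tolist())]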

Essentially, I’ve been asked for an experiment that requires the initial structures, and querying them with this snippet appears to require 100k queries. I emailed the heavy API email address and also tried using wildcards to query the URL directly, but could only do that with compositions, not IDs. The direct URL I was making requests to is https://materialsproject.org/rest/v2/tasks/{material_id/task_id}/initial_structure
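
If one request per ID really is unavoidable, a rough sketch of a resumable loop is below: each response is cached on disk, so an interrupted run can pick up where it stopped instead of repeating earlier requests. It uses only the standard library and the URL pattern above. The X-API-KEY header, the pause length, the cache directory name and all function names are assumptions for illustration, not something confirmed in this thread.

```python
import json
import time
import urllib.request
from pathlib import Path

# URL pattern quoted in the post above
BASE_URL = "https://materialsproject.org/rest/v2/tasks/{task_id}/initial_structure"


def build_url(task_id):
    """Fill the task_id (or material_id) into the URL pattern."""
    return BASE_URL.format(task_id=task_id)


def fetch_json(url, api_key):
    """GET one URL and parse the JSON body.

    Sending the key in an X-API-KEY header is an assumption about the legacy API.
    """
    request = urllib.request.Request(url, headers={"X-API-KEY": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_all(task_ids, api_key, cache_dir="initial_structures",
              pause=0.2, fetch=fetch_json):
    """Fetch every ID once, writing each response to cache_dir as JSON."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    results = {}
    for task_id in task_ids:
        path = cache / f"{task_id}.json"
        if path.exists():
            # Already downloaded in an earlier run: read from disk, no request
            results[task_id] = json.loads(path.read_text())
            continue
        data = fetch(build_url(task_id), api_key)
        path.write_text(json.dumps(data))
        results[task_id] = data
        time.sleep(pause)  # short pause between requests to go easy on the server
    return results
```

Usage would be something like fetch_all(df["material_id"].tolist(), api_key="..."). This does not reduce the number of requests, it only makes a long run safe to interrupt; the fetch argument is there so the download step can be swapped out, for example for a stub when testing.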