Download ABINIT input files from NOMAD

Ryotaro_Okabe · February 28, 2023, 8:10pm

We would like to request the ABINIT input files for phonon calculations from the NOMAD database.
We tried downloading the required files with the following optimade command:
optimade-get --filter '_nmd_origin HAS "AUTHOR_NAME"' --max-results-per-provider 100

With this command we can get the structure data from NOMAD for AUTHOR_NAME. However, we still do not know how to get the input files for ABINIT’s anharmonic phonon calculations.
We would be grateful if you could show us how to download the data we want from the command line. We would like to know the following:

1. Could you tell us a general method for downloading the ABINIT input files from NOMAD on the command line?
2. How can we select which materials to download? With the command shown above, I could choose how many materials to fetch, but not which ones. For example, we first want to get the structure and input files of crystals with a single atomic species (e.g. Si, Ge, …). Afterwards we want to access materials with up to 5~10 atoms per unit cell. We would like to know the best way to get data for specific materials.

mscheidgen · March 1, 2023, 4:43pm

The OPTIMADE API only gives you standardised JSON data, but no files. NOMAD has its own dedicated API, which includes endpoints to download files.

I am not an expert on ABINIT. You can download all the files that were uploaded, but I can’t tell whether this includes “ABINIT’s anharmonic phonon calculation”. Each ABINIT calculation comes with multiple files. NOMAD only links the main output file to its entries; this is what we call the mainfile, e.g. run.abo. But you can download all files in the same directory (e.g. run.log or run.abi). I hope what you are looking for is in run.abi. Every time you download raw files from a NOMAD entry, you download all the files in the mainfile directory.

This is the endpoint you want to use. Click and see its documentation in our API dashboard. Notice the json_query parameter. You can copy the JSON from the NOMAD search interface via the <> button on top of the filters menu. Your query should look like this:

{
  "results.material.n_elements": {
    "gte": 2,
    "lte": 2
  },
  "results.method.simulation.program_name:any": [
    "ABINIT"
  ],
  "authors.name:any": [
    "AUTHOR NAME"
  ]
}

If you are only looking for calculations that have a particular file in their directories, you can add another criterion to your query, e.g. "files.path": "run.abi", to only find entries that also have a file named run.abi.
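As a rough sketch of how such a query could be assembled in a script, the helper below builds the same JSON with an optional "files.path" criterion. The function name build_query and its parameters are hypothetical and only for illustration; the query keys are the ones shown above. Setting both bounds of n_elements to 1 should restrict the search to single-element crystals.

```python
import json


def build_query(author, n_elements, program="ABINIT", required_file=None):
    """Assemble a NOMAD search query as a plain dict.

    build_query is a hypothetical helper; only the keys are taken from
    the query shown above.
    """
    query = {
        # gte == lte pins the number of distinct elements to one value.
        "results.material.n_elements": {"gte": n_elements, "lte": n_elements},
        "results.method.simulation.program_name:any": [program],
        "authors.name:any": [author],
    }
    if required_file is not None:
        # Only match entries whose directory also contains this file.
        query["files.path"] = required_file
    return query


# Single-element crystals (e.g. Si, Ge) that also have a run.abi file.
query = build_query("AUTHOR NAME", n_elements=1, required_file="run.abi")
print(json.dumps(query, indent=2))
```

The printed JSON can be pasted into the json_query parameter or sent as a request body.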

To make it easier with curl, you can use the GET endpoint. The downside is that you have to URL-encode the JSON in the json_query query parameter (-G, --data-urlencode), which looks a bit ugly. With Python and requests, I would recommend using the corresponding POST endpoint.

curl -G https://nomad-lab.eu/prod/v1/api/v1/entries/raw --data-urlencode 'json_query={"results.material.n_elements": {"gte": 2, "lte": 2},"results.method.simulation.program_name:any": ["ABINIT"],"authors.name:any": ["AUTHOR NAME"]}' -o download.zip
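For the Python route, a minimal sketch of the POST variant might look like the following. It uses the standard library's urllib.request instead of requests so that it has no dependencies; with requests, the equivalent call would be a requests.post with the same JSON body. The URL path entries/raw/query and the {"query": ...} body shape are assumptions about the POST counterpart of the endpoint above, so please confirm them in the API dashboard. The helper names are hypothetical.

```python
import json
import shutil
import urllib.request

# Assumption: POST counterpart of the GET entries/raw endpoint; confirm the
# exact path and body shape in the API dashboard.
RAW_QUERY_URL = "https://nomad-lab.eu/prod/v1/api/v1/entries/raw/query"


def build_raw_request(query):
    """Wrap a search query in a JSON POST request (nothing is sent yet)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        RAW_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download(request, target="download.zip"):
    """Send the request and stream the .zip response to disk."""
    with urllib.request.urlopen(request) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


query = {
    "results.material.n_elements": {"gte": 2, "lte": 2},
    "results.method.simulation.program_name:any": ["ABINIT"],
    "authors.name:any": ["AUTHOR NAME"],
}
request = build_raw_request(query)
# download(request)  # uncomment to fetch; the archive can be hundreds of MB
```

Streaming the response to disk avoids holding the whole archive in memory.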

This will give you a .zip file with all the files that were uploaded in the same directories as the run.abo files. The .zip is organized in the same folder structure that the uploader used. The only difference is the top-level directories, which are named after the upload ids of the uploads that the files are in. There is also a manifest.json in the .zip with more details. For this query with a certain someone, it is ~790MB.
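Once the archive is on disk, something like the sketch below could be used to find the input files without unpacking everything. It relies only on Python's zipfile module. The helper names are hypothetical, the .abi suffix follows the run.abi example above, and manifest.json is assumed to sit at the top level of the archive; its content is parsed as-is because its structure is not described here.

```python
import json
import zipfile


def list_inputs(zip_path, suffix=".abi"):
    """Return the archive member paths that end with the given suffix."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(suffix))


def read_manifest(zip_path):
    """Parse manifest.json (assumed to be at the top level of the archive)."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open("manifest.json") as handle:
            return json.load(handle)


# Example usage after downloading:
# for path in list_inputs("download.zip"):
#     print(path)  # the first path component is the upload id
```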

Feel free to explore the rest of the API as well. I hope this helps.