Download data from NOMAD repository

I am trying to download band gap data for materials from the NOMAD repository using the Python API.
I followed the online tutorial, but the cell that downloads the data and creates the DataFrame fails with this error:

"AttributeError Traceback (most recent call last)
in
17 calc = results[entry].workflow[0].calculation_result_ref
18 convergence_energy_diff = results[entry].workflow[0].geometry_optimization.final_energy_difference.m
---> 19 formula_red = calc.system_ref.chemical_composition_reduced
20 space_group = results[entry].results.material.symmetry.space_group_number
21 elements = np.sort(calc.system_ref.atoms.species)

AttributeError: 'NoneType' object has no attribute 'system_ref'"

I also tested the example from the earlier thread on this topic, "Bandgap values on NOMAD Archive", but its output doesn’t contain any useful data.

How can I solve these problems?

Hi @Samira_Baninajarian!

It looks like at least one of the calculations is missing a piece of information (results[entry].workflow[0].calculation_result_ref is undefined, so calc is None), which makes the query fail in query_nomad_archive.ipynb. A simple solution might be to catch the AttributeError in the code and skip the affected entries. We can see how to fix this in the original tutorial.
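As a rough illustration of skipping incomplete entries, here is a minimal sketch. The helper name extract_row and the SimpleNamespace stand-in objects are hypothetical and only mimic the attribute paths seen in the traceback; in the notebook you would pass the real results[entry] objects instead.

```python
from types import SimpleNamespace


def extract_row(entry):
    """Return a dict for one archive entry, or None if data is missing.

    Hypothetical helper: it only uses attribute paths from the traceback.
    """
    try:
        calc = entry.workflow[0].calculation_result_ref
        return {
            'formula_red': calc.system_ref.chemical_composition_reduced,
            'space_group': entry.results.material.symmetry.space_group_number,
        }
    except (AttributeError, IndexError, TypeError):
        # A section is None, empty or absent: skip this entry.
        return None


# Stand-in objects (assumptions for illustration, not real NOMAD data).
symmetry = SimpleNamespace(space_group_number=225)
results_section = SimpleNamespace(material=SimpleNamespace(symmetry=symmetry))
system = SimpleNamespace(chemical_composition_reduced='NaCl')
good = SimpleNamespace(
    workflow=[SimpleNamespace(
        calculation_result_ref=SimpleNamespace(system_ref=system))],
    results=results_section,
)
# Same shape as the failing entry: calculation_result_ref is None.
bad = SimpleNamespace(
    workflow=[SimpleNamespace(calculation_result_ref=None)],
    results=results_section,
)

rows = [row for row in map(extract_row, [good, bad]) if row is not None]
print(rows)
```

Only the complete entry ends up in rows; the entry with the missing reference is dropped instead of aborting the whole loop.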

But maybe we can help you more directly to formulate the query you are interested in. The example query you linked about querying band gap values in NOMAD does seem to work as expected: it returns a list of band gap values and types for the first 10 entries with a band gap between 0 and 5 eV, ordered by entry_id. Maybe you could tell us more about what type of information you are interested in?

Hi Lauri Himanen,
Thanks for your response.
I need to build a DataFrame of band gap information for materials. For this, I need data such as the material’s name, space group, band gap value and so on.
I installed nomad-lab and tested this example too, but I did not succeed.
I would be grateful if you could help me solve this problem.

Hi @Samira_Baninajarian,

You are looking at an old version of our documentation, which is why the examples don’t work. Here is a link to the documentation corresponding to the newest production release. The basics are covered in the API how-to guide, but here is something that might get you started:

import json

import requests

# Search NOMAD entries that report a band gap for a bulk structure.
response = requests.post(
    'https://nomad-lab.eu/prod/v1/api/v1/entries/query',
    json={
        # Filters: a band gap value is present (>= 0) and the structure is bulk.
        'query': {
            'results.properties.electronic.band_structure_electronic.band_gap.value': {
                'gte': 0,
            },
            'results.material.structural_type': 'bulk'
        },
        # Only the first page of hits is returned.
        'pagination': {
            'page_size': 1000
        },
        # Restrict the response to the fields needed for the table.
        'required': {
            'include': [
                'entry_id',
                'results.properties.electronic.band_structure_electronic.band_gap.value',
                'results.properties.electronic.band_structure_electronic.band_gap.type',
                'results.material.symmetry.space_group_number',
                'results.material.chemical_formula_hill'
            ]
        }
    })
response.raise_for_status()  # Fail early on an HTTP error

response_json = response.json()
print(json.dumps(response_json, indent=2))

There are several things you should take into account in this example:

  • These search results correspond to individual calculations, not unique materials. This means that the results will contain band gaps calculated with different programs and approximations for the same material.
  • You should modify the query part to fit your needs: currently, it only makes sure that you get bulk structures where the band gap has been reported. You probably want to add several other filters (level of theory, program name, etc.).
  • This script only returns the first 1000 search hits. If your query matches more entries than that, you will have to paginate through the results in a loop to retrieve them all.
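To make the last point concrete, here is a minimal sketch of a pagination loop that also flattens each hit into a table row. It assumes that the response carries a pagination.next_page_after_value field, that the request accepts pagination.page_after_value, and that band_structure_electronic and band_gap may come back as lists; please check these against the current API documentation. The names iter_entries, first and to_row are hypothetical helpers, and the HTTP call is passed in as a function so the loop can be tried without network access.

```python
def iter_entries(post_json, query, include, page_size=1000):
    """Yield every search hit, requesting one page at a time.

    post_json(body) must send the body to the entries/query endpoint and
    return the parsed JSON response as a dict.
    """
    page_after_value = None
    while True:
        pagination = {'page_size': page_size}
        if page_after_value is not None:
            # Assumed cursor field for continuing after the previous page.
            pagination['page_after_value'] = page_after_value
        result = post_json({
            'query': query,
            'pagination': pagination,
            'required': {'include': include},
        })
        yield from result.get('data', [])
        # Assumed cursor field in the response; absent on the last page.
        page_after_value = result.get('pagination', {}).get('next_page_after_value')
        if page_after_value is None:
            break


def first(section):
    """Return the first item of a repeated section, or the section itself."""
    if isinstance(section, list):
        return section[0] if section else {}
    return section or {}


def to_row(entry):
    """Flatten one hit into a flat dict; missing fields become None."""
    results = entry.get('results', {})
    material = results.get('material', {})
    electronic = results.get('properties', {}).get('electronic', {})
    band_gap = first(first(electronic.get('band_structure_electronic')).get('band_gap'))
    return {
        'entry_id': entry.get('entry_id'),
        'formula': material.get('chemical_formula_hill'),
        'space_group': material.get('symmetry', {}).get('space_group_number'),
        'band_gap': band_gap.get('value'),
        'band_gap_type': band_gap.get('type'),
    }


# Possible usage with requests and pandas (not executed here):
#   import pandas as pd, requests
#   url = 'https://nomad-lab.eu/prod/v1/api/v1/entries/query'
#   post_json = lambda body: requests.post(url, json=body).json()
#   df = pd.DataFrame(to_row(e) for e in iter_entries(post_json, query, include))
```

Only the first band gap of each entry is kept here for simplicity; since the hits are individual calculations, you may still want to group or filter the rows per material afterwards.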