Downloading the DOS of a large number of materials from the Materials Project

Is it permitted to download a large amount of data from the Materials Project at once?
In other words, can I download the DOS of at least 5000 materials?

I can use the API code:

with MPRester(api_key=MP_API_KEY) as mpr:
    DOS = mpr.get_dos_by_material_id(Material_id)

to download the DOS for a specific material, but repeating that 5000 times would be very tedious!

@Ragab_Abdelghany Retrieving DOSs on a material-by-material basis is still the recommended way. Feel free to loop through a list of material IDs that has been filtered through mpr.summary.search() with the has_props parameter. Please see Tips for Large Downloads - Materials Project Documentation.
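A minimal sketch of such a loop, for illustration only. The helper name download_dos and its pause_s parameter are hypothetical (not part of mp_api); the helper only assumes that the client exposes get_dos_by_material_id(), as in the snippet above. It records IDs that fail instead of aborting the whole run.

```python
import time


def download_dos(mpr, material_ids, pause_s=0.0):
    """Fetch one DOS per material ID and return (results, failed_ids).

    `mpr` is any object exposing get_dos_by_material_id(), e.g. an open
    MPRester. `pause_s` is an optional delay between requests (hypothetical
    knob, not an mp_api setting).
    """
    results = {}
    failed = []
    for mid in material_ids:
        key = str(mid)  # works for plain strings and MPID-like objects
        try:
            dos = mpr.get_dos_by_material_id(key)
        except Exception:
            # Network error, missing DOS, etc.: remember the ID and move on
            failed.append(key)
            continue
        if dos is None:
            failed.append(key)
        else:
            results[key] = dos
        if pause_s:
            time.sleep(pause_s)
    return results, failed


# Typical use (needs a valid API key and network access):
# from mp_api.client import MPRester
# with MPRester(MP_API_KEY) as mpr:
#     docs = mpr.summary.search(has_props=["dos"], fields=["material_id"])
#     dos_by_id, failed = download_dos(mpr, [d.material_id for d in docs[:5000]])
```

Returning the failed IDs separately makes it easy to retry only those materials later.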

Alternatively, you can retrieve the DOSs from our AWS OpenData repository directly:


Dear @tschaume,
Thank you for your reply and for sharing the AWS OpenData repository.
I wrote the following API code to loop through non-magnetic materials that have DOS:

from mp_api.client import MPRester
from pymatgen.electronic_structure.core import Spin 
from pymatgen.electronic_structure.plotter import DosPlotter
import matplotlib.pyplot as plt
import numpy as np

# Be sure that you have an API key:
MP_API_KEY="###################"



with MPRester(MP_API_KEY) as mpr:
    Mate_List = mpr.summary.search(fields=["material_id","formula_pretty","chemsys","is_metal","is_magnetic","total_magnetization","dos"])



# Initialize empty lists for the three categories
Mate_without_dos = []
Mate_with_dos_mag = []
Mate_with_dos_NoMag = []



# Divide the whole material list into three categories:
for doc in Mate_List:
    if doc.dos is None:
        Mate_without_dos.append(doc.material_id)
    elif doc.is_magnetic:
        Mate_with_dos_mag.append(doc.material_id)
    else:
        Mate_with_dos_NoMag.append(doc.material_id)


#####
#len(Mate_List) == len(Mate_without_dos) + len(Mate_with_dos_NoMag) + len(Mate_with_dos_mag)
#>> True



## Getting DOS of the non-magnetic materials



# Initialize the Materials Project API client
with MPRester(MP_API_KEY) as mpr:

    # Loop over the first 5000 non-magnetic materials that have a DOS
    # (use range(len(Mate_with_dos_NoMag)) to process all of them)
    for i in range(0,5000):
        material_ID = Mate_with_dos_NoMag[i]
        material_id = str(material_ID)  # plain string form of the ID, e.g. "mp-149"
 
        # Get the DOS data for the material
        dos_data = mpr.get_dos_by_material_id(material_id)

        # Get chemical_formula of the material
        chemical_formula = mpr.materials.search(material_ids=[material_id], fields=["formula_pretty"])
        Formula = chemical_formula[0].formula_pretty
    
        # Access DOS properties
        energies = dos_data.energies
        dos = dos_data.get_densities()  # Get the total DOS
        fermi_energy = dos_data.efermi  # Fermi energy
    
        # Specify the output file name
        output_file = f"{Formula}_{material_ID}.txt"
    
        # Write the DOS data to the output file
        with open(output_file, 'w') as file:
            # Write the Fermi energy as a comment on the first line
            file.write("#  E (eV)                DOS          Fermi Energy = {:.3f}\n".format(fermi_energy))
    
            # Write the energy and DOS data in columns
            for energy, dos_value in zip(energies, dos):
                file.write("{:.3f}             {:.4e}\n".format(energy, dos_value))
        print(f"DOS data saved to {output_file}")


I think I chose a long way to get the data, but it works well for me.

I have another question:
Is there a way to know which ab initio method (DFT, DFT+U, GW, …) was used to compute the data?

thanks,

Ragab.

You should be able to replace most of your code with a single query, something like the following (not checked for typos or bugs). It uses the internal _search method because is_magnetic is not yet available in search; we will make it available in the next release of the client.

mpr.summary._search(
    is_magnetic=True,
    has_props=["dos"],
    fields=[
        "material_id", "formula_pretty", "chemsys", "is_metal",
        "is_magnetic", "total_magnetization", "dos"
    ]
)

You also don’t need the second call to mpr.materials.search() to retrieve the formula for each material separately, which is slow and inefficient: the formula is already in the results of your first search. The same goes for get_dos_by_material_id().
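As a rough sketch of reusing the first search's results, the formulas can be collected into a dictionary once and looked up when building file names. The helper names formula_lookup and output_name are hypothetical, and the documents below are stand-ins for the objects returned by the search (only the material_id and formula_pretty fields are assumed).

```python
from types import SimpleNamespace


def formula_lookup(docs):
    """Map material ID (as a string) to formula_pretty from search results."""
    return {str(doc.material_id): doc.formula_pretty for doc in docs}


def output_name(material_id, formulas):
    """Build the same '<formula>_<id>.txt' name used in the script above."""
    return f"{formulas[str(material_id)]}_{material_id}.txt"


# Stand-in documents; real ones would come from the summary search
docs = [
    SimpleNamespace(material_id="mp-149", formula_pretty="Si"),
    SimpleNamespace(material_id="mp-2534", formula_pretty="GaAs"),
]
formulas = formula_lookup(docs)
print(output_name("mp-149", formulas))  # prints "Si_mp-149.txt"
```

This replaces one extra request per material with a single in-memory lookup.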

As for the calculation type used, include the calc_types field in your search and take a look at what it returns.
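A small sketch of how the returned value could be inspected, under the assumption that calc_types behaves like a mapping from task ID to a calculation-type label. That shape, the helper name summarize_calc_types, and the task IDs and labels below are all illustrative assumptions, so check them against what your search actually returns.

```python
from collections import Counter


def summarize_calc_types(calc_types):
    """Count how often each calculation-type label occurs.

    Assumes `calc_types` is a mapping of task ID -> label (an assumption).
    """
    return Counter(str(label) for label in calc_types.values())


# Hypothetical stand-in for one document's calc_types field
example = {
    "task-1": "GGA Structure Optimization",
    "task-2": "GGA Static",
    "task-3": "GGA+U Static",
    "task-4": "GGA Static",
}
for label, count in summarize_calc_types(example).most_common():
    print(f"{label}: {count}")
```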

Tagging @munrojm for additional details / corrections if needed.

PS: It looks like you should also be able to query via total_magnetization_min to identify magnetic materials.
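If the documents are already downloaded, the same idea can be applied locally by thresholding total_magnetization. This is only a sketch: the helper name split_by_magnetization and the tolerance tol are hypothetical choices, and documents without a magnetization value are treated as non-magnetic here, which is an assumption.

```python
from types import SimpleNamespace


def split_by_magnetization(docs, tol=1e-3):
    """Return (magnetic_ids, non_magnetic_ids) based on |total_magnetization|.

    `tol` is an arbitrary cutoff; a missing value (None) is treated as
    non-magnetic, which is an assumption of this sketch.
    """
    magnetic, non_magnetic = [], []
    for doc in docs:
        m = doc.total_magnetization
        if m is not None and abs(m) > tol:
            magnetic.append(str(doc.material_id))
        else:
            non_magnetic.append(str(doc.material_id))
    return magnetic, non_magnetic


# Stand-in documents with made-up IDs and values
docs = [
    SimpleNamespace(material_id="id-a", total_magnetization=2.2),
    SimpleNamespace(material_id="id-b", total_magnetization=0.0),
    SimpleNamespace(material_id="id-c", total_magnetization=None),
]
magnetic, non_magnetic = split_by_magnetization(docs)
print(magnetic, non_magnetic)
```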

Thread closed due to inactivity, please open a new thread to address related issues.