Get Cohesive energy from Materials Project

Hello.
I’m trying to extract cohesive energy from the materials project database by using API.
I tried reading answers to similar questions and following those methods, but unfortunately, I couldn’t succeed. It’s because the pymatgen was updated to the latest version, and the get_cohesive_energy function that used to be in the old version has disappeared.

Cohesive energy of Albite from Materials Project for LAMMPS comparison - Materials Project / Materials Project Data/API - Materials Science Community Discourse (matsci.org)

Currently, with the materials project being renewed, it seems that using the existing pymatgen.ext.matproj for get_cohesive_energy isn’t possible. I’ve actually tried several times, and I keep running into the following error:

cohesive = mpr.get_cohesive_energy(self, ‘mp-1206874’, per_atom=True)
^^^^^^^^^^^^^^^^^^^^^^^
File “/home/{my_folder_name}/anaconda3/lib/python3.11/site-packages/mp_api/client/mprester.py”, line 399, in getattr
raise AttributeError(
AttributeError: ‘MPRester’ object has no attribute ‘get_cohesive_energy’

It seems that the newest version of pymatgen (24.10.3) no longer have the get_cohesive_~ part.

Is there any other way to retrieve the cohesive energy? I think there’s also a method using entry (pulling total energy and dividing by the number of atoms), but this wouldn’t give an accurate calculation for isolated atoms, making the reliability of the results questionable.

I’ve finished calculating the isolated energy for each atom, but there’s a big problem. There’s a huge difference between the total energy from the Materials Project data calculated with VASP and my data calculated with Quantum ESPRESSO. I found out that the total energy can change depending on the calculation method, so it doesn’t have physical significance, but I still don’t understand how this difference occurs. Also, if I can’t directly calculate the difference in total energy, I have no idea how to compare the calculated values between VASP and QE.

Please let me know if there’s a good method available. I really debated whether to call back this topic for several days and tried various things, but it’s just not working out as I hoped…

One sure way to handle this would be to downgrade pymatgen version which contain that codes, but unfortunately, due to some errors, downgrading is currently not an option.

[Addtional]
Can anyone explain the options for predicted stable and Energy_above_hull? From what I understand, to be predicted stable, the energy above hull should be 0. I’ve checked materials like Co, Ru, and Cu, and it turns out that only those with an energy above hull of 0 are considered predicted stable. I’m curious if there’s a specific reason for separating these two options.

Best regards.

Code that I used.

import os
from mp_api.client import MPRester as MP_API_MPRester
from pymatgen.ext.matproj import MPRester as Pymatgen_MPRester

API_KEY = “”

with MP_API_MPRester(API_KEY) as mpr:
structures = mpr.materials.summary.search(
elements=[“Co”],
chemsys=“Co-*”,
energy_above_hull=(0, 0),
is_metal=True,
is_stable=True
)

with open(“2cohesive_test.txt”, “w”) as file:
file.write(f"Number of documents: {len(structures)}\n\n")

if not structures:  
    file.write("No results found.\n")  
else:  
    with Pymatgen_MPRester(API_KEY) as pmg_mpr:  
        for idx, structure_summary in enumerate(structures):  
            material_id = structure_summary.material_id  
            formula = structure_summary.formula_pretty  
            sym = structure_summary.symmetry  

            structure = pmg_mpr.get_structure_by_material_id(material_id)  
            primitive_cell = structure.get_primitive_structure()  

            cif_filename = f"{formula}.cif"  
            primitive_cell.to(filename=cif_filename)  

            try:  
                bulk_entry = pmg_mpr.get_entry_by_material_id(material_id, conventional_unit_cell=True)  

                if isinstance(bulk_entry, list):  
                    bulk_entry = min(bulk_entry, key=lambda e: e.energy_per_atom)  

                bulk_energy = bulk_entry.energy  

                isolated_atom_energies = {}  
                for element in bulk_entry.composition.elements:  
                    if element.symbol == 'O':  
                        o2_entries = pmg_mpr.get_entries('O2')  
                        if not o2_entries:  
                            raise ValueError("Unable to find O2 molecule entry.")  
                        o2_entry = min(o2_entries, key=lambda e: e.energy_per_atom)  
                        isolated_atom_energy = o2_entry.energy_per_atom  
                        isolated_atom_energies[element.symbol] = isolated_atom_energy  
                    elif element.symbol == 'H':  
                        h2_entries = pmg_mpr.get_entries('H2')  
                        if not h2_entries:  
                            raise ValueError("Unable to find H2 molecule entry.")  
                        h2_entry = min(h2_entries, key=lambda e: e.energy_per_atom)  
                        isolated_atom_energy = h2_entry.energy_per_atom  
                        isolated_atom_energies[element.symbol] = isolated_atom_energy  
                    else:  
                        element_entries = pmg_mpr.get_entries_in_chemsys([element.symbol])  
                        bulk_element_entries = [  
                            e for e in element_entries  
                            if e.composition.reduced_formula == element.symbol  
                        ]  
                        if not bulk_element_entries:  
                            raise ValueError(f"Unable to find bulk element entry for element {element.symbol}")  
                        bulk_element_entry = min(bulk_element_entries, key=lambda e: e.energy_per_atom)  
                        isolated_atom_energies[element.symbol] = bulk_element_entry.energy_per_atom  

                total_isolated_energy = sum([  
                    bulk_entry.composition[el] * isolated_atom_energies[el.symbol]  
                    for el in bulk_entry.composition.elements  
                ])  
                cohesive_energy = total_isolated_energy - bulk_energy  
                cohesive_energy_per_atom = cohesive_energy / bulk_entry.composition.num_atoms  

            except Exception as e:  
                cohesive_energy_per_atom = None  
                print(f"Can't calculated {material_id} ({formula})'s cohesive energy: {e}")  

            file.write(f"--- Document {idx + 1} ---\n")  
            file.write(f"Material ID: {material_id}\n")  
            file.write(f"Chemical Formula: {formula}\n")  
            file.write(f"Symmetry: {sym}\n")  
            file.write(f"Primitive Structure:\n{primitive_cell}\n")  
            file.write("\n")  
            if cohesive_energy_per_atom is not None:  
                file.write(f"Cohesive Energy per atom: {cohesive_energy_per_atom:.2f} eV/atom\n\n")  
                file.write(f"Bulk Energy: {bulk_energy:.4f} eV\n")  
                file.write("Isolated Atom Energies (DFT calculations):\n")  
                for element_symbol, energy in isolated_atom_energies.items():  
                    file.write(f" - {element_symbol}: {energy:.4f} eV per atom\n")  
                file.write("\n")  
            else:  
                file.write("Can't calculate cohesive energy.\n\n")  
            file.write('-' * 40 + "\n")  
            file.write("\n\n")  

Hey @Minju,

  • You can still use the legacy pymatgen API and get_cohesive_energy function with an older API key (should be 16 characters rather than the 21 characters in the new API). You can check your old API key here
  • I pulled down the PBE GGA atomic energies from our legacy API here, units are eV/atom. Keep in mind that these energies cannot be used to calculate cohesive energies for solids with r2SCAN total energies on MP:
{'H': -1.11587797, 'He': 0.01038154, 'Li': -0.29720495, 'Be': -0.01764556, 'B': -0.26858985, 'C': -1.25998247, 'N': -3.11173655, 'O': -1.53343974, 'F': -0.50059069, 'Ne': -0.01160895, 'Na': -0.22867367, 'Mg': -0.09551372, 'Al': -0.21633877, 'Si': -0.8265492, 'P': -1.88794092, 'S': -0.89072159, 'Cl': -0.25877019, 'Ar': -0.04921952, 'K': -0.22735189, 'Ca': -0.09267815, 'Sc': -1.97271039, 'Ti': -2.20322453, 'V': -3.66390549, 'Cr': -5.59832953, 'Mn': -5.32596236, 'Fe': -3.41439808, 'Co': -1.9470445, 'Ni': -0.91354767, 'Cu': -0.60077969, 'Zn': -0.16418411, 'Ga': -0.32917553, 'Ge': -0.87843304, 'As': -1.68376491, 'Se': -0.76921751, 'Br': -0.22119763, 'Kr': -0.03314942, 'Rb': -0.18694104, 'Sr': -0.068011, 'Y': -2.17417537, 'Zr': -2.04279634, 'Nb': -3.13299031, 'Mo': -4.60247889, 'Tc': -3.36167381, 'Ru': -2.14870945, 'Rh': -1.35815518, 'Pd': -1.4765485, 'Ag': -0.33901873, 'Cd': -0.16683316, 'In': -0.35508443, 'Sn': -0.83605507, 'Sb': -1.41061736, 'Te': -0.65688183, 'I': -0.18905393, 'Xe': -0.00984012, 'Cs': -0.13555854, 'Ba': -0.03393307, 'La': -0.67461885, 'Ce': -1.18331004, 'Pr': -0.19451181, 'Nd': -0.20166733, 'Pm': -0.20603743, 'Sm': -0.21234603, 'Eu': -8.36550995, 'Gd': -10.27279609, 'Tb': -0.24654003, 'Dy': -0.24630703, 'Ho': -0.25218271, 'Er': -0.28255396, 'Tm': -0.25582842, 'Yb': -0.06421108, 'Lu': -0.26442917, 'Hf': -3.18577473, 'Ta': -3.42503432, 'W': -4.65706744, 'Re': -4.70114419, 'Os': -2.89756089, 'Ir': -1.62318678, 'Pt': -0.53597051, 'Au': -0.28808586, 'Hg': -0.12456359, 'Tl': -0.31762167, 'Pb': -0.77557173, 'Bi': -1.32632748, 'Ac': -0.16706447, 'Th': -0.75124164, 'U': -4.80121024, 'Np': -7.53922056, 'Pu': -10.71752857}

Energy differences calculated with the same pseudopotential are physically meaningful, so you’d just need to do the same calculation in VASP with the same pseudpotentials MP uses to get compatible total energies

Hope that helps, let me know if I missed anything

1 Like

Hi Aaron,
It seems that the API key I was using was indeed the one for the renewed materials project, which might be why that command wasn’t valid. I’ll have to give it another shot.

But I’m a bit confused about why calculating cohesive energy with that particular energy isn’t allowed. Is it because the legacy one used PBE GGA for the calculations, while the renewed MP is using the r2SCAN approach?

Thank you for your help.

There are still PBE calculations in the current database, so you can use the atomic energies to calculate cohesive energies for any PBE task in MP. Think of PBE and r2SCAN as different rules for calculating the total energy of a solid / molecule / atom - they will give you different total energies for the same system

1 Like

I understand that the PBE calculation results still exist in the current database. So, there are currently both the PBE and r2SCAN calculation results in the database, right?

If that’s the case, can you let me know which value the bulk energy data extracted via the entry command using the latest API is selected from, PBE or r2SCAN?

What’s confusing me right now is whether the get_cohesive_energy function in pymatgen can only run using the Legacy API.

You can still use the legacy pymatgen API and get_cohesive_energy function with an older API key (should be 16 characters rather than the 21 characters in the new API). You can check your old API key here

From what you mentioned earlier, it seems that using pymatgen to extract cohesive energy data only supports the Legacy version. (That’s because when I tried using the API from the renewed site, it seemed like pymatgen itself wasn’t working.)
In other words, if I obtain the bulk energy using the legacy API (with the entry command), I can find out the total energy value from the PBE calculation results(since there is only PBE data in the legacy site), and I should be able to get the cohesive energy of the material using get_cohesive_energy, whether it’s PBE or r2SCAN, right?

  • I pulled down the PBE GGA atomic energies from our legacy API here, units are eV/atom. Keep in mind that these energies cannot be used to calculate cohesive energies for solids with r2SCAN total energies on MP:

Oh, thinking back on it now, I probably misunderstood above part. What you probably meant to say is that I shouldn’t confuse the PBE calculation with the r2SCAN calculation.

Anyway, I’d appreciate it if you could let me know if you can run get_cohesive_energy using the latest API in the same way.

Thanks alot.
I’m sorry for rambling on.

So, there are currently both the PBE and r2SCAN calculation results in the database, right?

Right - there is also a mixing scheme to better align the PBE and r2SCAN hulls described in this paper.

If that’s the case, can you let me know which value the bulk energy data extracted via the entry command using the latest API is selected from, PBE or r2SCAN?

Yes, for example:

from mp_api.client import MPRester

with MPRester("your api key here") as mpr:
    si_entries = mpr.get_entries("mp-149")
for entry in si_entries:
    print(entry.data["task_id"], entry.data["run_type"]) 

should return (up to the order):

>>> mp-1947498 R2SCAN
>>> mp-1947498 R2SCAN
>>> mp-2291052 GGA

where GGA implies the PBE GGA.

What’s confusing me right now is whether the get_cohesive_energy function in pymatgen can only run using the Legacy API.

Right now, the get_cohesive_energy function only works for the legacy API. It will likely be added to the current/next-gen API.

should be able to get the cohesive energy of the material using get_cohesive_energy, whether it’s PBE or r2SCAN, right?

Not quite - you need atomic energies computed with r2SCAN to get r2SCAN cohesive energies. You can use the get_cohesive_energy function to get PBE cohesive energies, only. Don’t use the atomic energies I sent with r2SCAN total energies, the results you get would be meaningless.

Thanks for your response.
Everything that was confusing is now cleared up.

Generally, the data extracted using the get_cohesive_energy command would have been calculated using PBE GGA (since they said that only PBE GGA is used in the legacy version). So, it’s clear that the cohesive energies I’ve obtained represent the accurate cohesive energy of the materials.

However, when I actually use Quantum ESPRESSO to calculate the total energy for an **isolated atom and calculate the ***total energy of the primitive cell and compare the difference between the two, the resulting value doesn’t match the data obtained using the get_cohesive_energy command.

I really can’t understand why there’s such a difference. Could you please explain whether the data obtained through the cohesive energy is correct, or if there’s something wrong with my calculation process?

** isolated atom: placing one atom in a cubic cell within 15 x 15 x15 angstrom^3 cell
*** in the case of bulk, relaxation is completed

Here’s how I calculate cohesive energy.

Here’s some data extracted using actual code.

Cohesive.py

import os
from mp_api.client import MPRester as MP_API_MPRester
from pymatgen.ext.matproj import MPRester as Pymatgen_MPRester

API_KEY = " " # Use your own API key

API_KEY_Legacy = " " # Use your own legacy API key

Use the mp-api to fetch structural information

with MP_API_MPRester(API_KEY) as mpr:
structures = mpr.materials.summary.search(
elements=[“Co”],
chemsys=“Co-*”,
energy_above_hull=(0, 0),
is_metal=True,
is_stable=True
)

Open a file to save the results

with open(“Cohesive.txt”, “w”) as file:
file.write(f"Number of documents: {len(structures)}\n\n")

if not structures:  
    file.write("No results found.\n")  
else:  
    # Use pymatgen's MPRester to fetch additional data  
    with Pymatgen_MPRester(API_KEY_Legacy) as pmg_mpr:  
        for idx, structure_summary in enumerate(structures):  
            material_id = structure_summary.material_id       
            formula = structure_summary.formula_pretty        
            sym = structure_summary.symmetry                  

            # Fetch structural information of the material  
            structure = pmg_mpr.get_structure_by_material_id(material_id)  
            primitive_cell = structure.get_primitive_structure()           

            # Save primitive cell information in a CIF file  
            cif_filename = f"{formula}.cif"  
            primitive_cell.to(filename=cif_filename)        

            # Start cohesive energy calculation  
            try:  
                cohesive_energy_per_atom = pmg_mpr.get_cohesive_energy(material_id)  # Fetch cohesive energy  

            except Exception as e:  
                cohesive_energy_per_atom = None  
                print(f"Cannot calculate cohesive energy for {material_id} ({formula}): {e}")  

            # Save results to the file  
            file.write(f"--- Document {idx + 1} ---\n")  
            file.write(f"Material ID: {material_id}\n")  
            file.write(f"Chemical Formula: {formula}\n")  
            file.write(f"Symmetry: {sym}\n")  
            file.write(f"Primitive Structure:\n{primitive_cell}\n")  
            file.write("\n")   
            if cohesive_energy_per_atom is not None:  
                file.write(f"Cohesive Energy per atom: {cohesive_energy_per_atom:.2f} eV/atom\n\n")  
            else:  
                file.write("Cannot calculate cohesive energy.\n\n")  
            file.write('-' * 40 + "\n")  
            file.write("\n\n")  




Here are the results I got from using Quantum ESPRESSO for calculations. I used the PBE PAW type pseudopotential, which I downloaded from the library provided by QE.
PSlibrary - QUANTUMESPRESSO (quantum-espresso.org)
The Bulk total energy and Isolated energy are based on the total energy from the SCF calculation results.

Formula Real atom # of Co # of other aotm bulk total E [Ry] Isolated Co E [Ry] Isolated other E [Ry] Cohesive E [Ry] Cohesive E [eV/atom]
Co3Pt Co6Pt2 Co Pt 6 2 -3754.93016892 -376.2382828 -746.9051703 -3.6901317 6.27553022231278
Co3S4 Co6S8 Co S 6 8 -2791.06864209 -376.2382828 -66.06868109 -5.08949675 4.94590023455387
Co3Se4 Co6Se8 Co Se 6 8 -5475.019245 -376.2382828 -401.5298104 -5.35106527 5.20008878559614
Co9S8 Co S 9 8 -3921.360613 -376.2382828 -66.06868109 -6.66661938 5.33525627440601

Sorry for always leaving such long messages.

There will be differences between VASP and QE even if you use PAWs for both because they are not the same pseudopotential, and they are run with different self-consistent field settings.

But, that’s not the issue here. First: define cohesive energy per atom of a solid as:
E_\mathrm{coh} = \left(\sum_{\mathrm{atom} \in \mathrm{solid}} N_\mathrm{atom} \right)^{-1} \cdot \left( E_\mathrm{solid} - \sum_{\mathrm{atom} \in \mathrm{solid}} N_\mathrm{atom} E_\mathrm{atom} \right).
Then a negative cohesive energy says a solid is stable with respect to dissociation to gas phase (isolated atoms) and a positive cohesive energy indicates that a solid is unstable.

Your cohesive energies are all positive (unstable) and quite large, at least from the screenshots. Something is wrong there. When I calculated them using MP’s data, I find:

{'mp-865193': {'total_energy': -55.54866171,
  'total_energy_per_atom': -6.94358271375,
  'task_id': 'mp-1795800',
  'composition': {'Co': 6.0, 'Pt': 2.0},
  'cohesive_energy_per_atom': -5.34930671125},
 'mp-943': {'total_energy': -81.87787935,
  'total_energy_per_atom': -5.848419953571429,
  'task_id': 'mp-1420221',
  'composition': {'Co': 6.0, 'S': 8.0},
  'cohesive_energy_per_atom': -4.504988545000001},
 'mp-20456': {'total_energy': -74.65734002,
  'total_energy_per_atom': -5.332667144285715,
  'task_id': 'mp-1775477',
  'composition': {'Co': 6.0, 'Se': 8.0},
  'cohesive_energy_per_atom': -4.058666638571429},
 'mp-1513': {'total_energy': -105.21993485,
  'total_energy_per_atom': -6.189407932352942,
  'task_id': 'mp-1380892',
  'composition': {'Co': 9.0, 'S': 8.0},
  'cohesive_energy_per_atom': -4.739456566470587}}

which are closer to your QE values, up to a sign difference.

For reference, if atomic_energies is the dict of atomic energies I sent before, then this is how I got these cohesive energies from MP data:

from mp_api.client import MPRester

mpids = ["mp-865193","mp-943","mp-20456","mp-1513"]

total_energies = {mp_id: {"total_energy": 1e20, "total_energy_per_atom": 1e20, "task_id": None, "composition": None} for mp_id in mpids}
with MPRester() as mpr:
    thermo_docs = mpr.materials.thermo.search(mpids)
    for thermo_doc in thermo_docs:
        if (
            "GGA" in thermo_doc.entries
            and (toten_per_atom := thermo_doc.entries["GGA"].uncorrected_energy_per_atom) < total_energies[str(thermo_doc.material_id)]["total_energy_per_atom"]
        ):
            task_id = [prop for prop in thermo_doc.origins if prop.name == "energy"][0].task_id
            total_energies[str(thermo_doc.material_id)].update(
                total_energy_per_atom = toten_per_atom,
                total_energy = thermo_doc.entries["GGA"].uncorrected_energy,
                task_id = str(task_id),
                composition = thermo_doc.composition.as_dict()
            )

for mp_id, entry in total_energies.items():
    total_energies[mp_id]["cohesive_energy_per_atom"] = (
        entry["total_energy"]
        - sum(coeff * atomic_energies[ele] for ele, coeff in entry["composition"].items())
    )/sum(entry["composition"].values())
1 Like

There are no search results for the Materials Project for the IDs that come up when using the entry.data you provided. What’s the difference between mp-149, which represents a material ID, and a task_id? Is it a log about the calculation method, just like it says?

>>> mp-1947498 R2SCAN
>>> mp-1947498 R2SCAN
>>> mp-2291052 GGA

I’m not sure if extracting the total energy in this way is the right approach.

Code

Use the mp-api to fetch structural information

with MP_API_MPRester(API_KEY) as mpr:
structures = mpr.materials.summary.search(
elements=[“Co”],
chemsys=“Co”,
energy_above_hull=(0, 0),
is_metal=True,
is_stable=True
)

Open the file to save the results

with open(output_filename, “w”) as file:
file.write(f"Number of documents: {len(structures)}\n\n")

if not structures:  
    file.write("No results found.\n")  
else:  
    # Use pymatgen's MPRester to fetch additional data  
    with Pymatgen_MPRester(API_KEY_Legacy) as pmg_mpr:  
        for idx, structure_summary in enumerate(structures):  
            material_id = structure_summary.material_id  
            formula = structure_summary.formula_pretty  
            sym = structure_summary.symmetry  

            # Fetch the structural information of the material  
            structure = pmg_mpr.get_structure_by_material_id(material_id)  
            primitive_cell = structure.get_primitive_structure()  

            # Save primitive cell information to a CIF file  
            cif_filename = f"{formula}.cif"  
            primitive_cell.to(filename=cif_filename)  

            # Start calculating cohesive energy  
            try:  
                cohesive_energy_per_atom = pmg_mpr.get_cohesive_energy(material_id)  # Get cohesive energy  

            except Exception as e:  
                cohesive_energy_per_atom = None  
                print(f"Cannot calculate cohesive energy for {material_id} ({formula}): {e}")  

            try:  
                with MP_API_MPRester(API_KEY) as mpr_entries:  
                    entries = mpr_entries.get_entries(material_id)  
                    for entry in entries:  
                        print(entry.data["task_id"], entry.data["run_type"])  

                        energy = entry.energy  
                        run_type = entry.data.get("run_type", "N/A")  
            except Exception as e:  
                print(f"Error occurred while fetching entries for {material_id} ({formula}): {e}")  
                entries = []  

            # Save results to the file  
            file.write(f"--- Document {idx + 1} ---\n")  
            file.write(f"Material ID: {material_id}\n")  
            file.write(f"Chemical Formula: {formula}\n")  
            file.write(f"Symmetry: {sym}\n")  
            file.write(f"Primitive Structure:\n{primitive_cell}\n")  
            file.write("\n")  
            if cohesive_energy_per_atom is not None:  
                file.write(f"Cohesive Energy per atom: {cohesive_energy_per_atom:.2f} eV/atom\n\n")  
            else:  
                file.write("Cannot calculate cohesive energy.\n\n")  
            
            # Write total energy and run type  
            file.write(f"Total Energy: {energy:.2f} eV, Type: {run_type}\n\n")  
            file.write('-' * 40 + "\n")  
            file.write("\n\n")

Usually, the form I used involves bringing in all the materials using the Renew API (in the case of testing, only bringing in Co, but during actual use, sorting all materials that fit the options) and extracting the cohesive energy for each fetched material using the Legacy API.

            try:  
                with MP_API_MPRester(API_KEY) as mpr_entries:  
                    entries = mpr_entries.get_entries(material_id)  
                    for entry in entries:  
                        print(entry.data["task_id"], entry.data["run_type"])  

                        energy = entry.energy  
                        run_type = entry.data.get("run_type", "N/A")  
            except Exception as e:  
                print(f"Error occurred while fetching entries for {material_id} ({formula}): {e}")  
                entries = []  

I added the get_entries command you mentioned to fetch the task_id and run_type, but the task_id data extracted is from an unknown source.

Additionally, when saving data to a file, I used the energy variable to store the Bulk total energy with entry.energy, and used the entry.data.get("run_type") for outputting the run_type.
The value obtained in this way is displayed as follows, but is there a way to check if the bulk total energy has been calculated correctly? Of course, I understand that bulk total energy doesn’t have physical meaning, but since I’m currently using the same pseudopotential in both the QE system and the VASP system, I wanted to check because it seems like the total energy should be the same.

Total Energy: -13.23 eV, Type: R2SCAN

The need to distinguish whether the data is R2SCAN or GGA, is generally because I think QE and VASP’s PBE PAW calculation results should be compared among GGA data.
Of course, I totally trust your opinion that the pseudopotential differs due to the different self-consistent field settings of VASP and QE. From what I’ve seen before and what I know, both VASP and QE use the PBE PAW format. Should I understand that even if it’s the same PBE, the pseudopotential differs due to different self-consistent fields? And unfortunately, my advisor is also asking for an explanation of how the two are different… I don’t have enough knowledge to explain that yet.

Your response is really, really helpful for me. I can’t even imagine what kind of despair I’d be in without your help. I’m truly grateful for you dedicating your precious time like this.

image

Well, I’m not sure about the difference between the equation I attached and the equation you showed me.
It seems that
(multiplying the inverse of the sum of the number of atoms) x (the energy of the total solid minus the energy of the atoms constituting the whole solid) is the same as dividing the total number of atoms by the total energy of the bulk minus sum of all isolated atoms.
Was I thinking correctly?

Let’s go over how you calculated the cohesive energy one more time.

We create a dictionary stored in the total_energies variable for each data.

Understand (perhaps)

It seems that the thermo_docs variable stores the thermo energy of the materials. The if condition uses “and” to ensure that the GGA calculation results exist and the uncorrected energy from those results is smaller than the currently stored energy. The task_id is probably a code to find and store the task ID used for the energy calculation. I understand that the new values calculated above are updated in the total_energies[str~] part, and these values are then called and printed using the for loop.

I’m not really sure about the part where we calculate the cohesive_energy_per_atom. I don’t quite understand the part where we fetch the total energy and use the command sum(coeff * atomic_energies[ele] for ele, coeff in entry["composition"].items()). Here, is the coeff in coeff*atomic_energies[ele] the number of each atom, and does atomic_energies refer to the cohesive energy of each atom obtained using the previous Legacy API?

Hi @Minju, minor typo in what I sent before, corrected in the code snippets.

You cannot at this time use the same pseudopotential in QE and VASP. The VASP PAW pseudopotentials are in a format specific to VASP, are copyright protected, and cannot be read by QE. There are multiple ways to make a pseudopotential for the same element.

See this paper for a discussion.

The material IDs uniquely identify a material whereas a task ID identifies single calculations that build up properties in a material. For example, a band structure calculation could represent a single task.

Please see our documentation for more examples.

The definition I gave of cohesive energy is a standard solid-state definition that holds for arbitrary solids. Ashcroft and Mermin’s solid state physics textbook, or Kittel’s, are good references for this.

Your expression for cohesive energy assumes that every atom besides Co has the same total energy, which isn’t true.

1 Like

Hi Aaron Kaplan,

Hope you’re having a nice day.

I read you fixed a minor typo in the code above, so I took another look at it.
It seems like that the new code is now refreshing total_energy_per_atom instead of just total_energy.
I can’t quite grasp what difference this actually makes.
Is it because the part where uncorrected_energy_per_atom is fetched through GGA was missing, so you decided to update it?

Also, I’m not really sure why it’s assumed that all atoms except for Co have the same total energy in my equation.
I don’t doubt that the equation you provided is a standard definition used in the textbook you mentioned. However, I need to check if the expanded version of that equation matches with mine, for the sake of mathematical calculations.

I calculated the bulk’s total energy by actually computing the total energy for all the listed materials, and I also calculated the isolated energy for each atom separately. There might have been some confusion, but in reality, the isolated energy of the other atoms isn’t a constant value. Each atom, when combined with Co, has a different variable value.

If that’s the case, couldn’t it be interpreted in the same way as subtracting the sum of the energies of the constituent atoms from the total energy of the entire bulk?
Please feel free to let me know if my understanding is still off.

And also, I already read the documentation about API but hard to figure out what kind of options in the API call options.
The example only mentions get_phase_diagram as an option in mpr.materials.thermo, but your code shows that a search option is also available. And I can also set sorting filters (like material_id, is_stable, etc., in materials.summary) in materials.summary options. However, the site seems to offer limited information on these option lists. Is there no separate list available for these options?

Best regards,
Minju.

Elementary_atom_cohesive.py
import os  
from pymatgen.ext.matproj import MPRester  

API_KEY = ""  # Use your API key  
API_KEY_LEGACY = ""  # Use your API key  


if API_KEY_LEGACY is None:  
    print("Legacy API key is not set. Please set the environment variable 'MP_API_KEY_LEGACY'.")  
    exit()  

def main():  
    with MPRester(API_KEY_LEGACY) as mpr:  
        # Set search criteria for single element materials  
        criteria = {  
            "nelements": 1,  # Single element  
            "e_above_hull": {"$lte": 0},  # Thermodynamically stable materials  
        }  
        # Define required properties  
        properties = ["material_id", "pretty_formula"]  

        # Query the database  
        entries = mpr.query(criteria=criteria, properties=properties)  

        # Open a file to save the results  
        with open("elemental_cohesive_energies.txt", "w") as file:  
            file.write(f"Number of elemental materials: {len(entries)}\n\n")  
            if not entries:  
                file.write("No elemental materials found.\n")  
                return  

            for idx, entry in enumerate(entries):  
                material_id = entry["material_id"]  
                formula = entry["pretty_formula"]  

                try:  
                    # Calculate cohesive energy  
                    cohesive_energy = mpr.get_cohesive_energy(material_id)  

                    # Write results to the file  
                    file.write(f"--- Material {idx + 1} ---\n")  
                    file.write(f"Material ID: {material_id}\n")  
                    file.write(f"Formula: {formula}\n")  
                    file.write(f"Cohesive Energy: {cohesive_energy:.6f} eV/atom\n")  
                    file.write("-" * 40 + "\n")  
                except Exception as e:  
                    # Print error message and write to file if an error occurs  
                    error_message = f"Cannot retrieve cohesive energy for {material_id} ({formula}): {e}"  
                    print(error_message)  
                    file.write(f"--- Material {idx + 1} ---\n")  
                    file.write(f"Material ID: {material_id}\n")  
                    file.write(f"Formula: {formula}\n")  
                    file.write("Cohesive Energy: Error retrieving data\n")  
                    file.write("-" * 40 + "\n")  
                    continue  

    print("All elemental materials' cohesive energies have been saved to 'elemental_cohesive_energies.txt'.")  

if __name__ == "__main__":  
    main()

Why doesn’t the calculation with get_cohesive using the legacy API yield the same values as the data you provided?

Even in the case of Pa, it is not possible to calculate at all.

{‘H’: -1.11587797, ‘He’: 0.01038154, ‘Li’: -0.29720495, ‘Be’: -0.01764556, ‘B’: -0.26858985, ‘C’: -1.25998247, ‘N’: -3.11173655, ‘O’: -1.53343974, ‘F’: -0.50059069, ‘Ne’: -0.01160895, ‘Na’: -0.22867367, ‘Mg’: -0.09551372, ‘Al’: -0.21633877, ‘Si’: -0.8265492, ‘P’: -1.88794092, ‘S’: -0.89072159, ‘Cl’: -0.25877019, ‘Ar’: -0.04921952, ‘K’: -0.22735189, ‘Ca’: -0.09267815, ‘Sc’: -1.97271039, ‘Ti’: -2.20322453, ‘V’: -3.66390549, ‘Cr’: -5.59832953, ‘Mn’: -5.32596236, ‘Fe’: -3.41439808, ‘Co’: -1.9470445, ‘Ni’: -0.91354767, ‘Cu’: -0.60077969, ‘Zn’: -0.16418411, ‘Ga’: -0.32917553, ‘Ge’: -0.87843304, ‘As’: -1.68376491, ‘Se’: -0.76921751, ‘Br’: -0.22119763, ‘Kr’: -0.03314942, ‘Rb’: -0.18694104, ‘Sr’: -0.068011, ‘Y’: -2.17417537, ‘Zr’: -2.04279634, ‘Nb’: -3.13299031, ‘Mo’: -4.60247889, ‘Tc’: -3.36167381, ‘Ru’: -2.14870945, ‘Rh’: -1.35815518, ‘Pd’: -1.4765485, ‘Ag’: -0.33901873, ‘Cd’: -0.16683316, ‘In’: -0.35508443, ‘Sn’: -0.83605507, ‘Sb’: -1.41061736, ‘Te’: -0.65688183, ‘I’: -0.18905393, ‘Xe’: -0.00984012, ‘Cs’: -0.13555854, ‘Ba’: -0.03393307, ‘La’: -0.67461885, ‘Ce’: -1.18331004, ‘Pr’: -0.19451181, ‘Nd’: -0.20166733, ‘Pm’: -0.20603743, ‘Sm’: -0.21234603, ‘Eu’: -8.36550995, ‘Gd’: -10.27279609, ‘Tb’: -0.24654003, ‘Dy’: -0.24630703, ‘Ho’: -0.25218271, ‘Er’: -0.28255396, ‘Tm’: -0.25582842, ‘Yb’: -0.06421108, ‘Lu’: -0.26442917, ‘Hf’: -3.18577473, ‘Ta’: -3.42503432, ‘W’: -4.65706744, ‘Re’: -4.70114419, ‘Os’: -2.89756089, ‘Ir’: -1.62318678, ‘Pt’: -0.53597051, ‘Au’: -0.28808586, ‘Hg’: -0.12456359, ‘Tl’: -0.31762167, ‘Pb’: -0.77557173, ‘Bi’: -1.32632748, ‘Ac’: -0.16706447, ‘Th’: -0.75124164, ‘U’: -4.80121024, ‘Np’: -7.53922056, ‘Pu’: -10.71752857}

I want to extract this data you told me about myself, but the data I extracted seems to show really strange values.

My OUTPUT

Number of elemental materials: 95

— Material 1 —
Material ID: mp-862690
Formula: Ac
Cohesive Energy: 3.954111 eV/atom

— Material 2 —
Material ID: mp-989737
Formula: Ag
Cohesive Energy: 2.493510 eV/atom

— Material 3 —
Material ID: mp-134
Formula: Al
Cohesive Energy: 3.529237 eV/atom

— Material 4 —
Material ID: mp-23155
Formula: Ar
Cohesive Energy: 0.019589 eV/atom

— Material 5 —
Material ID: mp-11
Formula: As
Cohesive Energy: 2.974743 eV/atom

— Material 6 —
Material ID: mp-81
Formula: Au
Cohesive Energy: 2.985796 eV/atom

— Material 7 —
Material ID: mp-160
Formula: B
Cohesive Energy: 6.410801 eV/atom

— Material 8 —
Material ID: mp-122
Formula: Ba
Cohesive Energy: 1.885038 eV/atom

— Material 9 —
Material ID: mp-87
Formula: Be
Cohesive Energy: 3.721767 eV/atom

And the data you provided is missing Pa, so when I actually run Elementary_atom_cohesive.py, an error occurs saying it cannot calculate the cohesive energy for Pa (although the other calculated values are also strange). What is the problem with Pa?

Even if I try to directly extract cohesive energy using get_entries as below, an error occurs and it does not work.

from pymatgen.ext.matproj import MPRester  

# Set API key - fill in if necessary  
API_KEY = ""  
API_KEY_Legacy = ""  

# Initialize MPRester  
with MPRester(API_KEY_Legacy) as mpr:  
    # Search for single element (nelements=1) systems  
    structures = mpr.get_entries_in_chemsys(["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne",   
                                             "Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar",   
                                             "K", "Ca", "Sc", "Ti", "V", "Cr", "Mn", "Fe",   
                                             "Co", "Ni", "Cu", "Zn", "Ga", "Ge", "As", "Se",   
                                             "Br", "Kr", "Rb", "Sr", "Y", "Zr", "Nb", "Mo",   
                                             "Tc", "Ru", "Rh", "Pd", "Ag", "Cd", "In", "Sn",   
                                             "Sb", "Te", "I", "Xe", "Cs", "Ba", "La", "Ce",   
                                             "Pr", "Nd", "Pm", "Sm", "Eu", "Gd", "Tb", "Dy",   
                                             "Ho", "Er", "Tm", "Yb", "Lu", "Hf", "Ta", "W",   
                                             "Re", "Os", "Ir", "Pt", "Au", "Hg", "Tl", "Pb",   
                                             "Bi", "Po", "At", "Rn", "Fr", "Ra", "Ac", "Th",   
                                             "Pa", "U", "Np", "Pu"]  
    )  

with open("isolated.txt", "w") as file:  
    file.write(f"Number of documents: {len(structures)}\n\n")  

    # Initialize dictionary to store cohesive energies  
    cohesive_energies = {}  

    # Calculate cohesive energy for each structure using material ID  
    for entry in structures:  
        material_id = entry.entry_id  
        formula = entry.composition.reduced_formula  
        # Calculate cohesive energy and store in dictionary  
        cohesive_energy_per_atom = mpr.get_cohesive_energy(material_id, per_atom=True)  
        cohesive_energies[material_id] = (formula, cohesive_energy_per_atom)  

    # Output results  
    for idx, (material_id, (formula, energy)) in enumerate(cohesive_energies.items()):  
        print(f"Material ID: {material_id}, Formula: {formula}, Cohesive Energy: {energy:.2f} eV/atom")  

        file.write(f"--- Document {idx + 1} ---\n")  
        file.write(f"Material ID: {material_id}\n")  
        file.write(f"Chemical Formula: {formula}\n")  
        file.write(f"Cohesive Energy: {energy:.2f}\n")  
        file.write('-' * 40 + "\n")  
        file.write("\n\n")

/~/.local/lib/python3.11/site-packages/pymatgen/ext/matproj_legacy.py:168: UserWarning: You are using the legacy MPRester. This version of the MPRester will no longer be updated. To access the latest data with the new MPRester, obtain a new API key from https://materialsproject.org/api and consult the docs at https://docs.materialsproject.org/ for more information.
warnings.warn(
Killed

or

Traceback (most recent call last):
File “/~/MP_API/API/Cohesive/Test/Pa.py”, line 23, in
cohesive_energy_per_atom = mpr.get_cohesive_energy(material_id, per_atom=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/~/.local/lib/python3.11/site-packages/pymatgen/ext/matproj_legacy.py”, line 1284, in get_cohesive_energy
ent = self._make_request(f"/element/{el}/tasks/isolated_atom", mp_decode=False)[0]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

Hi, Aaron Kaplan.

I know that you are going through very busy days.
Moreover, I understand that my question might be quite bothersome.

However, I still do not know the identity of the code that can obtain the energy of the isolated atom.
Could you please let me know which code you used to plot those values?

Sincerely,

It seems like that the new code is now refreshing total_energy_per_atom instead of just total_energy.
I can’t quite grasp what difference this actually makes.

To find the lowest energy structure within a chemical space, you need to compare the energy per atom (intensive) rather than the total energy (extensive). Different computational cells can contain different numbers of atoms

If that’s the case, couldn’t it be interpreted in the same way as subtracting the sum of the energies of the constituent atoms from the total energy of the entire bulk?

Yes this is the equation I sent you. May have been a misunderstanding in our previous messages, glad this is resolved

And I can also set sorting filters (like material_id, is_stable, etc., in materials.summary) in materials.summary options. However, the site seems to offer limited information on these option lists. Is there no separate list available for these options?

Through the API, you can call MPRester().materials.summary.available_search_fields to see all available fields, however not all of them are queryable. Here’s the longer documentation which explains which fields can be queried. (That link should pull up the materials/summary endpoint, in case the page nav doesn’t work)

Why doesn’t the calculation with get_cohesive using the legacy API yield the same values as the data you provided?

The get_cohesive_energy function uses the PBE-GGA energies plus a correction fit to experimental data, the code I sent uses uncorrected energies. In this case, I would use the uncorrected bulk energies to calculate cohesive energy

What is the problem with Pa?

There’s probably no isolated atom calculation for Pa. The atomic energies dataset is an older static dataset that hasn’t been updated. If there’s enough community interest, we’ll probably revive this method / dataset in the new API