Inquiry Regarding Computational Parameter Settings in the MP Database

WoGaho · February 9, 2025, 3:17pm

Dear Materials Project Team,

Greetings!
I would like to inquire about the computational parameter settings (including EDIFF, KPOINTS, and MAGMOM) for the materials in the MP database. Could you kindly provide the specific standards for these settings? Ideally, I am looking for the exact formulas (if available) used to determine these parameters, as I have noticed that they vary across different materials.
Additionally, for certain elements, two pseudopotentials are considered. In this case, how do you select between them for different materials? Which version of the pseudopotential set (by year) is used?
We are currently planning to use the MP database to train a deep learning generative model. In order to evaluate the accuracy of the generated samples, it is essential to ensure that the VASP computational parameters/standards align with the database’s computational parameters/standards.
It would be greatly appreciated if you could provide the specific code used to create the INCAR, KPOINTS, and POTCAR files for the high-throughput calculations in the MP database.

We will ensure proper acknowledgment and citation. Thank you!

Aaron_Kaplan · February 10, 2025, 4:36pm

Hi @WoGaho, all of the input sets the Materials Project uses are defined in the pymatgen Python package, specifically pymatgen.io.vasp.sets. For data currently on MP, we use the MPRelaxSet for PBE / PBE+U (GGA / GGA+U) relaxations, and the MPScanRelaxSet for r²SCAN relaxations. (For single-point / static calculations, just change Relax → Static in the set name.)

You can find documentation about these legacy sets on our site. Please cite this paper when using data from MP.

We are currently in the process of recomputing the entirety of MP’s structures using the r²SCAN functional and an updated input set. To see an example of how we carefully select INCAR / k-point density / pseudopotential settings, see this discussion.

WoGaho · February 10, 2025, 5:30pm

@Aaron_Kaplan Thanks for your guidance! Your response is like a ray of sunshine, warming my research journey.

WoGaho · February 10, 2025, 5:55pm

@Aaron_Kaplan I suddenly have another small question. The material IDs obtained through the following API are all computed using PBE / PBE+U (Optimize: MPRelaxSet, Static: MPScanRelaxSet), correct?
```
from mp_api.client import MPRester

# Fetch summary documents for every material that has DOS data
with MPRester("API_KEY") as mpr:
    docs = mpr.materials.summary.search(
        has_props=["dos"],
        fields=[
            "material_id", "elements", "nsites", "composition",
            "formula_pretty", "formula_anonymous", "structure",
            "dos", "dos_energy_up", "dos_energy_down", "symmetry",
        ],
    )
```
If not, how can I identify which materials (currently on MP) are based on PBE / PBE+U, and which are based on r2SCAN?

Aaron_Kaplan · February 10, 2025, 6:43pm

> The material IDs obtained through the following API are all computed using PBE / PBE+U (Optimize: MPRelaxSet, Static: MPScanRelaxSet), correct?

Not quite - MPRelaxSet and MPScanRelaxSet are both used for optimizations, whereas MPStaticSet and MPScanStaticSet are both used for statics. The difference is that MPRelaxSet and MPStaticSet are used for PBE GGA and PBE+U calculations, whereas MPScanRelaxSet and MPScanStaticSet are used for r²SCAN calculations.
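The pairing described above can be written down as a small lookup table. This is only a sketch: the helper name `mp_input_set_name` and the string keys are hypothetical conveniences for illustration, while the four class names are the pymatgen sets named in this thread.

```python
# Map (functional family, calculation type) to the pymatgen input set name.
# The keys are hypothetical labels chosen for this sketch.
MP_INPUT_SETS = {
    ("PBE", "relax"): "MPRelaxSet",       # PBE GGA and PBE+U relaxations
    ("PBE", "static"): "MPStaticSet",     # PBE GGA and PBE+U statics
    ("r2SCAN", "relax"): "MPScanRelaxSet",
    ("r2SCAN", "static"): "MPScanStaticSet",
}


def mp_input_set_name(functional: str, calc_type: str) -> str:
    """Return the input set name for a functional and calculation type."""
    try:
        return MP_INPUT_SETS[(functional, calc_type)]
    except KeyError:
        raise ValueError(
            f"No input set known for {functional!r} / {calc_type!r}"
        ) from None


print(mp_input_set_name("PBE", "static"))    # MPStaticSet
print(mp_input_set_name("r2SCAN", "relax"))  # MPScanRelaxSet
```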

Originally, MP did only PBE and PBE+U calculations. Around 2022, we switched to performing r²SCAN calculations preferentially. That’s why both input sets are used.

> If not, how can I identify which materials (currently on MP) are based on PBE / PBE+U, and which are based on r2SCAN?

Does the answer here help with this?

WoGaho · February 10, 2025, 6:58pm

@Aaron_Kaplan Oh sorry, I got the names mixed up, ha ha. I understand now; what I originally meant to write was ‘(Optimize: MPRelaxSet, Static: MPStaticSet)’. Thanks for your patient response.

Yes, ‘Pulling structures only relaxed with PBE or only relaxed using r2SCAN - #2 by Aaron_Kaplan’ is the answer I was looking for. Thank you very much!