I need some help with how to properly fill in the above-mentioned section from a parser I’m writing.

The LOBSTER code calculates this, but I’m not sure how to fill the data into this section properly. The main issue is that number_of_lm_atom_projected_dos is not the same for all atoms (for example, for a W atom I have 5s 6s 5p_y 5p_z 5p_x 5d_xy 5d_yz 5d_z^2 5d_xz 5d_x^2-y^2, and for C only 2s 2p_y 2p_z 2p_x). However, several other quantities there assume it is the same for all atoms; for example, the shape of atom_projected_dos_values_lm is [number_of_lm_atom_projected_dos, number_of_spin_channels, number_of_atoms, number_of_atom_projected_dos_values].

I already came across this in the past when I did some work on the Wien2k parser. There one can calculate just selected lm components of selected atoms, so number_of_lm_atom_projected_dos might not even be consistent for the same atom type.

Right now I see just one option: define all of the projections for every atom. For example, here it would be (s, p_y, p_z, p_x, d_xy, d_yz, d_z^2, d_xz, d_x^2-y^2), leaving the ones which are not actually calculated at zero. This would be OK for LOBSTER because there all projections are calculated implicitly (it works with an atomic basis set), so the other projections would indeed be zero. This would not work for Wien2k, though, when only some of the projected DOS is calculated, because we just don’t know about the others.
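To make the zero-padding option concrete, here is a minimal sketch using NumPy. The helper name `pad_lm`, the label strings and the toy values are all made up for illustration; only the target shape [n_lm, n_spin, n_atoms, n_values] comes from the metainfo definition quoted above. Note that labels without a principal quantum number cannot tell 5s from 6s on W, so this is only a rough picture of the idea.

```python
import numpy as np

# Full set of (l, m) channels, in the order used in the post (assumed ordering).
ALL_LM = ["s", "p_y", "p_z", "p_x", "d_xy", "d_yz", "d_z^2", "d_xz", "d_x^2-y^2"]


def pad_lm(per_atom, n_spin, n_values):
    """Pack per-atom {lm_label: array[n_spin, n_values]} dicts into one
    dense array of shape [n_lm, n_spin, n_atoms, n_values], zero-filled."""
    out = np.zeros((len(ALL_LM), n_spin, len(per_atom), n_values))
    for i_atom, channels in enumerate(per_atom):
        for label, values in channels.items():
            out[ALL_LM.index(label), :, i_atom, :] = values
    return out


# Toy data: a W-like atom with all channels, a C-like atom with s and p only.
w_atom = {lm: np.full((1, 4), 2.0) for lm in ALL_LM}
c_atom = {lm: np.ones((1, 4)) for lm in ["s", "p_y", "p_z", "p_x"]}
dense = pad_lm([w_atom, c_atom], n_spin=1, n_values=4)
```

The d channels of the second atom stay at zero, which is only meaningful when the code really computed all projections; a zero for "not calculated" would be indistinguishable from a genuine zero.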

Any advice would be appreciated.

I would still appreciate some comments here, and I also have one further issue besides the ones described in the first post. It is not clear where the projected DOS should actually go: there is section_atom_projected_dos, but section_dos has all the same fields as section_atom_projected_dos as well. What is the difference?

Hi Pavel!

I’m not too familiar with how the projected versions work, but let’s try to figure this out. Maybe @ladinesa has additional input on this as well.

After looking at how our current metainfo works, I agree that it is not flexible enough to support these cases where the number of (l,m) projections changes per atom or species. Instead of storing zeros in these possibly large arrays, we should refactor both section_atom_projected_dos and section_species_projected_dos. To me the simplest option would be to specify a new section for each atom/species, in which you also define the atom index or the atomic number of the species. This way each new section can have a varying size for the actual data. Either way, you should definitely not use section_dos: it is reserved for the total DOS and, if I’m not completely mistaken, has some similar quantities because you can also project the total DOS onto different (l,m) channels.

Do you think this would work for LOBSTER and WIEN2K? How urgently do you need this feature? I think these quantities are very rarely filled in central nomad, so we can quite easily adapt them to a new layout once we agree on a suitable alternative.

Yeah, a new section for each atom/species (with a new entry specifying the atom index or the species atomic number) would work. I’ll try to make some patches.
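The one-section-per-atom layout discussed here could look roughly like the following sketch. The class and field names (`AtomProjectedDos`, `atom_index`, `lm_labels`, `values_lm`) are hypothetical and are not actual metainfo names; the point is only that each section carries its own channel list, so the leading axis can differ between atoms.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class AtomProjectedDos:
    """One section per atom; all field names are hypothetical."""
    atom_index: int        # index into the structure's atom list
    lm_labels: List[str]   # channels actually computed for this atom
    values_lm: np.ndarray  # shape [n_lm, n_spin, n_values]

    def __post_init__(self):
        # The leading axis must match this atom's own channel list.
        if self.values_lm.shape[0] != len(self.lm_labels):
            raise ValueError("values_lm and lm_labels disagree on n_lm")


n_spin, n_values = 1, 5
w_labels = ["5s", "6s", "5p_y", "5p_z", "5p_x",
            "5d_xy", "5d_yz", "5d_z^2", "5d_xz", "5d_x^2-y^2"]
c_labels = ["2s", "2p_y", "2p_z", "2p_x"]

# Placeholder zeros stand in for the parsed DOS values.
sections = [
    AtomProjectedDos(0, w_labels, np.zeros((len(w_labels), n_spin, n_values))),
    AtomProjectedDos(1, c_labels, np.zeros((len(c_labels), n_spin, n_values))),
]
```

With this layout a Wien2k-style calculation that only projects a few channels of a few atoms simply emits fewer and smaller sections, with no padding required.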

Just to get the terminology right: an atom is a single specific atom (for example one C atom), while a species is “all C atoms in the structure”. So do I get it right that the original intention was

  • section_atom_projected_dos: DOS lm-projections for individual atoms
  • section_species_projected_dos: DOS lm-projections for species (as in the sum for all C atoms, the sum for all O atoms, etc.)
  • section_dos/dos_*_lm: DOS lm-projections summed over all atoms

So technically section_species_projected_dos would be the sum of selected section_atom_projected_dos, and section_dos/dos_*_lm would be the sum over all section_species_projected_dos.
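The hierarchy described above (atom projections sum to species projections, which sum to the lm-projected total) can be sketched as follows. The function name `sum_projections` and the toy numbers are made up, and the sketch assumes every atom of the structure is present in the per-atom data.

```python
import numpy as np


def sum_projections(sections):
    """Sum a sequence of {lm_label: array} dicts channel by channel."""
    total = {}
    for channels in sections:
        for label, values in channels.items():
            total[label] = total.get(label, 0.0) + values
    return total


# Made-up per-atom projections, one (species, channels) pair per atom.
atoms = [
    ("C", {"s": np.array([1.0, 2.0]), "p_x": np.array([0.5, 0.5])}),
    ("C", {"s": np.array([1.0, 0.0]), "p_x": np.array([0.5, 1.5])}),
    ("O", {"s": np.array([3.0, 3.0])}),
]

# Group the atoms by species, then sum within each group.
by_species = {}
for symbol, channels in atoms:
    by_species.setdefault(symbol, []).append(channels)
species_dos = {sym: sum_projections(chs) for sym, chs in by_species.items()}

# The lm-projected total is the sum over all species.
total_dos = sum_projections(species_dos.values())
```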

Regarding section_dos, I got confused: if your comment is right and this should be the lm-projection summed over all atoms, then it seems the VASP parser is misusing it, and the shapes of the arrays are not correct either. For example, the shape of dos_values_lm in section_dos is [number_of_dos_lms, number_of_spin_channels, number_of_atoms, number_of_dos_values], so I believe the third dimension should definitely not be the number of atoms if this is supposed to be the lm-projected DOS integrated over all atoms…

Ok, great. Yes, your terminology is correct.

Note that if not all atoms are included in section_atom_projected_dos, or not all species are included in section_species_projected_dos, the summing does not work out.

Regarding section_dos: yes, you are right that the metainfo description for dos_values_lm (“Array containing the density (electronic-energy) of states values projected on the various spherical harmonics (integrated on all atoms)”) is not in line with the output shapes and its usage by the VASP parser. I have to check if there is a good reason for this, or if it should be fixed as well.

0001-Make-the-section_atom_projected_dos-more-generic.patch (3.0 KB)
0002-Make-the-section_species_projected_dos-more-generic.patch (4.0 KB)

So something like this would work? If so I can start hacking on the lobster support.

I did not touch the section_dos/*_lm stuff yet as that is used by the vasp parser.

What should be done is to have only one section_dos and create a section for each set of dos_values, be it atom-projected, species-projected or total. The shape of dos_values would then simply be [nspin, n_dos_values]. We should remove atom_projected and species_projected altogether and simply add dos_kind=(total, atom, species), dos_lm and dos_species. I think that the original intention of dos_values_lm was not to sum the atom contributions but rather to really hold atom/species projections. It is just that atom_projected and species_projected were introduced along the way instead of simply adding the dos_kind tag. I hope my understanding is correct.
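As a rough illustration of this single-section proposal, the sketch below models one record per set of DOS values. The fields dos_kind, dos_lm, dos_species and dos_values follow the names in the post; the class name `DosSection`, the `atom_index` field and the `select` helper are hypothetical additions for the example.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class DosSection:
    """One section per set of DOS values (sketch, not the real metainfo)."""
    dos_kind: str                      # "total", "atom" or "species"
    dos_values: np.ndarray             # shape [n_spin, n_dos_values]
    dos_lm: Optional[str] = None       # e.g. "2p_x"; None if not lm-resolved
    dos_species: Optional[str] = None  # e.g. "C"; None for the total DOS
    atom_index: Optional[int] = None   # hypothetical field for dos_kind="atom"


def select(sections, **criteria):
    """Return the sections whose attributes match all given criteria."""
    return [s for s in sections
            if all(getattr(s, key) == value for key, value in criteria.items())]


values = np.zeros((1, 5))  # placeholder data
sections = [
    DosSection("total", values),
    DosSection("species", values, dos_species="C"),
    DosSection("atom", values, dos_lm="2s", dos_species="C", atom_index=1),
    DosSection("atom", values, dos_lm="2p_x", dos_species="C", atom_index=1),
]

carbon_atom_channels = select(sections, dos_kind="atom", atom_index=1)
```

Every section has the same two-dimensional value shape, so a varying number of channels per atom turns into a varying number of sections rather than a ragged array.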

That would be fine for me as well…

Oh no, there is a problem: section_dos is also used by the phonon DOS, so dos_kind is actually electronic or vibrational. I am not so sure now whether we have to separate the electronic DOS and the phonon DOS.

BTW, another slight problem I’ve encountered is the units of dos_values. Shouldn’t it be energy^-1 (using the definition that integrating over the whole energy range gives the number of states in the unit cell)? It is currently unitless. Similarly for dos_values_normalized (energy^-1 length^-3).
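The unit argument can be checked numerically: if the DOS integrates over energy to a dimensionless number of states, its values must carry units of inverse energy. The sketch below uses three made-up eigenvalues broadened by normalized Gaussians; the energy grid, the broadening and the eV unit are arbitrary choices for the example.

```python
import numpy as np

energies = np.linspace(-10.0, 10.0, 2001)  # energy grid in eV
levels = np.array([-3.0, -1.0, 2.0])       # made-up eigenvalues in eV
sigma = 0.1                                # Gaussian broadening in eV

# Each level contributes a unit-area Gaussian, so the DOS has units of 1/eV.
dos = np.zeros_like(energies)
for level in levels:
    dos += (np.exp(-0.5 * ((energies - level) / sigma) ** 2)
            / (sigma * np.sqrt(2.0 * np.pi)))

# Trapezoidal integration: (1/eV) * eV gives a dimensionless state count.
n_states = np.sum(0.5 * (dos[1:] + dos[:-1]) * np.diff(energies))
```

Here `n_states` comes out at the number of levels (three) to within numerical precision, and rescaling the energy axis to another unit would rescale the DOS values by the inverse factor.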

Yes, the units should be added as well. We need to experiment a bit with different solutions and see how they work with different DFT codes. The search index also complicates things somewhat as we have to somehow easily be able to search for different types of dos (phonon, electronic total, electronic projected).

I would suggest that Pavel for now introduces a temporary fix that works for his installation, and we start refactoring the electronic structure metainfo for central NOMAD.

OK, I’ll just go with x_lobster_section_atom_projected_dos for now.

Another small question: are dos_energies_normalized and dos_values_normalized supposed to be populated by the parser or by some later normalizer? Specifically, the energies in the LOBSTER DOS output are already normalized so that 0 is at the Fermi level, so where should I put them?

I’ve spotted one more thing (for the dos_integrated_values and its only user, the vasp parser, where it is also unclear if the parser is incorrect or the definition needs fixing), see: Incorrect dos_integrated_values · Issue #11 · nomad-coe/nomad-parser-vasp · GitHub

LOBSTER also has this integrated DOS output (as its DOS format closely mirrors that of VASP).