On our NOMAD Oasis we have multiple custom parsers installed. I have two problems:
1. For every test upload I make, the words SinglePoint simulation are falsely added to the file entry's name, and also to the workflow name. Almost none of the parsers' code mentions the word SinglePoint at all; only one parser (the apbs parser) mentions SinglePoint, and I think only in a code comment. Why does NOMAD keep adding SinglePoint to every upload? How do I stop this?
2. Two parsers, the battery-parser and the graphene-parser, are interfering with each other. The interference shows up after I upload test files for the battery parser: under run.calculation.concentrations, if you select any of the molecules, the subsection that describes those molecules should be called Concentrations_Battery, but it is called Concentrations_Graphene instead, so somehow the battery parser is accessing the graphene parser. Also, if you click on the name of any of the molecules and activate definitions in the NOMAD GUI, you can see a description that goes something like ..."e" means already attached to graphene edge, which is a description from the graphene parser.
Also, the Files tab of the GUI clearly shows that the correct parser is being selected (battery for battery simulation uploads), which makes this problem so confusing to me.
Here is the link to our gitlab repo: Sign in · GitLab
If I have to grant you some access rights, tell me please.
For the battery parser, there are test files in the tests/data folder; the mainfile is input_battery.yml. All parsers work perfectly when tested locally with the nomad parse command.
We are very happy that you keep working with NOMAD and are even developing your own parsers. This is really an achievement.
Regarding your questions:
1. On the computational side of NOMAD, we have the following convention for defining method_name and workflow_name. Both names depend on which sections are defined, i.e., normalization takes care of setting method_name = 'dft' if there is a populated section run.method.dft, method_name = 'gw' if there exists run.method.gw (and so on). Something similar happens with workflow_name, but in that case we distinguish a basic workflow from pre-defined workflows (for example, GeometryOptimization). The basic workflow is called SinglePoint and is assigned whenever you populate a single run.calculation section, regardless of whether you define workflow2 in your archive.
The name SinglePoint simulation is then defined in a quantity called entry_name. The logic is a bit involved, but it can be summarized as: if workflow_name != 'SinglePoint' it takes the method_name, but if method_name does not exist, it still uses workflow_name. The name also includes the chemical formula and the program name. You can check out one of the latest entries in the central NOMAD archive to see in more detail what I mean.
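As a rough illustration of the summary above, the following Python sketch composes an entry name from the pieces mentioned. It is only one literal reading of that summary, not NOMAD's actual normalizer code: the function name sketch_entry_name, its parameters, and the order in which formula and program name are joined are all assumptions made for the example.

```python
def sketch_entry_name(workflow_name, method_name=None, formula=None, program_name=None):
    """Toy reading of the naming summary; NOT NOMAD's real implementation."""
    # For non-SinglePoint workflows the method name is used when it exists;
    # otherwise the workflow name is the fallback.
    if workflow_name != "SinglePoint" and method_name:
        tag = method_name
    else:
        tag = workflow_name
    # Assumption: formula and program name are prepended only when available.
    parts = [part for part in (formula, program_name, tag) if part]
    return " ".join(parts) + " simulation"


# A parser that fills a single run.calculation and nothing that
# normalization can turn into a method name, formula, or program name:
print(sketch_entry_name("SinglePoint"))  # SinglePoint simulation
```

Under this reading, an entry ends up named just SinglePoint simulation when the parser populates none of the sections from which the other parts of the name are derived.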
If you want, you can share with us how you would like to name entries, and we can take a more in-depth look and help improve the naming of entries, or at least give enough flexibility for your two parsers.
2. Without taking a look at the custom parsers, it is hard to know what is going on. Maybe you can give me temporary access to the KIT GitLab so I can take a look. A first idea of what might be causing trouble is how you identify the parser that has to be executed, i.e., how you run the MatchingParser. Did you define some sort of identifier in input_battery.yml to select one parser or the other? Again, this might not be the source of the issue, so I would have to take a look at the repo; let me know which info you need to invite me.
All the best,
Thanks for the quick reply. @JosePizarro Please send me your GitLab username or GitLab email address, so I can invite you.
I helped @fabian_li with error 2 and found something interesting; I am not sure why it is not being resolved in the front-end.
@fabian_li has two parser plugins, the graphene parser and battery_parser, and in both of them there is an MSection that ends up at the same path in the archive. These sections are named differently in the metainfo.py files of each parser, Concentrations_Graphene and Concentrations_Battery. They share the same path because they are added as extensions of Calculation like:
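# graphene parser, metainfo.py (inside its extension of Calculation)
m_def = Section(extends_base_section=True)
concentrations = SubSection(sub_section=Concentrations_Graphene.m_def, repeats=True)

# battery parser, metainfo.py (inside its extension of Calculation)
m_def = Section(extends_base_section=True)
concentrations = SubSection(sub_section=Concentrations_Battery.m_def, repeats=True)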
Thus, the problem is that both sections share the same path, archive.run.0.calculation.0.concentrations, and my guess is that the front-end has trouble resolving the m_def of the section; in the back-end everything works well. Is this expected, or could it be fixed somehow?
@fabian_li and I talked and agreed that the best practice is to define a single MSection, Concentrations, as a schema plugin and then use it in both parsers. As a side note, this can also be fixed by giving the sub-sections parser-specific names (for example, concentrations_battery), but I will go with the schema plugin option.
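To make the clash and the two fixes concrete, here is a small toy model in plain Python. It does not use NOMAD and does not claim to reproduce how the GUI resolves definitions; it only assumes a lookup keyed by archive path, where a later registration overwrites an earlier one. SectionDef, resolve_by_path, the description strings, and the sub-section name concentrations_graphene are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionDef:
    """Hypothetical stand-in for a section definition (not a NOMAD class)."""
    name: str
    description: str


def resolve_by_path(registrations):
    """Build a path -> definition lookup; a later registration wins."""
    lookup = {}
    for path, definition in registrations:
        lookup[path] = definition
    return lookup


battery = SectionDef("Concentrations_Battery", "battery concentrations")
graphene = SectionDef("Concentrations_Graphene", "graphene concentrations")

# Both plugins extend Calculation with a sub-section called "concentrations",
# so both definitions are registered under the same path.
clash = resolve_by_path([
    ("run.calculation.concentrations", battery),
    ("run.calculation.concentrations", graphene),
])
print(clash["run.calculation.concentrations"].name)  # Concentrations_Graphene

# Option 1: a single shared Concentrations section used by both parsers.
shared = SectionDef("Concentrations", "shared concentrations")
shared_lookup = resolve_by_path([
    ("run.calculation.concentrations", shared),
    ("run.calculation.concentrations", shared),
])

# Option 2: parser-specific sub-section names, so the paths no longer collide.
renamed_lookup = resolve_by_path([
    ("run.calculation.concentrations_battery", battery),
    ("run.calculation.concentrations_graphene", graphene),
])
print(sorted(renamed_lookup))
```

In the toy model, a battery entry looked up by path alone gets the graphene definition, which matches the symptom described in the thread. With either option, every path maps to exactly one definition.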