I have one more idea regarding the workflows which I wanted to discuss (considering the pending workflow changes/discussion).
A lot of what is in our Oasis is some series/workflow of calculations (energy-volume (EV) curves, energy-strain calculations, diffusion calculations, convergence tests, etc.). Most of the time no external software is used to generate the different structures/steps; users do this by hand, as the input preparation and output evaluation are usually straightforward. I have been thinking about trying to detect such cases and group them into a workflow.
The motivation is that when a user uploads such calculations right now, it is not clear that their intention was to do, for example, an energy-strain fit. One can see that they uploaded a bunch of calculations which all have the same composition and settings and slightly different structures, but the ultimate goal is not clear until one looks closer.
Let's take the most common case, a Birch-Murnaghan (BM) fit, as an example. In theory, it shouldn’t be too hard to detect such calculations. The folders for the calculations at different volumes will likely sit in the same upper-level folder, and their names usually stay consistent except for the numbers (vol0.99, vol1, vol1.01, …). The calculations will have the same input settings and the same number and types of atoms, and the atomic positions will also be quite similar (I don’t know whether there is a tool to judge the similarity of structures). Looking at the differences, one can usually guess whether this is, for example, a series for a BM fit, deformations for fitting a specific Cxy elastic tensor component, the stress-strain method, a diffusion pathway, etc. So the “parser” would detect such cases, create a new entry with the correct workflow to group the calculations together, and do the appropriate post-processing (here that would be a BM fit to calculate the bulk modulus, equilibrium volume, etc.). If we can later group the search entries by workflow (and make them searchable by the ultimate property obtained from the workflow), it should become much easier to find things.
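To make the folder-name part of the detection concrete, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions: `group_series`, `min_size` and the example paths are made up, it works on plain path strings rather than on anything NOMAD provides, and it ignores the checks on input settings and structures.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Matches integers and decimals such as "1", "0.99", "1.01".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def group_series(mainfiles, min_size=3):
    """Group calculation folders that share a parent folder and have the
    same folder name once the numbers in the name are masked out.

    `mainfiles` are paths to the main output files, one per calculation.
    Both this function and `min_size` are hypothetical, for illustration.
    """
    groups = defaultdict(list)
    for mainfile in mainfiles:
        calc_dir = PurePosixPath(mainfile).parent
        pattern = NUMBER.sub("#", calc_dir.name)
        if pattern == calc_dir.name:
            continue  # no number in the folder name, so not part of a series
        groups[(str(calc_dir.parent), pattern)].append(str(calc_dir))
    # Keep only groups large enough to plausibly be a series.
    return {
        key: sorted(dirs) for key, dirs in groups.items() if len(dirs) >= min_size
    }


if __name__ == "__main__":
    example = [
        "Cu/eos/vol0.99/OUTCAR",
        "Cu/eos/vol1/OUTCAR",
        "Cu/eos/vol1.01/OUTCAR",
        "Cu/relax/OUTCAR",
    ]
    print(group_series(example))
```

A group found this way would only be a candidate; the comparison of settings, composition and structures would still have to confirm it and decide which kind of workflow it is.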
Now the thing is, there is no mainfile in this case, and as I understand it, the 1 mainfile = 1 entry scheme is quite hardcoded in NOMAD right now.
An easier solution would be to instruct users to just put a blank, custom-named file like “nomad-workflow-parser-mainfile” in the folder containing the calculations, which would trigger the right parser, but I’m not sure I can expect this from my users.
I’ll be grateful for any comments on whether you think this could reasonably work or not.
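For the marker-file variant, the lookup itself would be simple. Again, this is just a sketch under my own assumptions: `find_marked_series` is a made-up helper that works on a plain list of paths in an upload, not an existing NOMAD function, and the only thing taken from the idea above is the marker file name.

```python
from pathlib import PurePosixPath

# File name taken from the proposal above; everything else is hypothetical.
MARKER_NAME = "nomad-workflow-parser-mainfile"


def find_marked_series(upload_paths):
    """Map each folder that contains the marker file to the calculation
    subfolders found directly below it."""
    paths = [PurePosixPath(p) for p in upload_paths]
    marked = {p.parent for p in paths if p.name == MARKER_NAME}
    series = {}
    for folder in marked:
        children = set()
        for p in paths:
            if folder in p.parents:
                rel = p.relative_to(folder)
                # Files directly in the marked folder (including the marker
                # itself) are skipped; only subfolders count as calculations.
                if len(rel.parts) > 1:
                    children.add(str(folder / rel.parts[0]))
        series[str(folder)] = sorted(children)
    return series


if __name__ == "__main__":
    example = [
        "Cu/eos/nomad-workflow-parser-mainfile",
        "Cu/eos/vol0.99/OUTCAR",
        "Cu/eos/vol1/OUTCAR",
        "Cu/relax/OUTCAR",
    ]
    print(find_marked_series(example))
```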