Hello,
First of all, I’m not sure whether this is the right subforum for this topic; please feel free to move it elsewhere if not.
In order to model desorption processes, we need to construct a truncated cluster expansion that captures the energetics of adsorption patterns on a surface. We concluded that MAPS would be a wonderful tool for this, treating the adsorption patterns as a binary alloy of hydrogen and vacancies.
We thus generated appropriate lat.in, ref_energy.in, and vasp.wrap files for our surface unit cell and invoked MAPS via "nohup maps -d -c0=0.0 -c1=0.49 -2d &". We then calculated a stock of adsorption energies for the structures proposed by MAPS, while systematically neglecting adsorption patterns with concentrations >= 0.5 by placing an error file in their respective subdirectories. We did this because, for now, we suspect only coverages < 0.5 to be relevant to our desorption model. We were surprised to be confronted with structure proposals outside of the ground-state checking range, but considered it safe to neglect them in that way. (By now, however, we do wonder whether this might have caused side effects we missed?)
Proceeding this way left us in the following rather peculiar state.
According to the MAPS stdout redirected to nohup.out, the vast majority of cluster expansion truncations yielded a CV score of "3.40282e+38" (MAXFLOAT). We traced this down to the function "Real calc_cv(const Array2d<Real> &x, const Array<Real> &y)" implemented in atat/src/lstsqr.c++, which returns MAXFLOAT if it finds the denominator for some structure i (given the current weight vector) to be too close to zero.
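For context on why that denominator can vanish at all: if calc_cv uses the standard leave-one-out shortcut (which we assume, but have not verified in detail), the denominator for structure i is 1 - h_ii, where h_ii = x_iᵀ(XᵀX)⁻¹x_i is the leverage of that structure in the least-squares fit. It becomes exactly zero when a structure has leverage 1, i.e. when it alone determines some fitted direction, so that its left-out prediction is undefined. A minimal, self-contained sketch (plain Python, not ATAT code; the tiny 2-feature correlation matrix is a made-up illustration):

```python
# Rows are structures, columns are cluster correlations (here: a constant
# term plus one cluster). The last structure is the ONLY one with a nonzero
# second correlation, so the fit must pass through it exactly -> leverage 1.
X = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]

def xtx(X):
    n = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]

def inv2(M):
    # explicit inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def leverage(X, i):
    # h_ii = x_i^T (X^T X)^{-1} x_i
    A = inv2(xtx(X))
    xi = X[i]
    return sum(xi[p] * A[p][q] * xi[q] for p in range(2) for q in range(2))

print(leverage(X, 3))  # 1.0 -> LOOCV denominator 1 - h_ii is exactly 0
print(leverage(X, 0))  # well below 1 -> finite leave-one-out residual
```

If something analogous happens for the offending structures (some correlation pattern they alone pin down, given the current weights), that would explain a zero denominator without any unusually large DFT residual.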
This left us puzzled, since in this way the vast majority of truncations are discarded as having very poor predictive power, completely independently of the actual differences between the DFT energies and the energies predicted by those truncations.
We therefore added some diagnostic code to MAPS in order to find out which particular structures cause the CV to explode. After deploying the patched MAPS and exhaustively removing all of the "offending" structures, we finally arrived at a state without any MAXFLOAT CVs.
Those structures, however, do not seem to have any obvious features in common, e.g. particularly large relaxations compared to the other, non-offending structures.
We would therefore kindly ask how we could verify that the "naive" O(n^2) expression for the CV (MAPS paper, page 5, first expression) and the O(n) version that is actually implemented (page 5, second expression) really are equivalent; this might help us find out whether there is a deeper reason why those particular structures cause the issue.
In short: could there be a deeper reason why those structures produce a zero denominator?
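For what it's worth, for ordinary (unweighted) least squares the leave-one-out identity e_(i) = e_i / (1 - h_ii), with leverage h_ii = x_iᵀ(XᵀX)⁻¹x_i, holds exactly, which is presumably what the O(n) expression exploits. Below is a small self-contained numerical check of that equivalence (plain Python with a made-up two-parameter model, not ATAT code), comparing the naive refit-without-structure-i residuals against the shortcut:

```python
import random

def fit(X, y):
    # solve the 2x2 normal equations (X^T X) c = X^T y via Cramer's rule
    a = sum(r[0] * r[0] for r in X); b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    p = sum(r[0] * yi for r, yi in zip(X, y))
    q = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    return [(p * d - q * b) / det, (a * q - b * p) / det]

def predict(row, coef):
    return row[0] * coef[0] + row[1] * coef[1]

random.seed(0)
X = [[1.0, random.random()] for _ in range(8)]
y = [predict(r, [0.3, -1.2]) + 0.05 * random.gauss(0, 1) for r in X]

# naive O(n^2) version: refit without structure i, then predict its energy
naive = [y[i] - predict(X[i], fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:]))
         for i in range(len(X))]

# O(n) shortcut: e_i / (1 - h_ii) from the single full fit
coef = fit(X, y)
a = sum(r[0] * r[0] for r in X); b = sum(r[0] * r[1] for r in X)
d = sum(r[1] * r[1] for r in X); det = a * d - b * b
inv = [[d / det, -b / det], [-b / det, a / det]]  # (X^T X)^{-1}

def h(r):  # leverage h_ii of row r
    return sum(r[i] * inv[i][j] * r[j] for i in range(2) for j in range(2))

short = [(y[i] - predict(X[i], coef)) / (1 - h(X[i])) for i in range(len(X))]
assert all(abs(n_ - s_) < 1e-9 for n_, s_ in zip(naive, short))
```

In such a check the two versions agree to machine precision, so any discrepancy we see in MAPS would more likely stem from the weighting or from near-singular leverages than from the identity itself.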
Please find attached all the files necessary to reproduce the described behavior:
- the MAPS working directory "prob_repro/", including the "cv_tnt.log" file, which lists the blocks of offending structures removed in each run
- the altered src/refine.c++ and src/lstsqr.c++ files that introduce our diagnostic code into MAPS (based on the atat-3.04 toolkit)
Compiling and (un)installing should work just as for the regular atat-3.04 toolkit. If there are any further questions, I’ll gladly answer them.
Thank you very much in advance!