I am currently carrying out a cluster expansion (CE) of a bcc system. About 50 structures have been fully relaxed. However, when I run checkrelax, some of the reported values are larger than 0.1, for example:
0.0532 172/str_relax.out
0.0559 35/str_relax.out
0.0621 11/str_relax.out
0.0676 300/str_relax.out
0.0688 90/str_relax.out
0.0858 108/str_relax.out
0.0948 422/str_relax.out
0.0954 2141/str_relax.out
0.0976 173/str_relax.out
0.1087 97/str_relax.out
0.1094 44/str_relax.out
0.1170 944/str_relax.out
0.1212 40/str_relax.out
0.1212 98/str_relax.out
0.1418 109/str_relax.out
Should structures such as 40, 98 or 109 be included in the CE? Is the CV score influenced by these structures?
Structures that relax a lot are no longer really bcc, so it is advisable to exclude them from the fit of a bcc CE. The exact cutoff is not universal. I usually try to find a "gap" in the relaxation values (because that would correspond to the "hill" between two local energy minima). In your example 0.08 seems like a good cutoff.
BTW if all structures that relax too much are in the same concentration range, it may be better to simply exclude based on concentration.
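The gap heuristic described above can be sketched in a few lines of Python. This is only an illustration: the helper names parse_checkrelax and rank_gaps are made up for this post and are not part of ATAT, and the input is assumed to be checkrelax output in the "value directory/str_relax.out" form quoted in the question. The script ranks the gaps between consecutive relaxation values so you can judge which one makes a sensible cutoff; it does not pick one for you.

```python
# Sample data: the checkrelax values quoted in the question.
SAMPLE = """\
0.0532 172/str_relax.out
0.0559 35/str_relax.out
0.0621 11/str_relax.out
0.0676 300/str_relax.out
0.0688 90/str_relax.out
0.0858 108/str_relax.out
0.0948 422/str_relax.out
0.0954 2141/str_relax.out
0.0976 173/str_relax.out
0.1087 97/str_relax.out
0.1094 44/str_relax.out
0.1170 944/str_relax.out
0.1212 40/str_relax.out
0.1212 98/str_relax.out
0.1418 109/str_relax.out
"""


def parse_checkrelax(text):
    """Parse 'value dir/str_relax.out' lines into sorted (value, dir) pairs."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        entries.append((float(parts[0]), parts[1].split("/")[0]))
    return sorted(entries)


def rank_gaps(entries):
    """Return (lower, upper, gap, n_above) tuples, widest gap first.

    n_above is the number of structures that a cutoff placed inside
    the gap would exclude.
    """
    gaps = []
    for i in range(len(entries) - 1):
        lower = entries[i][0]
        upper = entries[i + 1][0]
        gaps.append((lower, upper, upper - lower, len(entries) - (i + 1)))
    return sorted(gaps, key=lambda g: g[2], reverse=True)


entries = parse_checkrelax(SAMPLE)
for lower, upper, gap, n_above in rank_gaps(entries)[:3]:
    midpoint = 0.5 * (lower + upper)
    print(f"gap {gap:.4f} between {lower:.4f} and {upper:.4f}: "
          f"cutoff near {midpoint:.3f} would exclude {n_above} structure(s)")
```

For the values quoted in the question, the widest gap only separates the single most distorted structure (109), and the second widest lies between 0.0688 and 0.0858, which is where the suggested cutoff of about 0.08 falls. Keep in mind that the quoted list is only the tail of the checkrelax output, so the full list should be inspected the same way.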
I have a related question. I have noticed in my calculations that most of the structures relax to somewhat larger volumes than given a priori in the input. In particular, when the concentration of a certain atomic species on a given sublattice increases, the c lattice vector expands, which leads to much higher relaxation values. It would seem possible to start the cluster expansion over again with a new "guess" of the c lattice vector so as to achieve smaller relaxation values for the respective structures. Of course I anticipate that other structures would then relax to smaller c lattice vectors. The relaxation values would then turn out negative in these cases. Do I assume correctly?
What I am trying to say is that, by doing this, the relaxation values will on average be more equally distributed among the structures, so my guess is that the cluster expansion will be more accurate.
The checkrelax is only sensitive to the nonisotropic component of the cell shape change. So plain volume change counts as "0", but c/a ratio changes would yield nonzero reported distortions.
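To illustrate the distinction between a plain volume change and a c/a change, here is a rough Python sketch. It is not the checkcell source and the number it prints need not match what checkrelax reports; it assumes a small-strain approximation, cells given as 3x3 lists with one lattice vector per row, and function names invented for this post. It computes the strain taking the unrelaxed cell to the relaxed one, subtracts the isotropic (volumetric) part, and returns the size of what is left.

```python
import math


def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("cell matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def matmul3(x, y):
    """Product of two 3x3 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def nonisotropic_strain(cell0, cell1):
    """Frobenius norm of the deviatoric small strain from cell0 to cell1.

    Rows are lattice vectors, so cell1 = cell0 * F and F = inv(cell0) * cell1.
    Symmetrizing F discards rotations (to first order); removing the trace
    discards the pure volume change.
    """
    f = matmul3(inverse3(cell0), cell1)
    strain = [[0.5 * (f[r][c] + f[c][r]) - (1.0 if r == c else 0.0)
               for c in range(3)] for r in range(3)]
    mean = (strain[0][0] + strain[1][1] + strain[2][2]) / 3.0
    dev = [[strain[r][c] - (mean if r == c else 0.0) for c in range(3)]
           for r in range(3)]
    return math.sqrt(sum(dev[r][c] ** 2 for r in range(3) for c in range(3)))


cubic = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
swollen = [[3.15, 0.0, 0.0], [0.0, 3.15, 0.0], [0.0, 0.0, 3.15]]  # +5% in all directions
stretched = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.3]]   # +10% along c only

print(nonisotropic_strain(cubic, swollen))    # essentially zero
print(nonisotropic_strain(cubic, stretched))  # clearly nonzero
```

In this sketch a uniform 5% expansion gives essentially zero, while a 10% stretch along c alone gives a clearly nonzero value, which is the qualitative behaviour described above.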
Note that the initial guess of the c/a ratio has no direct effect on maps or on the cluster expansion: it would only affect your decision regarding which structures have relaxed "too much", and that decision would affect the cluster expansion.
It is true that if the initial guess of the c/a ratio is poor, this may tend to inflate the reported distortion: if the symmetry of the lattice (lat.in) allows c/a change, then even large relaxations changing the c/a ratio may be harmless (it’s still the same lattice). Still, if there are very large changes in c/a ratio, then one has to be cautious, which is why the checkrelax script is set up this way.
If you dig into the checkrelax script, you will see that it calls the code checkcell. With the command checkcell -p you will get the full strain between str.out and str_relax.out, which may help you make a better decision on which structures to exclude.
BTW, don’t change the */str.out files: maps will likely not be able to read them back.
You can give alternate geometries in a str_hint.out file, if you want to give a better guess of the initial geometry for the ab initio code. This file is not read by checkrelax, however.