Cross-Validation Score using 58 structures

Dear Axel,
Following the suggestions in your reply to my previous mail, I used only those structures (58 in all) that gave values below 0.08 with ‘checkrelax’. In all, about 143 structures have been relaxed, of which I have used only 58 for the CE. The maps.log says
Among structures of known energy, true and predicted ground states agree
New ground states with at most 0 atoms/unit cell predicted , see predstr.out
Concentration range used for ground state checking: [0,1].
Crossvalidation score: 0.0464436

Even after using 58 structures, the CV score has not dropped as much as I would have liked. The fit.out shows
0.000000 0.000000 -0.015777 0.015777 1.000000 0
1.000000 0.000000 -0.005833 0.005833 1.000000 1
0.500000 0.373912 0.366556 0.007356 1.000000 3
0.333333 -0.039216 -0.042080 0.002864 1.000000 6
0.666667 0.131494 0.123255 0.008240 1.000000 7
0.333333 -0.102284 -0.101480 -0.000804 1.000000 8
0.666667 0.091199 0.088110 0.003089 1.000000 9
0.500000 0.015790 -0.008680 0.024470 1.000000 11
0.750000 0.091187 0.095591 -0.004404 1.000000 12
0.500000 -0.069199 -0.057440 -0.011759 1.000000 14
0.750000 0.098053 0.109002 -0.010949 1.000000 15
0.500000 -0.068468 -0.055483 -0.012984 1.000000 17
0.750000 0.077433 0.074583 0.002849 1.000000 18
0.750000 0.064057 0.062470 0.001588 1.000000 23
0.250000 0.039165 0.033036 0.006129 1.000000 24
0.750000 0.082362 0.077444 0.004917 1.000000 25
0.250000 0.169684 0.178508 -0.008823 1.000000 26
0.500000 0.141996 0.134748 0.007249 1.000000 27
0.750000 0.124536 0.134116 -0.009580 1.000000 28
0.400000 0.095105 0.108484 -0.013379 1.000000 36
0.600000 -0.045474 -0.058986 0.013511 1.000000 39
0.600000 -0.073746 -0.070365 -0.003380 1.000000 43
0.200000 -0.017057 -0.016564 -0.000494 1.000000 51
0.400000 -0.082240 -0.075631 -0.006609 1.000000 52
0.400000 0.106191 0.102796 0.003395 1.000000 53
0.600000 0.003458 0.003689 -0.000231 1.000000 54
0.800000 0.054340 0.048493 0.005847 1.000000 56
0.666667 -0.076276 -0.075707 -0.000568 1.000000 102
0.500000 0.002447 0.028307 -0.025860 1.000000 108
0.333333 -0.084273 -0.079262 -0.005011 1.000000 59
0.500000 0.049651 -0.000300 0.049951 1.000000 60
0.666667 -0.045529 -0.034939 -0.010590 1.000000 62
0.500000 0.057443 0.058393 -0.000950 1.000000 68
0.333333 -0.054139 -0.054272 0.000134 1.000000 82
0.500000 -0.083845 -0.063126 -0.020719 1.000000 84
0.333333 0.135506 0.136919 -0.001413 1.000000 90
0.333333 0.022221 0.022215 0.000007 1.000000 91
0.714286 -0.067161 -0.068727 0.001566 1.000000 148
0.714286 -0.057404 -0.062216 0.004812 1.000000 150
0.428571 0.187993 0.182219 0.005774 1.000000 172
0.142857 -0.063480 -0.063126 -0.000354 1.000000 185
0.428571 -0.043499 -0.035537 -0.007962 1.000000 188
0.571429 0.015091 0.026033 -0.010942 1.000000 189
0.285714 -0.063546 -0.067924 0.004378 1.000000 194
0.428571 0.043980 0.047317 -0.003337 1.000000 212
0.285714 0.143917 0.140968 0.002949 1.000000 213
0.285714 -0.056177 -0.074964 0.018787 1.000000 226
0.285714 -0.044452 -0.053282 0.008830 1.000000 229
0.571429 -0.085657 -0.089284 0.003627 1.000000 235
0.500000 -0.065324 -0.082131 0.016807 1.000000 300
0.500000 0.032633 0.010886 0.021747 1.000000 339
0.625000 0.060017 0.059442 0.000576 1.000000 425
0.750000 -0.005549 -0.010996 0.005447 1.000000 426
0.500000 -0.043169 0.013139 -0.056308 1.000000 440
0.500000 -0.017156 -0.023369 0.006213 1.000000 586
0.750000 -0.006618 0.008508 -0.015126 1.000000 591
0.555556 -0.073364 -0.079564 0.006200 1.000000 690
0.111111 -0.090591 -0.062214 -0.028377 1.000000 715
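
A rough way to see which structures dominate the fit error is to rank the fit.out rows by residual. The Python sketch below assumes the six columns are concentration, calculated energy, fitted energy, residual (calculated minus fitted), weight, and structure index, which is consistent with the rows above. The helper names parse_fit_out and worst_fits are made up for illustration, and only a few rows from the listing are reproduced.

```python
# Sketch only: the column meanings are an assumption inferred from the listing.
FIT_OUT_SAMPLE = """\
0.500000 0.015790 -0.008680 0.024470 1.000000 11
0.500000 0.002447 0.028307 -0.025860 1.000000 108
0.500000 0.049651 -0.000300 0.049951 1.000000 60
0.333333 0.022221 0.022215 0.000007 1.000000 91
0.500000 -0.043169 0.013139 -0.056308 1.000000 440
0.111111 -0.090591 -0.062214 -0.028377 1.000000 715
"""


def parse_fit_out(text):
    """Turn each six-column row into a dict; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # blank or malformed line
        conc, e_calc, e_fit, resid, weight = map(float, parts[:5])
        rows.append({
            "conc": conc,
            "e_calc": e_calc,
            "e_fit": e_fit,
            "resid": resid,
            "weight": weight,
            "index": parts[5],  # structure directory name, kept as a string
        })
    return rows


def worst_fits(rows, n=5):
    """Return the n rows with the largest absolute residual."""
    return sorted(rows, key=lambda r: abs(r["resid"]), reverse=True)[:n]


rows = parse_fit_out(FIT_OUT_SAMPLE)
for r in worst_fits(rows, 3):
    print(r["index"], r["conc"], r["resid"])
```

This ranks fit residuals only; it is not the cross-validation score reported in maps.log.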

Do you still recommend using more structures to reduce the CV score?
Moreover, in my case (bcc system), no compounds or intermetallics are known to exist according to the established phase diagram. However, gs.out shows
0.000000 0.000000 -0.015777 0
0.111111 -0.090591 -0.062214 715
0.333333 -0.102284 -0.101480 8
0.571429 -0.085657 -0.089284 235
0.666667 -0.076276 -0.075707 102
0.714286 -0.067161 -0.068727 148
1.000000 0.000000 -0.005833 1

I was expecting only 0 and 1 to be the ground states, with the CE connecting only those two structures. However, there are 5 other structures which are predicted to be ground states.
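
To make the ground-state question concrete, the structures that count as ground states are those on the lower convex hull of energy versus concentration. The Python sketch below assumes the first two gs.out columns are concentration and calculated energy and the last is the structure index; lower_hull is an illustrative helper, not part of ATAT. Structure 3 from fit.out is added as an example of a point above the hull.

```python
# Sketch only: (concentration, calculated energy, index) taken from the listings.
POINTS = [
    (0.000000, 0.000000, "0"),
    (0.111111, -0.090591, "715"),
    (0.333333, -0.102284, "8"),
    (0.500000, 0.373912, "3"),    # from fit.out, not a ground state
    (0.571429, -0.085657, "235"),
    (0.666667, -0.076276, "102"),
    (0.714286, -0.067161, "148"),
    (1.000000, 0.000000, "1"),
]


def lower_hull(points):
    """Return the points on the lower convex hull, ordered by concentration."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            x1, y1 = hull[-2][0], hull[-2][1]
            x2, y2 = hull[-1][0], hull[-1][1]
            # cross <= 0 means hull[-1] lies on or above the line hull[-2] -> p
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


ground_states = [idx for _, _, idx in lower_hull(POINTS)]
print(ground_states)
```

With the calculated energies listed in gs.out, every listed structure stays on the hull, so the intermediate ground states come from the calculated energies themselves and not only from the fitted ones.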

My queries are therefore

  1. What does gs.out suggest in a CE? Is my CE wrong?
  2. If newgs.out suggests more structures with the ‘bg’ tag, do we need to relax them for a more accurate CE? Will the CV score decrease?

Please suggest
Suddhasattwa

First, it is possible that the published phase diagram reports no ordered compounds because the ground states you find would disorder at low temperature.

The problems you are experiencing converging the cluster expansion have to do with large relaxations away from ideal bcc. In some systems, a "bcc" phase is, in reality, only bcc "on average", with the atoms spending most of their time in distorted geometries. Ti and W are typical examples.
It is very difficult to do a cluster expansion in those cases. Could that be the case here?

Dear Axel,
If that is the case here, I have already excluded the structures which relax too far. Using checkrelax, I get
0.0000 0/str_relax.out
0.0000 1/str_relax.out
0.0000 26/str_relax.out
0.0000 27/str_relax.out
0.0000 28/str_relax.out
0.0000 3/str_relax.out
0.0018 25/str_relax.out
0.0020 212/str_relax.out
0.0020 433/str_relax.out
0.0024 14/str_relax.out
0.0031 934/str_relax.out
0.0045 591/str_relax.out
0.0050 12/str_relax.out
0.0050 220/str_relax.out
0.0052 54/str_relax.out
0.0055 221/str_relax.out
0.0072 586/str_relax.out
0.0077 7/str_relax.out
0.0111 211/str_relax.out
0.0111 213/str_relax.out
0.0119 188/str_relax.out
0.0123 23/str_relax.out
0.0132 60/str_relax.out
0.0134 52/str_relax.out
0.0138 36/str_relax.out
0.0147 53/str_relax.out
0.0149 590/str_relax.out
0.0158 223/str_relax.out
0.0215 56/str_relax.out
0.0225 583/str_relax.out
0.0234 185/str_relax.out
0.0248 95/str_relax.out
0.0274 189/str_relax.out
0.0277 426/str_relax.out
0.0286 81/str_relax.out
0.0290 339/str_relax.out
0.0291 425/str_relax.out
0.0309 18/str_relax.out
0.0317 235/str_relax.out
0.0320 58/str_relax.out
0.0334 6/str_relax.out
0.0335 15/str_relax.out
0.0351 51/str_relax.out
0.0357 24/str_relax.out
0.0362 82/str_relax.out
0.0367 17/str_relax.out
0.0368 39/str_relax.out
0.0394 149/str_relax.out
0.0427 91/str_relax.out
0.0428 43/str_relax.out
0.0431 533/str_relax.out
0.0435 229/str_relax.out
0.0458 194/str_relax.out
0.0516 715/str_relax.out
0.0532 172/str_relax.out
0.0537 84/str_relax.out
0.0541 440/str_relax.out
0.0559 35/str_relax.out
0.0616 231/str_relax.out
0.0621 11/str_relax.out
0.0632 690/str_relax.out
0.0643 68/str_relax.out
0.0645 102/str_relax.out
0.0650 150/str_relax.out
0.0650 62/str_relax.out
0.0667 103/str_relax.out
0.0676 300/str_relax.out
0.0685 226/str_relax.out
0.0688 90/str_relax.out
0.0691 148/str_relax.out
0.0722 59/str_relax.out
0.0858 108/str_relax.out
0.0948 422/str_relax.out
0.0954 2141/str_relax.out
0.0976 173/str_relax.out
0.1078 140/str_relax.out
0.1087 97/str_relax.out
0.1094 44/str_relax.out
0.1110 291/str_relax.out
0.1116 673/str_relax.out
0.1138 2062/str_relax.out
0.1170 944/str_relax.out
0.1209 481/str_relax.out
0.1212 40/str_relax.out
0.1212 98/str_relax.out
0.1229 142/str_relax.out
0.1232 606/str_relax.out
0.1273 137/str_relax.out
0.1273 50/str_relax.out
0.1329 628/str_relax.out
0.1371 253/str_relax.out
0.1373 527/str_relax.out
0.1395 205/str_relax.out
0.1418 109/str_relax.out
0.1448 381/str_relax.out
0.1476 13/str_relax.out
0.1499 65/str_relax.out
0.1500 256/str_relax.out
0.1520 536/str_relax.out
0.1548 526/str_relax.out
0.1555 2190/str_relax.out
0.1569 129/str_relax.out
0.1596 237/str_relax.out
0.1637 42/str_relax.out
0.1703 122/str_relax.out
0.1734 2035/str_relax.out
0.1737 105/str_relax.out
0.1743 107/str_relax.out
0.1745 2/str_relax.out
0.1762 10/str_relax.out
0.1771 57/str_relax.out
0.1775 77/str_relax.out
0.1790 347/str_relax.out
0.1793 37/str_relax.out
0.1801 20/str_relax.out
0.1828 67/str_relax.out
0.1835 365/str_relax.out
0.1838 21/str_relax.out
0.1860 83/str_relax.out
0.1872 399/str_relax.out
0.1884 193/str_relax.out
0.1923 19/str_relax.out
0.1937 121/str_relax.out
0.1943 602/str_relax.out
0.1967 600/str_relax.out
0.1980 16/str_relax.out
0.1990 952/str_relax.out
0.2006 49/str_relax.out
0.2127 175/str_relax.out
0.2130 38/str_relax.out
0.2263 46/str_relax.out
0.2348 500/str_relax.out
0.2468 199/str_relax.out
0.2653 41/str_relax.out
0.2852 196/str_relax.out
0.2892 124/str_relax.out
0.2973 33/str_relax.out
0.3094 32/str_relax.out
0.3192 30/str_relax.out
0.3574 34/str_relax.out
0.3724 191/str_relax.out
0.3960 398/str_relax.out
0.4618 22/str_relax.out
0.4663 238/str_relax.out
0.5273 127/str_relax.out
0.5889 45/str_relax.out

I only use those structures which give values below 0.08. Moreover, I managed to exclude some more structures from the fit to get a CV score of 0.0331496, which is a little higher than the required CV of 0.025.
What is the solution now? predstr.out shows some more new structures either with ‘b’ or with ‘eg’ (which I excluded).
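
The 0.08 cutoff described above can be applied mechanically to the checkrelax listing. The Python sketch below assumes each line has the form "<distortion> <structure>/str_relax.out", as in the output shown; parse_checkrelax and split_by_threshold are hypothetical helper names, and only a few lines from the listing are reproduced.

```python
# Sketch only: the line format is assumed from the checkrelax listing above.
CHECKRELAX_SAMPLE = """\
0.0000 0/str_relax.out
0.0516 715/str_relax.out
0.0722 59/str_relax.out
0.0858 108/str_relax.out
0.1078 140/str_relax.out
"""


def parse_checkrelax(text):
    """Return (structure_name, distortion) pairs from checkrelax-style lines."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or malformed line
        name = parts[1].split("/")[0]  # directory part of "715/str_relax.out"
        entries.append((name, float(parts[0])))
    return entries


def split_by_threshold(entries, threshold=0.08):
    """Separate structures below the distortion threshold from the rest."""
    keep = [name for name, d in entries if d < threshold]
    drop = [name for name, d in entries if d >= threshold]
    return keep, drop


keep, drop = split_by_threshold(parse_checkrelax(CHECKRELAX_SAMPLE))
print("keep:", keep)
print("drop:", drop)
```

Comparing the keep list against the indices that appear in fit.out is a simple consistency check on which structures actually entered the fit.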

Secondly, considering that the bcc lattice does not remain bcc after full relaxation, can we do only volume+shape relaxation (ISIF=6) for the CE? Will it give a better-converged CE?

Thanks
Suddhasattwa

Hi Suddhasattwa,

It is possible to constrain the cell shape during relaxation by changing some code in main.F. Specifically, replace

    IF (DYN%ISIF<5) FACT=10*DYN%POTIM*EVTOJ/AMTOKG *1E-10_q

by

    IF (DYN%ISIF<5 .OR. DYN%ISIF==8) FACT=10*DYN%POTIM*EVTOJ/AMTOKG *1E-10_q

and replace

    IF (DYN%ISIF==7) THEN

by

    IF (DYN%ISIF==7 .OR. DYN%ISIF==8) THEN

After you recompile VASP, use ISIF=8 in INCAR. The volume and ionic positions are then relaxed, but not the cell shape, so the distortion value should be 0.

In case you decide to use this method, we would appreciate it if you could cite our paper:
R. Sun and A. van de Walle, "Automating impurity-enhanced antiphase boundary energy calculations from ab initio Monte Carlo," Calphad 53, 20 (2016).
https://dx.doi.org/10.1016/j.calphad.2016.02.005

The CV score need not be strictly less than 0.025.

Ruoshi