Capturing high-energy clusters

PatrickJTaylor · March 25, 2022, 2:04pm

Hi All,

I am currently studying a system in which the following regularly occurs: I generate a structure which contains several clusters which (by chemical intuition) I would expect to be quite high energy. However, during DFT relaxation, these clusters effectively disappear (i.e. the relevant ions migrate in such a way as to remove these clusters). This means that my final fits look good in terms of CV score, but ultimately produce unphysical Monte Carlo trajectories due to the presence of these high-energy clusters which have not been captured in my training set (and are thus associated with 0 ECIs).

I would very much appreciate any suggestions as to how this issue can (in general) be overcome. I should add that I am currently modelling a single composition (which contains sufficient vacancies to allow the ion migrations mentioned above) and would ideally like to resolve this problem without widening the compositional space (otherwise I will likely require a much larger training set).

magnusrahm · March 25, 2022, 3:15pm

Hi @PatrickJTaylor,

Just to make sure I understand your problem: are the ECIs for these clusters zero because they never occur in your training structures and are thus “removed by regularization”? Or do you train your cluster expansion on structures as they looked before relaxation but using the relaxed energy (meaning the energy doesn’t properly reflect the energy of the training structures)?

PatrickJTaylor · March 25, 2022, 3:48pm

Hi @magnusrahm - thank you for the prompt response.

Your former guess is the correct explanation, the ECIs are zero as they technically never occur in the training set.

magnusrahm · March 25, 2022, 5:24pm

Ok, that sounds tricky then. Some random thoughts and ideas that may or may not be helpful:

If one can be sure that they have too high energy to be relevant thermodynamically one could consider artifically setting that ECI to something very high such that they are always rejected in Monte Carlo. If the structures always relax away from that cluster with DFT I would bet they will basically never be encountered “IRL” anyway.
Alternatively, it is not too difficult to implement customized Monte Carlo moves in mchammer, which could let you stay away from certain configurations.
If you still want an actual meaningful energy, I think it might be somewhat questionable to do it with a lattice-based model (such as a cluster expansion) given the long relaxation distances.
If you’ve kept full DFT relaxation trajectories you could perhaps try to train a model based on initial configurations (which should contain the problematic cluster) and see what you get for that ECI.

PatrickJTaylor · March 29, 2022, 12:41pm

Thank you kindly for the suggestions - I really appreciate it.

As to your first point, I wonder if you could help me understand how to identify orbits in ICET? The output from [ClusterSpace object].get_coordinates_of_representative_cluster() seems enough for me to identify the sites included in a representative cluster, but not which species sit on those sites? Apologies if I’m simply missing something in the documentation.

magnusrahm · April 1, 2022, 7:19am

Happy to help!

The orbits are the geometric arrangements of sites, so an orbit doesn’t in itself come with certain occupation. This means an ECI is not a measure of, for example, the A-A, B-B or A-B interaction, but rather a difference thereof (this is equivalent to the Ising model where you have only one interaction parameter). In the case of alloys with three components or more, it’s a bit more complicated (a very brief review is included in the icet paper), but the principle is the same.

While writing this I realized that maybe you are rather after something like this:

from icet import ClusterSpace
from ase.build import bulk

# Create dummy ternary cluster space
prim = bulk('Au')
cs = ClusterSpace(prim, [5.0], ['Au', 'Ag', 'Cu'])
print(cs)

# Occupy a supercell
supercell = prim.repeat((3, 1, 1))
supercell[0].symbol = 'Ag'
supercell[1].symbol = 'Cu'

# Count and print the clusters in the supercell
counts = cs.orbit_list.get_cluster_counts(supercell,
                                          fractional_position_tolerance=1e-6)  # Anything small should work
for orbit_index, counts in counts.items():
    print(orbit_index, counts)

That snippet counts the occupations of the different clusters that constitute each orbit. I think the output is rather self-explanatory, otherwise let me know.

Jianli_Cheng · June 3, 2022, 9:17pm

I am getting this error when running the code snippet:
AttributeError: 'OrbitList' object has no attribute 'get_cluster_counts'
Maybe I am using the old icet? I am using 1.4.

magnusrahm · June 7, 2022, 11:28am

@Jianli_Cheng: Yes, please upgrade to the most recent version of icet to use this function. In icet 1.4 there was something similar, which I think was placed in a separate module (cluster_counts.py) but still I would strongly recommend more recent versions as they are also much faster for MC simulations.

Jianli_Cheng · June 7, 2022, 3:06pm

Got it. Thank you so much for your reply!