Hello all!
I am working on predicting stable perovskite phases. I am trying to use kernel ridge regression (KRR) to calculate the energies during Monte Carlo (MC) steps, instead of the typical linear model.
I previously emailed Prof. Erhart about this, who gave me a lot of useful starting points for modifying the code. After taking a look at it, I’ve been wondering whether ClusterExpansionCalculator.cpp needs to be modified at all, or whether modifying calculate_total() and calculate_change() in the ClusterExpansionCalculator.py file would be sufficient.
My understanding is that ClusterExpansionCalculator.cpp calculates the correlation vectors, and kernel ridge regression takes those same correlation vectors as input. The only difference is that the kernel ridge part then computes a Gram matrix from the correlation vectors (turning them into a matrix of dimension n_samples x n_samples).
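To make the shapes concrete, here is a minimal NumPy sketch of the idea, not the actual calculator code. It assumes an RBF kernel, random stand-in data in place of real correlation vectors and energies, and hypothetical names (`rbf_kernel`, `dual_coef`, the sizes). It writes out the textbook closed-form solution by hand rather than calling scikit-learn. During training the Gram matrix is n_samples x n_samples, while predicting a single new configuration only needs one kernel row against the training set.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel between the rows of A (n_a, d) and the rows of B (n_b, d)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


rng = np.random.default_rng(0)
n_train, n_features = 20, 8                        # placeholder sizes
X_train = rng.normal(size=(n_train, n_features))   # stand-in correlation vectors
y_train = rng.normal(size=n_train)                 # stand-in reference energies

# Training: the Gram matrix has shape (n_train, n_train).
alpha = 1e-3                                       # ridge regularization strength
K = rbf_kernel(X_train, X_train)
dual_coef = np.linalg.solve(K + alpha * np.eye(n_train), y_train)

# Prediction for one new configuration: a single kernel row of shape (1, n_train).
x_new = rng.normal(size=(1, n_features))
k_new = rbf_kernel(x_new, X_train)
energy = (k_new @ dual_coef)[0]
```

The correlation vector itself enters only as the input `x_new`, which is consistent with the idea that the part computing it would not need to change.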
Any advice clarifying this would be great. Thank you and I look forward to learning from you guys!
Yes, from a quick look at the code, I think changing the cpp file would not be necessary.
You can try to modify ClusterExpansionCalculator or write your own, and use it to predict energies for some structures. If they come out as expected, it should be fine.
Also, if you run a short MC simulation, you can check for a few snapshots in the trajectory that the recorded energy is as expected.
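A rough sketch of such a snapshot check, assuming you have already collected the recorded energies and the correlation vectors of the same snapshots as arrays. The helper name `check_recorded_energies` and the toy linear `predict` function are made up for illustration; in practice `predict` would be your fitted model's prediction function.

```python
import numpy as np


def check_recorded_energies(recorded, cluster_vectors, predict, atol=1e-6):
    """Return the indices of snapshots whose recorded energy differs
    from a direct prediction on the same correlation vector."""
    recomputed = np.asarray(predict(np.asarray(cluster_vectors)))
    mismatch = ~np.isclose(np.asarray(recorded), recomputed, atol=atol)
    return np.flatnonzero(mismatch)


# Toy stand-in for a fitted model (hypothetical weights).
weights = np.array([0.5, -1.0, 2.0])


def predict(X):
    return X @ weights


snapshot_cvs = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])
# The last recorded value is deliberately wrong (the model gives 1.5).
recorded = np.array([0.5, -0.5, 1.6])

bad = check_recorded_energies(recorded, snapshot_cvs, predict)
print(bad)  # indices of snapshots that disagree
```

Make sure both sides use the same units (for example per atom versus per supercell) before comparing.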
Hey Erik, thank you for the reply! I have modified the code and it now uses KRR to predict the energy (via scikit-learn’s .predict()). I believe other scikit-learn models can be used directly with the same modifications.
One drawback is that the simulation takes much longer, because the model is nonlinear and the total energy has to be calculated instead of the energy change.
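For illustration, here is a minimal sketch of how the energy change can be obtained from total energies with a nonlinear model. The class `CachedTotalEnergy` and the toy `predict` function are hypothetical and not part of any library. The energy of the current configuration is cached, so each trial move needs one model prediction instead of two. This assumes the correlation vector of the trial configuration is already available; computing it is not covered here.

```python
import numpy as np


class CachedTotalEnergy:
    """Hypothetical helper that caches the current total energy so that
    each trial move costs a single model prediction."""

    def __init__(self, predict, initial_cv):
        self._predict = predict
        self.energy = float(predict(initial_cv[None, :])[0])
        self._trial_energy = None

    def change(self, trial_cv):
        """Energy change for a trial move: E(trial) - E(current)."""
        self._trial_energy = float(self._predict(trial_cv[None, :])[0])
        return self._trial_energy - self.energy

    def accept(self):
        """Call after an accepted move to make the trial energy current."""
        self.energy = self._trial_energy


# Toy nonlinear stand-in for a fitted model's .predict().
def predict(X):
    return np.sum(np.tanh(X), axis=1)


cv_old = np.array([0.0, 1.0])
cv_new = np.array([1.0, 1.0])

tracker = CachedTotalEnergy(predict, cv_old)
dE = tracker.change(cv_new)   # one prediction per trial move
tracker.accept()              # the cached energy now refers to cv_new
```

A rejected move simply skips `accept()`, so the cached energy stays valid.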
Comparing an MC run with a linear model and one with the KRR model seems to show good agreement in the energies. This is for a system where the initial R² score of both models is ~0.99. I will continue to test it on the other types of compounds I have to ensure it works properly.