This is probably a short (maybe stupid) question, but I couldn't find an answer in the ICET documentation or the papers.
Do the properties that we feed in to train a cluster expansion need to be provided as extensive quantities? For example, if I want to train a CE on the total energy of a system, do I provide the total energy as calculated for each training structure? Or does ICET rely on normalized quantities, so that I first need to divide the total energy by the number of atoms in the respective training structure?
I ask because my first instinct was to use total energies or mixing energies as they are, so that they scale with the system size: double the system = double the number of clusters = double the predicted total energy. But when I use the total energies as-is, the RMSE looks really unreasonable, and I am not sure whether the problem lies in the fitting or in the data itself (because it is not normalized). With normalized energies the fit looks much better, but then the numbers are also much smaller, so the RMSE will be lower anyway and the two values are not directly comparable.
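
To illustrate why I don't think the two RMSE values can be compared directly, here is a minimal sketch with made-up numbers. It is plain NumPy, not ICET code, and the energies and atom counts are hypothetical, not my actual data. The same per-atom error of 0.01 gives a much larger RMSE once everything is multiplied by the number of atoms:

```python
import numpy as np


def rmse(pred, ref):
    """Root-mean-square error between two arrays."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


# Hypothetical training set: four structures of different size.
n_atoms = np.array([8, 16, 32, 64])

# Hypothetical reference energies per atom, and "predictions" that are
# off by exactly 0.01 per atom for every structure.
e_per_atom_ref = np.array([-3.50, -3.48, -3.52, -3.49])
e_per_atom_pred = e_per_atom_ref + np.array([0.01, -0.01, 0.01, -0.01])

# The corresponding extensive (total) energies.
e_total_ref = e_per_atom_ref * n_atoms
e_total_pred = e_per_atom_pred * n_atoms

rmse_per_atom = rmse(e_per_atom_pred, e_per_atom_ref)
rmse_total = rmse(e_total_pred, e_total_ref)

# Converting the total-energy predictions back to per-atom values
# recovers the per-atom RMSE, so this is the number to compare.
rmse_total_as_per_atom = rmse(e_total_pred / n_atoms, e_total_ref / n_atoms)

print(rmse_per_atom, rmse_total, rmse_total_as_per_atom)
```

So even for an equally good model, the total-energy RMSE is inflated by the structure sizes. What I cannot tell is whether that explains all of the difference I see, or whether the fit itself is worse with extensive targets.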
Could anybody clarify?
Thanks a lot,