Condition Number is large but cutoff is minimal

Hi,
I am trying to build a cluster expansion (CE) for a kinetic Monte Carlo (KMC) simulation in which lithium moves via transition states. Fitting a CE without transition states works, but as soon as I include the structures with transition states I get the ‘Condition number is large’ warning, no matter which pair cutoff I use. I tried some minimal cutoffs, the best cutoff for the CE without transition states, and an unreasonably high one. I wanted to find the best cutoff for this model.
I have 62 configurations with one atom at a transition state and 177 with no atom at a transition state.
Could the warning be due to the fact that I never have two atoms on the transition-state sublattice? There is at most one atom at a transition state, which is intended, since in a KMC only one atom moves in each event.
It would also be great to be able to define the sublattice so that only clusters with at most one site on that sublattice are built in the first place.

I'm not very familiar with CEs for KMC. Could you explain in more detail how you build the CE with and without the transition states?
Do you, for example, add a “dummy” atom at the transition-state position?

If this is the case, then yes, it could be due to not having any such pairs in the training data.
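To illustrate why missing pairs can matter, here is a toy sketch in plain NumPy, not the icet API. It assumes the transition-state sites form a periodic ring, that each site carries a variable of +1 (empty) or -1 (Li), and that the only clusters are the zerolet, the point cluster and the nearest-neighbour pair. All function names are hypothetical. Under these assumptions, with at most one Li on the ring the pair column equals 2 * point - 1 exactly, so the fit matrix is rank deficient until a structure with two neighbouring Li is added.

```python
import numpy as np


def toy_cluster_vector(occupation):
    """Zerolet, point and nearest-neighbour pair averages on a periodic ring.

    occupation holds 0 (empty) or 1 (Li) per transition-state site.
    The +1/-1 site variables are an assumption made for this toy model.
    """
    sigma = 1.0 - 2.0 * np.asarray(occupation, dtype=float)
    point = sigma.mean()
    pair = (sigma * np.roll(sigma, 1)).mean()
    return [1.0, point, pair]


def occupy(n_sites, li_sites):
    """Return an occupation vector with Li on the given site indices."""
    occupation = np.zeros(n_sites, dtype=int)
    for site in li_sites:
        occupation[site] = 1
    return occupation


# Training set with zero or one Li on the ring, for several ring sizes.
rows = []
for n_sites in (4, 6, 8, 12):
    rows.append(toy_cluster_vector(occupy(n_sites, [])))
    rows.append(toy_cluster_vector(occupy(n_sites, [0])))
A_single = np.array(rows)

# Three columns but only two independent ones: pair = 2 * point - zerolet.
rank_single = np.linalg.matrix_rank(A_single)
dependency_error = np.abs(
    A_single[:, 2] - (2 * A_single[:, 1] - A_single[:, 0])
).max()

# One structure with two neighbouring Li breaks the dependency.
A_with_pair = np.vstack([A_single, toy_cluster_vector(occupy(8, [0, 1]))])
rank_with_pair = np.linalg.matrix_rank(A_with_pair)

print("rank with at most one Li:", rank_single)
print("rank after adding a Li-Li pair:", rank_with_pair)
print("condition number with at most one Li:", np.linalg.cond(A_single))
```

Whether the same dependency appears in a real cluster space depends on the basis functions and on which orbits are included, so treat this only as a hint at what to look for in the fit matrix.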

In general, a large condition number means you cannot trust the fit, because the fitting matrix is ill-conditioned (unstable, poorly behaved).

But in some cases this isn't necessarily a problem.
For example, if your training structures all have the same concentration, the condition number will be large because two columns in the fitting matrix are perfectly correlated.
The resulting CE will still work fine for predicting energies of structures with exactly this concentration, but it will predict nonsense for structures with a different concentration.
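A small NumPy sketch of this effect, again on a toy periodic ring rather than with icet. The +1/-1 site variables, the choice of clusters (zerolet, point, first and second neighbour pairs) and the "true" interaction values are all made up for illustration. Every training structure has 3 Li on 12 sites, so the point column is a constant multiple of the zerolet column.

```python
import itertools

import numpy as np


def toy_cluster_vector(n_sites, li_sites):
    """Zerolet, point, and first/second neighbour pair averages on a ring."""
    sigma = np.ones(n_sites)
    for site in li_sites:
        sigma[site] = -1.0
    return [
        1.0,
        sigma.mean(),
        (sigma * np.roll(sigma, 1)).mean(),
        (sigma * np.roll(sigma, 2)).mean(),
    ]


# Hypothetical "true" interactions used to generate exact energies.
true_eci = np.array([-1.0, 0.3, 0.05, -0.02])

# All arrangements of 3 Li on 12 sites, split into training and test halves.
same_conc = [toy_cluster_vector(12, c)
             for c in itertools.combinations(range(12), 3)]
A_train = np.array(same_conc[::2])
A_test_same = np.array(same_conc[1::2])

# Some structures at a different concentration (6 Li on 12 sites).
A_test_other = np.array([
    toy_cluster_vector(12, c)
    for c in itertools.islice(itertools.combinations(range(12), 6), 50)
])

# Least-squares fit; lstsq returns the minimum-norm solution when the
# matrix is rank deficient.
fitted_eci, *_ = np.linalg.lstsq(A_train, A_train @ true_eci, rcond=None)

cond_train = np.linalg.cond(A_train)
error_same = np.abs(A_test_same @ (fitted_eci - true_eci)).max()
error_other = np.abs(A_test_other @ (fitted_eci - true_eci)).max()

print("condition number:", cond_train)
print("max error, same concentration:", error_same)
print("max error, other concentration:", error_other)
```

The fit cannot separate the zerolet from the point interaction, only their combination at the training concentration, which is why the error only shows up once the concentration changes.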

So possibly your CE is fine to use; it depends on why you get the large condition number.
Small validation errors can be an indication that it is fine to use.
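For reference, this is roughly what a validation error measures, sketched as a k-fold cross-validation with ordinary least squares in plain NumPy. It is not the icet or trainstation API; the function name `kfold_rmse` and the synthetic data are hypothetical and stand in for your fit matrix and target energies.

```python
import numpy as np


def kfold_rmse(A, y, n_splits=5, seed=0):
    """Validation RMSE from k-fold cross-validation with least squares."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))
    squared_errors = []
    for test_idx in np.array_split(indices, n_splits):
        # Fit on everything except the held-out fold.
        train_idx = np.setdiff1d(indices, test_idx)
        coef, *_ = np.linalg.lstsq(A[train_idx], y[train_idx], rcond=None)
        # Score only on the held-out fold.
        squared_errors.append((A[test_idx] @ coef - y[test_idx]) ** 2)
    return float(np.sqrt(np.concatenate(squared_errors).mean()))


# Synthetic stand-in for a fit matrix and target energies.
rng = np.random.default_rng(1)
A = rng.normal(size=(60, 4))
y = A @ np.array([1.0, -0.5, 0.2, 0.0]) + rng.normal(scale=0.01, size=60)

rmse_validation = kfold_rmse(A, y)
print("validation RMSE:", rmse_validation)
```

Keep in mind that the held-out structures only probe the kinds of configurations present in the data set, so a small validation error says little about configurations the training data never contains.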

To get a cluster expansion without transition states, I simply don't read in the structures that have an atom at a transition state, and I use a primitive structure without transition-state sites.

I am trying to cover the full range of Li concentrations, so at least that is not the cause.
Also, when scanning over possible cutoffs, the model with transition states shows no clear trend of decreasing validation RMSE.

My plan at the moment is to find the best cutoffs for the model without transition states, then use these and scan again with transition states, since I don't get the warning if I give three cutoffs, for reasons I don't understand.

Not sure if this is helpful, but it might be worth keeping in mind.
The condition number as implemented in NumPy will not come out large for underdetermined systems.
In that case, I don't even think trainstation (the fitting library that icet uses) will calculate the condition number for you.
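A quick way to see the NumPy behaviour, using a random matrix as a stand-in for a fit matrix with far more parameters than structures. `np.linalg.cond` takes the ratio of the largest to the smallest singular value, and a matrix with 5 rows and 50 columns has only 5 singular values, so the unconstrained directions never enter the ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# 5 "structures", 50 "parameters": a heavily underdetermined system.
A = rng.normal(size=(5, 50))

cond = np.linalg.cond(A)
rank = np.linalg.matrix_rank(A)
n_structures, n_parameters = A.shape

print("condition number:", cond)
print("rank:", rank, "of", n_parameters, "parameters")
print("underdetermined:", rank < n_parameters)
```

So for such systems it can be more informative to compare the rank of the fit matrix with the number of parameters than to look at the condition number alone.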

Well, I am not sure, but a value is given (the program prints a number after the warning), so I believe it does.