ConvergenceWarning: Objective did not converge

Hi all,
When training a model in ICET using the CrossValidationEstimator, I got this warning:
ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.090e+02, tolerance: 2.056e+02

However, I don’t know how to increase the number of iterations in the CrossValidationEstimator. I checked the trainstation source code, and there is no parameter in CrossValidationEstimator that lets me do this. I would also like to know how to increase the tol value with this method.

Even though I got the warning, the R2 and RMSE values I obtained seem to be okay. Is there something wrong with this output as well?
============== CrossValidationEstimator ==============
seed : 42
fit_method : lasso
standardize : True
n_target_values : 4530
n_parameters : 5341
n_nonzero_parameters : 5339
parameters_norm : 8.734451
target_values_std : 0.3871816
rmse_train : 0.02852375
R2_train : 0.9945684
AIC : -21544.99
BIC : 12723.26
alpha_optimal : 1e-08
validation_method : k-fold
n_splits : 10
rmse_train_final : 0.02853506
rmse_validation : 0.03298181
R2_validation : 0.9926455
shuffle : True

Can someone tell me how to fix this warning?

Thank you so much!

For the documentation of specific fitting algorithms it is probably best to refer to the sklearn documentation; see here for lasso.
As you can see, there are tol and max_iter arguments, and you can use them with the CrossValidationEstimator by passing them as kwargs.
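
For example, a minimal sketch (assuming the fit data A, y have already been assembled, e.g. via StructureContainer.get_fit_data(); the noted defaults are those of sklearn's Lasso):

from trainstation import CrossValidationEstimator

# A: sensitivity matrix, y: target values (assembled beforehand,
# e.g. A, y = sc.get_fit_data(key='energy'))
cve = CrossValidationEstimator(
    fit_data=(A, y),
    fit_method='lasso',
    validation_method='k-fold',
    n_splits=10,
    # extra keyword arguments are forwarded to the underlying sklearn solver
    max_iter=50000,  # sklearn's Lasso default is 1000
    tol=1e-5,        # sklearn's Lasso default is 1e-4
)
cve.validate()
cve.train()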

The warning doesn’t necessarily mean the solution is bad or wrong, it just didn’t fully converge.

Thank you for your reply!
Now I figure that the warning is due to the large cutoff values I set. According to the ICET website, if we get the warning “Condition number is large”, it means that the linear problem we are solving is ill-conditioned and thus we cannot trust the resulting cluster expansion for these cutoffs.

I got this warning several times. Does this mean that I need to keep decreasing the cutoff values until the warning disappears? Also, when I test different cutoff selections, the warning messages are printed out together, so it is impossible for me to figure out which cutoff values caused the warning. How can I distinguish them?

Thank you so much!

Yes, I don’t think you should trust a CE if you get “Condition number is large” (unless there is some reason, such as concentration restrictions, why you get the warning).

In the above message you have ~5400 parameters, which sounds like a very large number for a cluster expansion. I would double-check that you are using the ideal primitive cell as input to the ClusterSpace, and if that does not help then yes, reducing the cutoffs is a good idea.

the warning messages are printed out together, so it is impossible for me to figure out which cutoff values caused the warning. How can I distinguish them?

If you just loop over the cutoffs I don’t think you should have this problem:

for cutoffs in cutoffs_vals:
    print('cutoffs:', cutoffs)
    cs = ClusterSpace(...)  # build the cluster space with this cutoff set
    ...
    cve = CrossValidationEstimator(...)
    cve.validate()
    print(' --------------------------------')

Note that the CrossValidationEstimator may give rise to multiple warnings depending on how many of the CV-splits are ill-conditioned.
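
If the printed output still gets interleaved, one option is to record the warnings inside the loop so they can be attributed to a specific cutoff set. A rough sketch (cutoffs_vals, prim, chemical_symbols, and build_fit_data are placeholders for your own script):

import warnings
from icet import ClusterSpace
from trainstation import CrossValidationEstimator

for cutoffs in cutoffs_vals:  # cutoffs_vals: your list of cutoff sets
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        cs = ClusterSpace(prim, cutoffs, chemical_symbols)  # prim: ideal primitive cell
        A, y = build_fit_data(cs)  # hypothetical helper that rebuilds the fit matrix
        cve = CrossValidationEstimator(fit_data=(A, y), fit_method='lasso')
        cve.validate()
    # attribute any captured warnings to this cutoff set
    print('cutoffs:', cutoffs, '-> warnings:', [str(w.message) for w in caught])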

Thank you for the reply!

I’ve followed your advice and reduced the cutoffs to [11.5, 5.5, 3.5] (I feel like the fourth-order cutoff is a little bit small), and n_parameters is now 93. Is this a normal number?

Also, you mentioned that using the ideal primitive cell as input will lead to a large value of n_parameters. Actually, I use the cells that were mapped from the relaxed structures using map_structure_to_reference, because when I try to add the relaxed structures directly to the structure container, it ends with errors. I wonder if it’s okay to use the structures that were mapped back from the relaxed structures as input?

Thank you so much.

The cutoffs for a CE should be selected carefully in order to achieve a good model; this can be done by e.g. trying many different cutoffs and checking the CV score.
On the order of 100 parameters is normal for CEs, but whether that is too many or too few depends on how many training structures you have.

Also, you mentioned that using the ideal primitive cell as input will lead to a large value of n_parameters.

No, using the ideal primitive as input to the ClusterSpace (not the StructureContainer) leads to the correct number of parameters; if the CS is initialized with a relaxed structure it may give you too many parameters (because the relaxed structure does not obey all the symmetries).
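
For reference, a minimal sketch of setting up the ClusterSpace from the ideal primitive cell (the file name and chemical symbols are placeholders):

from ase.io import read
from icet import ClusterSpace

prim = read('prim_ideal.xyz')  # ideal (unrelaxed) primitive cell; placeholder file name
cs = ClusterSpace(structure=prim,
                  cutoffs=[11.5, 5.5, 3.5],       # pair, triplet, quadruplet cutoffs
                  chemical_symbols=['Pd', 'Au'])  # placeholder species
print(cs)  # the printed summary includes the number of parameters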

I wonder if it’s okay to use the structures that were mapped back from the relaxed structures as input?

Yes, you are supposed to give the StructureContainer the ideal structures.
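
For example, a rough sketch of that workflow (relaxed_structures, energies, and the property key are placeholders; check the return values of map_structure_to_reference against the documentation of your icet version):

from icet import StructureContainer
from icet.tools import map_structure_to_reference

sc = StructureContainer(cluster_space=cs)
for relaxed, energy in zip(relaxed_structures, energies):  # placeholders
    # map the relaxed structure back onto the ideal lattice
    mapped, info = map_structure_to_reference(relaxed, prim)
    sc.add_structure(mapped, properties={'energy': energy})
A, y = sc.get_fit_data(key='energy')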