Access sklearn model in automatminer

I went through the docs, tutorials, this forum, etc. and did not find an answer to this question. Apologies in advance if I missed something really obvious.

When using MatPipe in automatminer, is it possible to access the sklearn model object that is ultimately selected (best scoring model)? For example, if a random forest is selected, can I export the model as a joblib (sklearn default) file?

As a follow-up: I see I can examine the best result found, but is it possible to see/export a score report for all models and hyperparameters that were tried during the TPOT AutoML process?

Thanks in advance!


Hey there,

Yes, it is possible. You can access it via pipe.learner.best_pipeline, which returns the sklearn Pipeline object containing the model and all the numerical transformations it needs.
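For the joblib export asked about above, here is a minimal sketch. It assumes pipe is a MatPipe that has already been fit, and it only relies on the pipe.learner.best_pipeline attribute mentioned in this reply. The helper names export_best_pipeline and load_best_pipeline are made up for illustration and are not part of automatminer. The pickle fallback is only there for environments without joblib.

```python
import pickle

try:
    import joblib  # normally installed alongside scikit-learn
except ImportError:  # fall back to the standard library if it is missing
    joblib = None


def export_best_pipeline(pipe, path):
    """Save the best sklearn Pipeline of a fitted MatPipe to `path`.

    Hypothetical helper: `pipe` is assumed to expose
    `pipe.learner.best_pipeline`, as described in this thread.
    """
    best = pipe.learner.best_pipeline
    if joblib is not None:
        joblib.dump(best, path)
    else:
        with open(path, "wb") as fh:
            pickle.dump(best, fh)
    return path


def load_best_pipeline(path):
    """Load a pipeline previously written by export_best_pipeline."""
    if joblib is not None:
        return joblib.load(path)
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Usage would look like export_best_pipeline(pipe, "best_pipeline.joblib"). Keep in mind that the exported object is only the sklearn part, so it presumably expects input that has already been featurized and cleaned the same way the MatPipe did it.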

It is also possible to see and export the “score report” using pipe.learner.best_models, which returns an OrderedDict of the pipelines which were tried and their corresponding internal CV scores.

One thing to note, though: you will not be able to access best_models from a saved/loaded MatPipe, only from one which has not been deserialized from a file. This is because of pickling problems caused by the combination of joblib, multiprocessing, and TPOT. If you train the pipe and generate the score report without serializing the pipeline, though, you should be fine!
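To export the score report, something like the sketch below should work. It assumes best_models is the OrderedDict described above, mapping each tried pipeline to its internal CV score, and that it is read from a MatPipe still in memory (not one loaded from a file). The function name write_score_report and the CSV column names are my own choices, not automatminer API.

```python
import csv


def write_score_report(best_models, path):
    """Write a {pipeline: cv_score} mapping to a CSV file.

    Hypothetical helper: `best_models` is assumed to be the OrderedDict
    returned by pipe.learner.best_models. Rows are written in the
    mapping's own iteration order.
    """
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["pipeline", "cv_score"])
        for pipeline, score in best_models.items():
            # str() keeps this working whether keys are strings or objects
            writer.writerow([str(pipeline), score])
    return path
```

You would call it as write_score_report(pipe.learner.best_models, "score_report.csv") right after fitting.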

Thanks for pointing out this is not in the docs… I will add it :)

Thanks,
Alex


By the way, you can see a full attribute tree (displaying all the info/names of everything the MatPipe contains) using pipe.inspect(); read more in the docs here under “Inspect your Pipeline”:

https://hackingmaterials.lbl.gov/automatminer/basic.html

Thanks for the quick reply Alex!
