Dear developers,
I’m trying to save the best automatminer pipeline after optimization for a particular dataset using MatPipe.save(), because the optimization takes quite some time. The pipeline is dumped without problems. However, when I load it with MatPipe.load() and use it to predict on unseen data, it raises the error ‘Pipeline’ object has no attribute ‘fitted_pipeline_’. I understand this has something to do with the backend being removed and replaced by the best pipeline on saving, which predict() then cannot handle, but I was unable to solve the problem myself. The commands and the screen output are pasted below.
pipe = MatPipe.load('pipe.pickle')
2019-10-04 17:44:28 INFO Loaded MatPipe from file pipe.pickle.
2019-10-04 17:44:28 WARNING Only use this model to make predictions (do not retrain!). Backend was serialzed as only the top model, not the full automl backend.
pipe.predict(df)
2019-10-04 17:44:38 INFO Beginning MatPipe prediction using fitted pipeline.
2019-10-04 17:44:38 INFO AutoFeaturizer: Starting transforming.
2019-10-04 17:44:38 INFO AutoFeaturizer: composition column already exists, overwriting with composition from structure.
2019-10-04 17:44:38 INFO AutoFeaturizer: Guessing oxidation states of structures if they were not present in input.
StructureToOxidStructure: 100%|███████████████████| 2/2 [00:00<00:00, 11.49it/s]
StructureToComposition: 100%|█████████████████████| 2/2 [00:00<00:00, 12.21it/s]
2019-10-04 17:44:39 INFO AutoFeaturizer: Guessing oxidation states of compositions, as they were not present in input.
CompositionToOxidComposition: 100%|███████████████| 2/2 [00:00<00:00, 12.08it/s]
2019-10-04 17:44:39 INFO AutoFeaturizer: Featurizing with ElementProperty.
ElementProperty: 100%|████████████████████████████| 2/2 [00:00<00:00, 11.96it/s]
2019-10-04 17:44:39 INFO AutoFeaturizer: Guessing oxidation states of structures if they were not present in input.
StructureToOxidStructure: 100%|███████████████████| 2/2 [00:00<00:00, 13.03it/s]
2019-10-04 17:44:40 INFO AutoFeaturizer: Featurizing with SineCoulombMatrix.
SineCoulombMatrix: 100%|██████████████████████████| 2/2 [00:00<00:00, 12.05it/s]
2019-10-04 17:44:40 INFO AutoFeaturizer: Featurizer type bandstructure not in the dataframe. Skipping…
2019-10-04 17:44:40 INFO AutoFeaturizer: Featurizer type dos not in the dataframe. Skipping…
2019-10-04 17:44:40 INFO AutoFeaturizer: Finished transforming.
2019-10-04 17:44:40 INFO DataCleaner: Starting transforming.
2019-10-04 17:44:40 INFO DataCleaner: Cleaning with respect to samples with sample na_method ‘fill’
2019-10-04 17:44:40 INFO DataCleaner: Replacing infinite values with nan for easier screening.
2019-10-04 17:44:40 INFO DataCleaner: One-hot encoding used for columns [‘material’, ‘dir’, ‘XY’, ‘E’]
2019-10-04 17:44:40 INFO DataCleaner: Before handling na: 2 samples, 162 features
2019-10-04 17:44:40 INFO DataCleaner: 0 samples did not have target values. They were dropped.
2019-10-04 17:44:40 WARNING DataCleaner: Mismatched columns found in dataframe used for fitting and argument dataframe.
2019-10-04 17:44:40 WARNING DataCleaner: Coercing mismatched columns…
2019-10-04 17:44:40 INFO DataCleaner: After handling na: 2 samples, 143 features
2019-10-04 17:44:40 INFO DataCleaner: Reordering columns…
2019-10-04 17:44:40 INFO DataCleaner: Finished transforming.
2019-10-04 17:44:40 INFO FeatureReducer: Starting transforming.
2019-10-04 17:44:40 INFO FeatureReducer: Finished transforming.
2019-10-04 17:44:40 INFO TPOTAdaptor: Starting predicting.
Traceback (most recent call last):
File “”, line 1, in
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/utils/pkg.py”, line 65, in wrapper
return func(*args, **kwargs)
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/pipeline.py”, line 170, in predict
predictions = self.learner.predict(df, self.target)
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/utils/pkg.py”, line 65, in wrapper
return func(*args, **kwargs)
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/utils/log.py”, line 94, in wrapper
result = meth(*args, **kwargs)
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/automl/base.py”, line 115, in predict
y_pred = self.best_pipeline.predict(X)
File “/home/mag1/atomate/atomate_env/lib/python3.6/site-packages/automatminer/automl/adaptors.py”, line 197, in best_pipeline
return self.backend.fitted_pipeline_
AttributeError: 'Pipeline' object has no attribute 'fitted_pipeline_'
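If I read the traceback correctly, the failure can be reproduced without automatminer at all. The following is only a minimal sketch of what I think is happening, not the library's actual code: FittedModel, AutoMLBackend and Adaptor are hypothetical stand-ins I made up, and best_pipeline_safe is just one possible fallback I am assuming would help.

```python
class FittedModel:
    """Hypothetical stand-in for the bare fitted Pipeline kept after saving."""

    def predict(self, X):
        return [2 * x for x in X]


class AutoMLBackend:
    """Hypothetical stand-in for the full automl backend."""

    def __init__(self):
        # The full backend exposes the best model as fitted_pipeline_.
        self.fitted_pipeline_ = FittedModel()


class Adaptor:
    """Hypothetical stand-in for the adaptor seen in the traceback."""

    def __init__(self, backend):
        self.backend = backend

    @property
    def best_pipeline(self):
        # Mirrors the failing line: assumes the backend is the full automl object.
        return self.backend.fitted_pipeline_

    @property
    def best_pipeline_safe(self):
        # Assumed fallback: if the backend is already the bare model, use it as-is.
        return getattr(self.backend, "fitted_pipeline_", self.backend)


full = Adaptor(AutoMLBackend())

# Simulate saving and loading: only the top model survives as the backend.
loaded = Adaptor(full.best_pipeline)

try:
    loaded.best_pipeline.predict([1, 2])
    error_message = None
except AttributeError as exc:
    # Same kind of error as in the traceback above.
    error_message = str(exc)

# The fallback property still reaches the fitted model.
predictions = loaded.best_pipeline_safe.predict([1, 2])
```

In this sketch the loaded object fails on best_pipeline because its backend is already the fitted model, while the getattr fallback still predicts. Whether something similar is appropriate inside automatminer itself is for the developers to judge.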
Please have a look and let me know how I can save the best pipeline to a file and use it later to make predictions on unseen data.
Regards,
Arnab