Bag of Bonds raises attribute error: specie

I have a dataframe with a single column containing the structures of some materials, indexed by their cif_id. I was testing out a few featurizers, such as BagofBonds, and I get a ‘specie’ attribute error.

My code is simply:

from matminer.featurizers.structure import BagofBonds
featurizer = BagofBonds()
ftr = featurizer.fit_featurize_dataframe(df_structure.iloc[0:5, :], 'structure')

This raises the following error:

AttributeError: specie


For the record this is the dataframe.

Hi there, can you post the full stacktrace?

I am guessing this is from the sites containing elements, not species. I think there is some sort of method in pymatgen you can apply for each structure, like decorate_oxi_states or something similar which will allow you to run the BoB featurizer!

This is what I got. I couldn’t upload a pickled dataframe since I’m a new user, so I’ve put it on Google Drive in case you need it.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [58], in <cell line: 32>()
     29 #from matminer.featurizers.structure import BagofBonds
     31 featurizer = BagofBonds()
---> 32 ftr = featurizer.fit_featurize_dataframe(df_structure_copy.iloc[0:5, :], 'structure')

File ~\AppData\Roaming\Python\Python38\site-packages\matminer\featurizers\base.py:272, in BaseFeaturizer.fit_featurize_dataframe(self, df, col_id, fit_args, *args, **kwargs)
    270 if fit_args is None:
    271     fit_args = []
--> 272 return self.fit(df[col_id], *fit_args).featurize_dataframe(df, col_id, *args, **kwargs)

File ~\AppData\Roaming\Python\Python38\site-packages\matminer\featurizers\structure\bonding.py:535, in BagofBonds.fit(self, X, y)
    519 def fit(self, X, y=None):
    520     """
    521     Define the bags using a list of structures.
    522 
   (...)
    533         self
    534     """
--> 535     unpadded_bobs = [self.bag(s, return_baglens=True) for s in X]
    536     bonds = [list(bob.keys()) for bob in unpadded_bobs]
    537     bonds = np.unique(sum(bonds, []))

File ~\AppData\Roaming\Python\Python38\site-packages\matminer\featurizers\structure\bonding.py:535, in <listcomp>(.0)
    519 def fit(self, X, y=None):
    520     """
    521     Define the bags using a list of structures.
    522 
   (...)
    533         self
    534     """
--> 535     unpadded_bobs = [self.bag(s, return_baglens=True) for s in X]
    536     bonds = [list(bob.keys()) for bob in unpadded_bobs]
    537     bonds = np.unique(sum(bonds, []))

File ~\AppData\Roaming\Python\Python38\site-packages\matminer\featurizers\structure\bonding.py:574, in BagofBonds.bag(self, s, return_baglens)
    572 for i, si in enumerate(sites):
    573     for j, sj in enumerate(sites):
--> 574         el0, el1 = si.specie, sj.specie
    575         if isinstance(el0, Specie):
    576             el0 = el0.element

File ~\AppData\Roaming\Python\Python38\site-packages\pymatgen\core\sites.py:79, in Site.__getattr__(self, a)
     77 if a in p:
     78     return p[a]
---> 79 raise AttributeError(a)

AttributeError: specie

I think you’re correct. When I used .species on different members of the dataframe, I got an attribute error for the first two but not the third. Instead I get a list:

[Element Fe,
 Element Fe,
 Element Fe,
...
]

I also tested the OrbitalFieldMatrix featurizer. It raised the same attribute error for the first two structures but returned a result for the next ones.

Not entirely sure how to work around it though.
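One way to narrow this down is to check which rows actually fail on the `site.specie` lookup shown in the traceback. The helper below is only a sketch: the name `rows_failing_specie` is made up for illustration, and it assumes each entry can be iterated site by site, as a pymatgen Structure can. As far as I know, pymatgen only defines `specie` for sites holding a single species, so sites with partial or mixed occupancy are one possible cause, but nothing in this thread confirms that for these structures.

```python
def rows_failing_specie(structures):
    """Return the positions of structures that have at least one site
    whose `.specie` lookup raises AttributeError.

    `structures` is any iterable of objects that yield sites when
    iterated (an assumption that holds for pymatgen Structure objects).
    """
    failing = []
    for pos, structure in enumerate(structures):
        for site in structure:
            try:
                site.specie  # the same attribute access BagofBonds.bag performs
            except AttributeError:
                failing.append(pos)
                break  # one bad site is enough to flag this structure
    return failing
```

Calling `rows_failing_specie(df_structure['structure'])` should return the positional indices (as used by `.iloc`) of the rows that trigger the error, which makes it easier to inspect just those structures.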

For an individual structure, you can use the Structure.add_oxidation_state_by_guess method. Read more in the pymatgen.core.structure module documentation (pymatgen 2022.7.25).
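A minimal sketch of the single-structure suggestion above. The wrapper name `with_guessed_oxidation_states` is hypothetical; the only pymatgen calls it relies on are `Structure.copy()` and `Structure.add_oxidation_state_by_guess()`. It works on a copy and ignores the method's return value, so the structure stored in the dataframe is left untouched. Whether this clears the error for these particular structures has not been verified here.

```python
def with_guessed_oxidation_states(structure):
    """Return a copy of `structure` decorated with guessed oxidation states.

    `structure` is expected to be a pymatgen Structure. The original
    object is not modified because the guess is applied to a copy.
    """
    decorated = structure.copy()
    # Decorates the copy in place; the return value is deliberately ignored.
    decorated.add_oxidation_state_by_guess()
    return decorated
```

For example, `s = with_guessed_oxidation_states(df_structure['structure'].iloc[0])` gives a decorated copy of the first structure, and `df_structure['structure'].apply(with_guessed_oxidation_states)` would do the same for a whole column.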

For many structures, use the StructureToOxidStructure featurizer class from matminer.featurizers.conversions before doing BoB.
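Roughly, the dataframe version could look like the sketch below. The function name `featurize_bob_with_oxidation_states` is made up for illustration, and `structure_oxid` is assumed as the name of the new column holding the decorated structures. The matminer imports sit inside the function only so that the sketch can be defined without matminer installed. This has not been run against the dataframe from this thread.

```python
def featurize_bob_with_oxidation_states(df, structure_col="structure",
                                        oxid_col="structure_oxid"):
    """Add oxidation states to every structure, then run BagofBonds.

    Returns a dataframe with the decorated structures in `oxid_col`
    plus the Bag of Bonds feature columns.
    """
    # Lazy imports: matminer is only needed when the function is called.
    from matminer.featurizers.conversions import StructureToOxidStructure
    from matminer.featurizers.structure import BagofBonds

    # Step 1: write oxidation-state-decorated structures to a new column.
    converter = StructureToOxidStructure(target_col_id=oxid_col)
    df = converter.featurize_dataframe(df, structure_col)

    # Step 2: fit and featurize on the decorated structures.
    bob = BagofBonds()
    return bob.fit_featurize_dataframe(df, oxid_col)
```

With the dataframe from the question, the call would be `ftr = featurize_bob_with_oxidation_states(df_structure.iloc[0:5, :].copy())`; the `.copy()` is there just to avoid modifying a slice of the original dataframe.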

Note that you can also use the PymatgenFunctionApplicator from matminer.featurizers.conversions, although this will require a little programming on your part. It does, however, afford more flexibility in how the oxidation states are applied. See the source code here: matminer/conversions.py at 7f8520b97175db3c4fc6afe055cee664ebd77238 (hackingmaterials/matminer on GitHub)
