How to get the “structure” feature?

Hey Matminer! I have some CIF files from a recent experiment. How can I obtain the “structure” column shown in the image below? Thank you very much for your reply.

I note that CompositionToStructureFromMP() only works if my compositions are already in the Materials Project (MP) database.


Hi @obaica

You can convert each of the CIFs (one by one) to pymatgen Structure objects using the from_file method of pymatgen's Structure class (see: pymatgen.core.structure module, pymatgen 2022.8.23 documentation).

To convert a single one, you’d do:

from pymatgen.core import Structure
structure_object = Structure.from_file("/path/to/my_file.cif")
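
For converting several files one by one, a plain loop is enough. Below is a minimal sketch; the helper name load_structures and its loader argument are hypothetical (not part of pymatgen or matminer). The idea is to pass Structure.from_file as the loader and to record any file that fails to parse instead of stopping at the first error.

```python
def load_structures(cif_filenames, loader):
    """Apply `loader` to each filename, collecting results and failures.

    `loader` is any callable that takes a filename and returns an object,
    e.g. pymatgen's Structure.from_file (assumed here, not imported).
    """
    structures = {}  # filename -> loaded object
    failures = {}    # filename -> exception raised while loading
    for filename in cif_filenames:
        try:
            structures[filename] = loader(filename)
        except Exception as exc:  # keep going if one file cannot be parsed
            failures[filename] = exc
    return structures, failures
```

With pymatgen installed, calling `load_structures(cif_filenames, Structure.from_file)` would return the successfully parsed structures keyed by filename, plus a dictionary of the files that raised an error.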

If you have many of these CIFs and would like to put them in a dataframe as in your picture, you can use the PymatgenFunctionApplicator class from matminer.featurizers.conversions, as shown below (for the source code, see: matminer/ at 7f8520b97175db3c4fc6afe055cee664ebd77238 · hackingmaterials/matminer · GitHub). The only requirement is that the CIF filenames are in either a Python list or a dataframe column.

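from pymatgen.core import Structure
from matminer.featurizers.conversions import PymatgenFunctionApplicator

fileconverter = PymatgenFunctionApplicator(func=Structure.from_file, target_col_id="structure")

# if you want them as a list
# assuming your filenames are in a list called cif_filenames
# (featurize_many returns one single-element list per filename, so unwrap each)
structures = [row[0] for row in fileconverter.featurize_many(cif_filenames)]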

# if you want them as a dataframe
# assuming your cif filenames are in a df called "df" under a column name "cif_filenames"
df_with_structures = fileconverter.featurize_dataframe(df, "cif_filenames")
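
Both variants above assume the CIF filenames are already collected. If they all sit in one folder, a minimal sketch for gathering them could look like the following; the function name collect_cif_filenames and the folder layout are assumptions for illustration only.

```python
from pathlib import Path


def collect_cif_filenames(directory):
    """Return the paths of all .cif files directly inside `directory`, sorted."""
    return sorted(str(path) for path in Path(directory).glob("*.cif"))
```

The resulting list can be passed to featurize_many directly, or wrapped in a dataframe with `pd.DataFrame({"cif_filenames": collect_cif_filenames("/path/to/cifs")})` before calling featurize_dataframe.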

Thank you very much for your patient and detailed reply.

This answer is very helpful to me.
