pymatgen structure_prediction?

Hi Y’all,

Is it possible to use pymatgen’s structure_prediction module within automatminer to generate structural variables (given just the formula) to attach to the data frame? Or is there a better approach? Or is it a bad idea? Just wondered… Thank you!

Sincerely,

tom

Hi Thomas,

I think this feature would be a nice addition to matminer, instead of or in addition to automatminer.

I haven’t used the structure_prediction module (specifically, I’m looking at Substitutor), so I can’t say much about how easy it would be to automate. But the methods underlying it are very sound, and it would fill a real need: providing additional features for compounds that are crystalline but whose structure is not provided in the dataset.

Do you want to take a crack at it? It would be good to add it to the conversions.py module.

Best,

Logan

From: thomas heiman
Sent: Saturday, January 5, 2019 4:36 PM
To: matminer
Subject: pymatgen structure_prediction?

You received this message because you are subscribed to the Google Groups “matminer” group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.

Hi all,

In my opinion, such a technique could be risky. There is really no guarantee that the structure produced by the structure predictor will correspond to the one in your data set. The structure prediction could be incorrect, or, even if the predictor correctly predicts the ground-state structure (its job), this might not be the same structure that is in your data set. If you end up adding features for the wrong structure, it’s not clear that this is going to help your machine learning.

Also note that structure predictions are unoptimized, so even in the case of a correct prediction that actually represents the material in your data set (best-case scenario), things like bond lengths, lattice parameter values, etc. will not be exactly right. It is typically expected that one will feed the results of a structure predictor into a DFT run to refine these parameters. If you do end up featurizing based on a structure prediction, I would suggest preferring features that rely more on the overall topology of the structure, or that are more volume-independent, rather than features that depend heavily on the specifics of bond lengths or atom distances.
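One way to read "volume-independent" is to rescale distances by the cube root of the volume per atom before featurizing, so that a uniform error in the predicted lattice parameter cancels out. The following is a minimal sketch under that assumption; `normalized_distances` is a hypothetical helper, not a matminer or pymatgen function, and the numbers are made up.

```python
def normalized_distances(distances, cell_volume, n_atoms):
    """Divide interatomic distances by the cube root of the volume per atom.

    The result does not change if the whole cell is uniformly expanded or
    contracted, which is one kind of error an unrelaxed prediction may carry.
    """
    length_scale = (cell_volume / n_atoms) ** (1.0 / 3.0)
    return [d / length_scale for d in distances]


# Illustrative numbers only: a cell, and the same cell with every length
# stretched by 5% (so its volume grows by a factor of 1.05 ** 3).
distances = [2.0, 2.8, 3.5]
stretched = [d * 1.05 for d in distances]

original_features = normalized_distances(distances, cell_volume=64.0, n_atoms=4)
stretched_features = normalized_distances(
    stretched, cell_volume=64.0 * 1.05 ** 3, n_atoms=4
)
```

The two feature lists agree to floating-point precision, whereas the raw distances differ by 5%. This only removes uniform scaling errors; distortions of individual bonds or of the cell shape would still show up.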

This could still be interesting to try, but I think it will require a lot of validation to see if it works, so I’d suggest some testing. I’m a little afraid that this might make things better on average while making the worst cases worse than they would have been if you didn’t do this.
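To check for that failure mode, one option is to report the worst-case error next to the average error for a model with and without the predicted-structure features. This is only a sketch: `error_summary` is a hypothetical helper, and the target values and predictions are invented to illustrate the pattern, not results from any real model.

```python
def error_summary(y_true, y_pred):
    """Return (mean absolute error, maximum absolute error)."""
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(abs_errors) / len(abs_errors), max(abs_errors)


# Made-up values: the "augmented" model is closer on most samples
# but badly wrong on the last one.
y_true = [1.0, 2.0, 3.0, 4.0]
pred_composition_only = [1.4, 2.4, 2.6, 4.4]
pred_with_predicted_structure = [1.1, 2.1, 3.1, 5.0]

baseline_mae, baseline_worst = error_summary(y_true, pred_composition_only)
augmented_mae, augmented_worst = error_summary(y_true, pred_with_predicted_structure)
```

In this toy case the augmented model has the lower mean error but the larger maximum error, which is the situation described above. Looking at both numbers, ideally on held-out data, makes it visible.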

Another idea, rather than picking a single structure from the structure predictor, is to take a weighted average of features from the top 10 structures it produces (weighted by probability). But this will add complexity and take more time to featurize.
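The weighted-average idea could look roughly like the sketch below. It assumes each candidate structure has already been featurized into a plain list of numbers and that the predictor supplies a probability per candidate; `weighted_feature_average` is a hypothetical helper, and the values are illustrative.

```python
def weighted_feature_average(feature_vectors, probabilities):
    """Average feature vectors, weighting each by its renormalized probability."""
    if not feature_vectors or len(feature_vectors) != len(probabilities):
        raise ValueError("need exactly one probability per feature vector")
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")

    n_features = len(feature_vectors[0])
    averaged = [0.0] * n_features
    for vector, prob in zip(feature_vectors, probabilities):
        if len(vector) != n_features:
            raise ValueError("all feature vectors must have the same length")
        weight = prob / total  # renormalize over the candidates we kept
        for i, value in enumerate(vector):
            averaged[i] += weight * value
    return averaged


# Three candidate structures, two features each (made-up numbers).
candidate_features = [[1.0, 10.0], [3.0, 20.0], [5.0, 40.0]]
candidate_probs = [0.5, 0.25, 0.25]
averaged = weighted_feature_average(candidate_features, candidate_probs)
```

Renormalizing over the retained candidates is a design choice: the top 10 probabilities will generally not sum to one, and without renormalization the averaged features would shrink toward zero.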

As for where to implement, I agree that conversions.py is a good place.

Best,

Anubhav

(In reply to Logan Ward’s message of Sunday, January 6, 2019 at 7:39:18 AM UTC-8.)

Hi Logan,

Sure. It may take a while (job, teaching on the side, family and life :))… My primary programming language is R; however, I have been using a lot of Python lately. Silly question for you: Substitutor requires oxidation states… How can I get those automatically from a formula, i.e., are there existing methods for that? Thank you!

Sincerely,

tom

You have probably already found the answer, but there are two automatic methods for assigning oxidation states:

  1. add_charges_from_oxi_state_guesses() in Composition, which only works on integer compositions (convert first if yours is non-integer)

  2. BVAnalyzer, which requires a structure object
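To make the idea behind the first method concrete, here is a pure-Python sketch of the two steps involved: converting a fractional composition to integers, then enumerating charge-neutral oxidation-state assignments. It does not call pymatgen and is not pymatgen's implementation. `COMMON_OXI_STATES` is a small, hand-picked table for illustration, both function names are hypothetical, and the sketch assigns a single oxidation state per element with no ranking of the guesses.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd

# Hypothetical, hand-picked table for illustration only.
COMMON_OXI_STATES = {
    "Li": [1],
    "Fe": [2, 3],
    "Ti": [2, 3, 4],
    "O": [-2],
    "P": [5, 3, -3],
}


def to_integer_amounts(amounts):
    """Scale fractional amounts (e.g. Fe0.5 O0.75) to the smallest integers."""
    fracs = {el: Fraction(str(x)) for el, x in amounts.items()}
    # Least common multiple of all denominators.
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (f.denominator for f in fracs.values()), 1)
    scaled = {el: int(f * common) for el, f in fracs.items()}
    divisor = reduce(gcd, scaled.values())
    return {el: n // divisor for el, n in scaled.items()}


def charge_neutral_guesses(amounts):
    """List every one-state-per-element assignment with zero total charge."""
    elements = sorted(amounts)
    guesses = []
    for states in product(*(COMMON_OXI_STATES[el] for el in elements)):
        if sum(amounts[el] * s for el, s in zip(elements, states)) == 0:
            guesses.append(dict(zip(elements, states)))
    return guesses


integer_amounts = to_integer_amounts({"Fe": 0.5, "O": 0.75})
fe2o3_guesses = charge_neutral_guesses(integer_amounts)
lifepo4_guesses = charge_neutral_guesses({"Li": 1, "Fe": 1, "P": 1, "O": 4})
```

With this table, Fe0.5O0.75 reduces to Fe2O3 and the only neutral assignment is Fe 3+ with O 2-; for LiFePO4 it is Fe 2+ with P 5+. A mixed-valence compound such as Fe3O4 would return no guesses here, which is one reason to prefer the library routine in practice.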

(In reply to thomas heiman’s message of Monday, January 7, 2019 at 6:51:56 AM UTC-8.)

Thank you!!
