I’m trying to get a better understanding of charge-balance analysis for chemical compositions, using the functions offered by `pymatgen`.
In particular, I’m interested in this as I’m working with generative machine learning models for inorganic compositions, and I’m trying to assess how many chemical compositions sampled from my trained models are charge-balanced.
One thing that is not clear to me is that this kind of analysis (using, for example, `oxi_state_guesses()`) seems to require chemical compositions to have integer coefficients.
As a non-materials-scientist, my understanding is that non-stoichiometric compounds (with coefficients varying over a continuous range) are very common in inorganic chemistry, so charge-balance analysis with automated tools is no longer straightforward.
That’s why I’m wondering whether there is some way to convert the non-stoichiometric compounds in my dataset to re-normalized stoichiometric ones using `pymatgen`. Here is an example of what I would like to obtain:
Let’s say I have the compound `Hg0.7 Cd0.3 Te1` in my dataset. It is charge-balanced with oxidation numbers `(+2, +2, -2)`.
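To make the arithmetic behind that statement explicit, here is a minimal sketch using only the Python standard library (not `pymatgen`). The helper name `net_charge` is just something I made up for illustration, and the oxidation states are the ones I assumed above rather than ones guessed by a tool.

```python
from fractions import Fraction


def net_charge(amounts, oxidation_states):
    """Sum coefficient * oxidation state over all elements (hypothetical helper)."""
    # Fraction(str(x)) turns 0.7 into exactly 7/10, avoiding float round-off.
    return sum(
        Fraction(str(amounts[el])) * oxidation_states[el] for el in amounts
    )


amounts = {"Hg": 0.7, "Cd": 0.3, "Te": 1}
oxidation_states = {"Hg": 2, "Cd": 2, "Te": -2}  # assumed, not guessed by a tool

charge = net_charge(amounts, oxidation_states)
print(charge)  # 0.7*2 + 0.3*2 + 1*(-2) = 0
```

Since the net charge is linear in the coefficients, multiplying every coefficient by the same factor cannot change whether it is zero.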
From the basic chemistry rules I remember from high school, dividing by the smallest coefficient and then multiplying by a suitable small integer gives the normalized compound. In this case we would get:
`Hg0.7 Cd0.3 Te1` → `Hg7 Cd3 Te10`
and (obviously) the resulting composition is still charge-balanced with oxidation numbers `(+2, +2, -2)`.
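For reference, this is roughly the conversion I have in mind, sketched with the Python standard library only. The function name `to_integer_composition` and the `max_denominator` cutoff are my own assumptions, not part of `pymatgen`; the cutoff decides how finely a decimal coefficient is resolved before it is treated as a fraction.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_integer_composition(amounts, max_denominator=1000):
    """Scale fractional coefficients to the smallest equivalent integers.

    Hypothetical helper: `amounts` maps element symbols to coefficients.
    """
    # Represent each coefficient as an exact fraction, e.g. 0.7 -> 7/10.
    fracs = {
        el: Fraction(str(x)).limit_denominator(max_denominator)
        for el, x in amounts.items()
    }
    # Least common multiple of all denominators.
    common = reduce(
        lambda a, b: a * b // gcd(a, b),
        (f.denominator for f in fracs.values()),
        1,
    )
    ints = {el: int(f * common) for el, f in fracs.items()}
    # Divide by the greatest common divisor to get the smallest integer set.
    divisor = reduce(gcd, ints.values()) or 1
    return {el: n // divisor for el, n in ints.items()}


integer_comp = to_integer_composition({"Hg": 0.7, "Cd": 0.3, "Te": 1.0})
print(integer_comp)  # {'Hg': 7, 'Cd': 3, 'Te': 10}
```

The resulting integer amounts could then be written as a formula such as `Hg7Cd3Te10` and handed to `oxi_state_guesses()`, assuming the integer requirement is the only obstacle.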
So, is there a similar way to bring my dataset into such an equivalent representation, so that I can check charge balance more quickly?