Hello pymatgen community,
I’m using pymatgen’s SQSTransformation to generate a special quasirandom structure (SQS) for a binary Cu-Mn system, with the Cu FCC structure as the seed framework. I am using the mcsqs method underneath. After applying the transformation using apply_transformation, I used FrameworkComparator to check the similarity between the output structure and the original FCC seed structure. However, the output structure doesn’t match the initial FCC structure as I expected.
Is this deviation from the original FCC seed structure expected when using SQSTransformation with the mcsqs method, or is this a red flag indicating something may be wrong with my setup or the transformation process? I would appreciate any insights into whether this is typical behavior or if there’s something I should adjust in my workflow.
Thank you in advance for your help!
Best,
Nadav
Hi @Nadav_Moav, the structure you get on output should have the same underlying fcc structure, although the crystalline symmetry may be in a different representation. A way you can check this is:
mcsqs_struct = SQSTransformation().apply_transformation(disordered_structure)
mcsqs_struct_copy = mcsqs_struct.copy()
mcsqs_struct_copy.replace_species({'Cu': 'Mn'}) # replaces all Cu sites with Mn in-place
print(disordered_structure.get_space_group_info(), mcsqs_struct_copy.get_space_group_info())
You should get the same symmetries from both structures.
Hi Aaron,
Thanks again for the suggestion — it worked!
Just to understand this better: I thought FrameworkMatcher is supposed to ignore species when comparing structures, so I was expecting it to recognize the same framework even without manually replacing the species.
Also, in one test case I tried, the output SQS had exactly the same space group and lattice parameters as the seed structure, but FrameworkMatcher still didn’t recognize them as matching until I applied your trick of replacing all species with the same element.
Here’s the information from that test case:
Seed structure:
K1 Ti1 O3
1.0
3.6917200000000001 0.0000000000000000 0.0000000000000002
-0.0000000000000002 3.6917200000000001 0.0000000000000002
0.0000000000000000 0.0000000000000000 5.7668970000000002
K Ti O
1 1 3
direct
0.5000000000000000 0.5000000000000000 0.9962770000000000 K
0.0000000000000000 0.0000000000000000 0.4947610000000000 Ti
0.0000000000000000 0.5000000000000000 0.6058740000000000 O
0.5000000000000000 0.0000000000000000 0.6058740000000000 O
0.0000000000000000 0.0000000000000000 0.1946150000000000 O
Output SQS structure:
K4 Ti8 Bi4 O24
1.0
7.3834400000000002 0.0000000000000000 0.0000000000000000
0.0000000000000000 7.3834400000000002 0.0000000000000000
0.0000000000000000 0.0000000000000000 11.5337940000000003
K Bi K Bi K Bi Ti O
2 1 1 1 1 2 8 24
direct
0.2500000000000000 0.2500000000000000 0.4981380000000000 K
0.2500000000000000 0.7500000000000000 0.4981380000000000 K
0.2500000000000000 0.2500000000000000 0.9981380000000000 Bi
0.2500000000000000 0.7500000000000000 0.9981380000000000 K
0.7500000000000000 0.2500000000000000 0.4981380000000000 Bi
0.7500000000000000 0.7500000000000000 0.4981380000000000 K
0.7500000000000000 0.2500000000000000 0.9981380000000000 Bi
0.7500000000000000 0.7500000000000000 0.9981380000000000 Bi
1.0000000000000000 1.0000000000000000 0.2473800000000000 Ti
0.5000000000000000 1.0000000000000000 0.2473800000000000 Ti
1.0000000000000000 1.0000000000000000 0.7473800000000000 Ti
0.5000000000000000 1.0000000000000000 0.7473800000000000 Ti
1.0000000000000000 0.5000000000000000 0.2473800000000000 Ti
0.5000000000000000 0.5000000000000000 0.2473800000000000 Ti
1.0000000000000000 0.5000000000000000 0.7473800000000000 Ti
0.5000000000000000 0.5000000000000000 0.7473800000000000 Ti
0.2500000000000000 1.0000000000000000 0.3029370000000000 O
0.2500000000000000 1.0000000000000000 0.8029370000000000 O
0.2500000000000000 0.5000000000000000 0.3029370000000000 O
0.2500000000000000 0.5000000000000000 0.8029370000000000 O
0.7500000000000000 1.0000000000000000 0.3029370000000000 O
0.7500000000000000 1.0000000000000000 0.8029370000000000 O
0.7500000000000000 0.5000000000000000 0.3029370000000000 O
0.7500000000000000 0.5000000000000000 0.8029370000000000 O
1.0000000000000000 0.2500000000000000 0.3029370000000000 O
1.0000000000000000 0.2500000000000000 0.8029370000000000 O
1.0000000000000000 0.7500000000000000 0.3029370000000000 O
1.0000000000000000 0.7500000000000000 0.8029370000000000 O
0.5000000000000000 0.2500000000000000 0.3029370000000000 O
0.5000000000000000 0.2500000000000000 0.8029370000000000 O
0.5000000000000000 0.7500000000000000 0.3029370000000000 O
0.5000000000000000 0.7500000000000000 0.8029370000000000 O
1.0000000000000000 1.0000000000000000 0.0973070000000000 O
0.5000000000000000 1.0000000000000000 0.0973070000000000 O
1.0000000000000000 1.0000000000000000 0.5973070000000000 O
0.5000000000000000 1.0000000000000000 0.5973070000000000 O
1.0000000000000000 0.5000000000000000 0.0973070000000000 O
0.5000000000000000 0.5000000000000000 0.0973070000000000 O
1.0000000000000000 0.5000000000000000 0.5973070000000000 O
0.5000000000000000 0.5000000000000000 0.5973070000000000 O
Do you have an idea why that happens? Is there something else that FrameworkMatcher is sensitive to beyond species and basic lattice parameters?
Thanks again for your time and help!
Best,
Nadav
If you’re using the StructureMatcher class with the FrameworkComparator, you also have to set attempt_supercell = True for the check to work. Not sure if that’s what you meant by FrameworkMatcher?
If you replace species in the MCSQS structure first to match the original composition, the structures could match with attempt_supercell = False when primitive cell reduction is performed.
See this code for details; the last line prints True for a match between the unmodified MCSQS structure and the original structure:
from pymatgen.core import Structure
from pymatgen.analysis.structure_matcher import StructureMatcher, FrameworkComparator
inp_struct = Structure.from_str("""K1 Ti1 O3
1.0
3.6917200000000001 0.0000000000000000 0.0000000000000002
-0.0000000000000002 3.6917200000000001 0.0000000000000002
0.0000000000000000 0.0000000000000000 5.7668970000000002
K Ti O
1 1 3
direct
0.5000000000000000 0.5000000000000000 0.9962770000000000 K
0.0000000000000000 0.0000000000000000 0.4947610000000000 Ti
0.0000000000000000 0.5000000000000000 0.6058740000000000 O
0.5000000000000000 0.0000000000000000 0.6058740000000000 O
0.0000000000000000 0.0000000000000000 0.1946150000000000 O
""", fmt = "poscar")
mcsqs_struct = Structure.from_str("""K4 Ti8 Bi4 O24
1.0
7.3834400000000002 0.0000000000000000 0.0000000000000000
0.0000000000000000 7.3834400000000002 0.0000000000000000
0.0000000000000000 0.0000000000000000 11.5337940000000003
K Bi K Bi K Bi Ti O
2 1 1 1 1 2 8 24
direct
0.2500000000000000 0.2500000000000000 0.4981380000000000 K
0.2500000000000000 0.7500000000000000 0.4981380000000000 K
0.2500000000000000 0.2500000000000000 0.9981380000000000 Bi
0.2500000000000000 0.7500000000000000 0.9981380000000000 K
0.7500000000000000 0.2500000000000000 0.4981380000000000 Bi
0.7500000000000000 0.7500000000000000 0.4981380000000000 K
0.7500000000000000 0.2500000000000000 0.9981380000000000 Bi
0.7500000000000000 0.7500000000000000 0.9981380000000000 Bi
1.0000000000000000 1.0000000000000000 0.2473800000000000 Ti
0.5000000000000000 1.0000000000000000 0.2473800000000000 Ti
1.0000000000000000 1.0000000000000000 0.7473800000000000 Ti
0.5000000000000000 1.0000000000000000 0.7473800000000000 Ti
1.0000000000000000 0.5000000000000000 0.2473800000000000 Ti
0.5000000000000000 0.5000000000000000 0.2473800000000000 Ti
1.0000000000000000 0.5000000000000000 0.7473800000000000 Ti
0.5000000000000000 0.5000000000000000 0.7473800000000000 Ti
0.2500000000000000 1.0000000000000000 0.3029370000000000 O
0.2500000000000000 1.0000000000000000 0.8029370000000000 O
0.2500000000000000 0.5000000000000000 0.3029370000000000 O
0.2500000000000000 0.5000000000000000 0.8029370000000000 O
0.7500000000000000 1.0000000000000000 0.3029370000000000 O
0.7500000000000000 1.0000000000000000 0.8029370000000000 O
0.7500000000000000 0.5000000000000000 0.3029370000000000 O
0.7500000000000000 0.5000000000000000 0.8029370000000000 O
1.0000000000000000 0.2500000000000000 0.3029370000000000 O
1.0000000000000000 0.2500000000000000 0.8029370000000000 O
1.0000000000000000 0.7500000000000000 0.3029370000000000 O
1.0000000000000000 0.7500000000000000 0.8029370000000000 O
0.5000000000000000 0.2500000000000000 0.3029370000000000 O
0.5000000000000000 0.2500000000000000 0.8029370000000000 O
0.5000000000000000 0.7500000000000000 0.3029370000000000 O
0.5000000000000000 0.7500000000000000 0.8029370000000000 O
1.0000000000000000 1.0000000000000000 0.0973070000000000 O
0.5000000000000000 1.0000000000000000 0.0973070000000000 O
1.0000000000000000 1.0000000000000000 0.5973070000000000 O
0.5000000000000000 1.0000000000000000 0.5973070000000000 O
1.0000000000000000 0.5000000000000000 0.0973070000000000 O
0.5000000000000000 0.5000000000000000 0.0973070000000000 O
1.0000000000000000 0.5000000000000000 0.5973070000000000 O
0.5000000000000000 0.5000000000000000 0.5973070000000000 O
""", fmt = "poscar")
matcher = StructureMatcher(
    comparator=FrameworkComparator(),
    attempt_supercell=True,
)
# Compare the seed cell against the unmodified MCSQS supercell.
print(matcher.fit(inp_struct, mcsqs_struct))
Hi, thanks for the explanation and the code example; it worked perfectly.
I’m still trying to wrap my head around why I need to use either attempt_supercell=True or manually call .replace_species() on the MCSQS structure for it to match the seed. Without one of those, the match fails, even though the underlying framework is the same.
What’s confusing is that when I first tested FrameworkComparator, I compared a structure from Materials Project with a supercell version I created in VESTA, and the match worked without attempt_supercell=True. So I assumed FrameworkComparator was already handling that kind of comparison internally.
But now, in this SQS case, it only works if I explicitly guide it: either by using .replace_species() to revert the MCSQS structure to the seed composition, even when I thought the species aren’t relevant for FrameworkComparator, or by enabling attempt_supercell. Why does the framework match succeed in one case and not the other?
Is it something about how the atom types or ordering are handled internally?
Appreciate any insight — feels like there’s a subtle piece I’m missing.
Best,
Nadav
Sorry I missed this! The StructureMatcher logic is a bit confusing, but the docstring is helpful in this case:
attempt_supercell (bool): If set to True and number of sites in
cells differ after a primitive cell reduction (divisible by an
integer) attempts to generate a supercell transformation of the
smaller cell which is equivalent to the larger structure.
The key here is that StructureMatcher can’t match sites that aren’t present in a structure. It doesn’t generate all possible images of sites within inp_struct. Thus the 5 sites in inp_struct cannot match to those in mcsqs_struct, which has 40 sites.
When attempt_supercell is True, StructureMatcher scales inp_struct by 8 and then both structures match.
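To make the site-count argument concrete, here is a minimal pure-Python sketch of the idea. It is not pymatgen's matching code: the helper names make_supercell, periodic_close and contains are made up for this illustration, and the 1e-5 tolerance is an arbitrary choice. It expands the 5 seed sites into a 2x2x2 supercell and checks that a few of the SQS positions quoted earlier in the thread appear in it, ignoring species.

```python
from itertools import product

# Fractional coordinates of the 5-site seed cell quoted in the thread.
seed_frac = [
    (0.5, 0.5, 0.996277),  # K
    (0.0, 0.0, 0.494761),  # Ti
    (0.0, 0.5, 0.605874),  # O
    (0.5, 0.0, 0.605874),  # O
    (0.0, 0.0, 0.194615),  # O
]


def make_supercell(frac_coords, scale=(2, 2, 2)):
    """Return fractional coordinates of a diagonal supercell (hypothetical helper)."""
    out = []
    for shift in product(*(range(n) for n in scale)):
        for f in frac_coords:
            out.append(tuple((f[i] + shift[i]) / scale[i] for i in range(3)))
    return out


def periodic_close(a, b, tol=1e-5):
    """Compare two fractional positions modulo whole lattice translations."""
    for x, y in zip(a, b):
        d = abs(x - y) % 1.0
        if min(d, 1.0 - d) > tol:
            return False
    return True


def contains(sites, target, tol=1e-5):
    """True if `target` coincides with any position in `sites`."""
    return any(periodic_close(s, target, tol) for s in sites)


supercell_frac = make_supercell(seed_frac)
print(len(seed_frac), len(supercell_frac))  # 5 40

# A few positions copied from the 40-site SQS cell; species are ignored here.
sqs_samples = [
    (0.25, 0.25, 0.498138),  # K site
    (0.25, 0.25, 0.998138),  # Bi site
    (1.0, 1.0, 0.24738),     # Ti site
    (0.5, 0.5, 0.597307),    # O site
]
found_in_supercell = [contains(supercell_frac, p) for p in sqs_samples]
print(found_in_supercell)  # [True, True, True, True]
```

Only after the 8-fold expansion is there a candidate partner for every SQS site, which is the step attempt_supercell=True performs for you.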
Alternatively, you could disable primitive cell reduction and scale inp_struct yourself by a factor of 8:
StructureMatcher(
    comparator=FrameworkComparator(),
    attempt_supercell=False,
    primitive_cell=False,
).fit(inp_struct * (2, 2, 2), mcsqs_struct)
# True
(The key is disabling primitive cell reduction, or the supercell scaling of inp_struct will be undone.) Hope that helps!
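The following rough sketch shows why primitive cell reduction undoes an ordered supercell but leaves the SQS cell alone. It is plain Python under stated assumptions, not pymatgen's reduction algorithm: it only looks at the eight A-site positions from the SQS cell above, only tests half-cell translations, and the helper names are hypothetical. A cell can be reduced when some pure translation maps every site onto a site of the same species.

```python
from itertools import product

# A-site sublattice of the 40-site SQS cell from the thread (fractional coords).
sqs_a_sites = [
    ((0.25, 0.25, 0.498138), "K"),
    ((0.25, 0.75, 0.498138), "K"),
    ((0.25, 0.25, 0.998138), "Bi"),
    ((0.25, 0.75, 0.998138), "K"),
    ((0.75, 0.25, 0.498138), "Bi"),
    ((0.75, 0.75, 0.498138), "K"),
    ((0.75, 0.25, 0.998138), "Bi"),
    ((0.75, 0.75, 0.998138), "Bi"),
]
# Same positions with a single species, as in inp_struct * (2, 2, 2)
# or in the SQS cell after replace_species.
ordered_a_sites = [(pos, "K") for pos, _ in sqs_a_sites]


def same_position(a, b, tol=1e-5):
    """Compare two fractional positions modulo whole lattice translations."""
    for x, y in zip(a, b):
        d = abs(x - y) % 1.0
        if min(d, 1.0 - d) > tol:
            return False
    return True


def is_symmetry(sites, shift):
    """True if shifting every site by `shift` lands on a site of the same species."""
    for pos, species in sites:
        moved = tuple(p + s for p, s in zip(pos, shift))
        if not any(same_position(moved, q) and species == sp for q, sp in sites):
            return False
    return True


def count_half_translations(sites):
    """Count nonzero half-cell translations that preserve positions and species."""
    shifts = [s for s in product((0.0, 0.5), repeat=3) if any(s)]
    return sum(is_symmetry(sites, s) for s in shifts)


n_ordered = count_half_translations(ordered_a_sites)
n_sqs = count_half_translations(sqs_a_sites)
print(n_ordered, n_sqs)  # 7 0
```

With one species on every A site, all seven half-cell translations are symmetries, so the cell can shrink back to the 5-site seed, which is presumably why the VESTA supercell matched without attempt_supercell. With the K/Bi decoration none of them survive, so the SQS cell keeps its 40 sites after reduction.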