Generating compatible interfaces of known orientation relationships

I’ve been looking through the pymatgen interface generation docs, and I can’t quite figure out a routine that’ll do what I want. Specifically, I want to come up with in-plane supercells for two different structures that will produce a cell whose periodicity is compatible with an interface of a known orientation relationship. I.e., I know the full alignment of the lattices and the habit plane (e.g. Kurdjumov-Sachs, the first example here), and I just want to know what in-plane duplication of the two phases’ surface cells will allow me to match them with minimal strain in the combined supercell. Note that I’m targeting interatomic potential simulations, so ideally I’d like something that could produce a large cell without taking forever (which a naive enumeration and matching could have trouble with).

Is that possible in pymatgen? I see more general things, where it can check different orientations, etc, or just do the strain, but I can’t figure out if any of those will actually do what I need.

It’s not in pymatgen yet, but I’ve had an interface builder written for a while now that I never got around to publishing. Sometime this summer I’m gonna get it into pymatgen along with a clean-up of the substrate analyzer, which uses the same ZSL algorithm for lattice matching.

That being said, pymatgen structure can’t handle massive structures. Has anyone ever played with 50K atoms in a single pymatgen structure for instance?

Size aside, do I just need to call ZSLGenerator? I’m happy to do my own transformations and construct the cells.

@Chi_Chen has worked with some very large structures in pymatgen I believe, this is partly what motivated the optimization of the get_neighbor routines in Cython.

ZSLGenerator identifies the matching lattices given the orientation relationship. There’s a bit of extra work to construct the appropriate slabs, reorient them so they line up properly, and then actually construct the interface with the necessary degrees of freedom for manipulation.

RE Shyamd’s comment:

I think the performance issue comes from all the Python objects that pymatgen constructs, for example a PeriodicSite for each atom in a very big cell. If we can find ways to bypass those objects, the performance is reasonable. Constructing the neighbor list of a structure is one example, where I work only on the low-level arrays. The following code computes the 5.0 Angstrom neighbor list of a 76k-atom cell, and it takes about 1.4 seconds on my laptop.

from pymatgen.ext.matproj import MPRester

mpr = MPRester()  # assumes a Materials Project API key is already configured
s = mpr.get_structure_by_material_id('mp-1314')
s_copy = s * [5, 10, 10]  # supercell with about 76k atoms
_ = s_copy.get_neighbor_list(r=5.0)  # neighbor list with a 5.0 Angstrom cutoff; took 1.38 seconds on my laptop

I will be very interested in performance optimizations for pymatgen for other applications. Please just let me know.
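As a rough illustration of the "work only on the low-level arrays" idea mentioned above, here is a minimal NumPy-only sketch that builds the Cartesian coordinates of a large supercell without creating any per-atom objects. The function name `tile_supercell` and the toy fcc-like cell are made up for illustration; this is not pymatgen code, and it does not track species or other site properties.

```python
import numpy as np


def tile_supercell(lattice, frac_coords, reps):
    """Return (supercell lattice, Cartesian coords) using plain arrays only.

    lattice:     3x3 array, rows are the cell vectors a, b, c
    frac_coords: (n, 3) fractional coordinates of the basis atoms
    reps:        three integers, number of repeats along a, b, c
    """
    lattice = np.asarray(lattice, dtype=float)
    frac_coords = np.asarray(frac_coords, dtype=float)
    reps = np.asarray(reps, dtype=int)

    # All integer translations (i, j, k) with 0 <= i < reps[0], and so on.
    grids = np.meshgrid(*[np.arange(n) for n in reps], indexing="ij")
    shifts = np.stack(grids, axis=-1).reshape(-1, 3)

    # Broadcast every translation onto every basis atom in one shot.
    frac = (shifts[:, None, :] + frac_coords[None, :, :]).reshape(-1, 3)

    # Fractional coordinates are in units of the ORIGINAL cell vectors.
    cart = frac @ lattice
    return lattice * reps[:, None], cart


# Hypothetical 4-atom cubic cell with a 4.0 Angstrom edge.
lattice = 4.0 * np.eye(3)
basis = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]
super_lattice, cart = tile_supercell(lattice, basis, (5, 10, 10))
print(cart.shape)
```

The point is only that the cost here is a handful of array operations rather than one Python object per atom; whether a given pymatgen routine can be fed such arrays directly depends on the routine.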

I’m trying to use ZSLGenerator, and it seemed like it was going OK, but now I’m getting confused. I expected that (for any match returned by zslgen())

match['substrate_transformation'] @ match['sub_vecs'] - match['sub_sl_vecs']

should be entirely 0s. I.e. that the transformation times the sub_vecs would equal sub_sl_vecs. This is violated by one of the matches that are returned from my code.

Is this my misunderstanding of what those quantities mean, or should I try to produce some code that reproduces the behavior and open a github issue?
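To make the check above concrete, here is a small numerical sketch with made-up vectors (not real ZSLGenerator output). It assumes the supercell vectors may be stored in a reduced basis, meaning a shorter vector pair that spans the same 2D lattice. `reduce_2d` is a hypothetical helper written for this illustration, not pymatgen's code, and its convention may differ from the one pymatgen uses. Under that assumption a nonzero residual does not by itself mean the match is wrong.

```python
import numpy as np


def reduce_2d(a, b):
    """Shorten a pair of in-plane lattice vectors without changing the lattice.

    Illustrative convention: non-negative dot product, |a| <= |b|, and b cannot
    be shortened by adding or subtracting a.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    while True:
        if np.dot(a, b) < 0:
            b = -b
        elif np.linalg.norm(a) > np.linalg.norm(b):
            a, b = b, a
        elif np.linalg.norm(b) > np.linalg.norm(b + a):
            b = b + a
        elif np.linalg.norm(b) > np.linalg.norm(b - a):
            b = b - a
        else:
            return np.array([a, b])


def cell_area(vecs):
    """Area of the 2D cell spanned by two 3D vectors."""
    return np.linalg.norm(np.cross(vecs[0], vecs[1]))


# Hypothetical rectangular surface cell and integer supercell transformation.
sub_vecs = np.array([[3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
transformation = np.array([[2, 1], [0, 3]])

naive_sl_vecs = transformation @ sub_vecs     # what the check expects
reduced_sl_vecs = reduce_2d(*naive_sl_vecs)   # same lattice, different basis

residual = naive_sl_vecs - reduced_sl_vecs
print(residual)                               # not all zeros
print(cell_area(naive_sl_vecs), cell_area(reduced_sl_vecs))  # equal areas
```

In this toy case the two vector pairs differ, but they enclose the same area and generate the same set of lattice points, so a quantity like the supercell area is a more robust sanity check than an element-wise comparison.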

Hi Noam,

The vectors get “reduced” in the ZSL algorithm so that they follow a convention that enables proper matching. This is effectively a 3d rotation. The easiest way to do this is to compute the transformation matrices directly. If you want to split the strain evenly, you’ll need to separate the rotation from the strain and distribute it appropriately.
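For the last point, one way to separate the rotation from the strain is a polar decomposition of the in-plane deformation that maps one supercell onto the other. The sketch below is only an illustration under stated assumptions: both supercells are already expressed as 2x2 arrays of in-plane Cartesian vectors with the same handedness, and "splitting evenly" is taken to mean giving each side the square root of the stretch. The function name `split_strain` is made up here, and other definitions of an even split are possible.

```python
import numpy as np


def split_strain(film_2d, sub_2d):
    """Polar-decompose the film -> substrate map and build a midpoint cell.

    film_2d, sub_2d: 2x2 arrays whose rows are in-plane supercell vectors.
    Returns (rotation, principal stretches, common cell vectors as rows).
    """
    film_2d = np.asarray(film_2d, dtype=float)
    sub_2d = np.asarray(sub_2d, dtype=float)

    # Deformation D with D @ f_i = s_i for each pair of row vectors f_i, s_i.
    d = np.linalg.solve(film_2d, sub_2d).T
    if np.linalg.det(d) <= 0:
        raise ValueError("supercells have opposite handedness")

    # Polar decomposition D = R @ U through the SVD D = W @ diag(sigma) @ Vt.
    w, sigma, vt = np.linalg.svd(d)
    rotation = w @ vt                                   # proper rotation R
    half_stretch = vt.T @ np.diag(np.sqrt(sigma)) @ vt  # U ** 0.5

    # Film gets the full rotation but only half of the stretch.
    common = film_2d @ (rotation @ half_stretch).T
    return rotation, sigma, common


# Toy data: a square film cell, and a substrate cell that is the film cell
# stretched by +2.5 % / -1.25 % and then rotated by 2 degrees.
theta = np.deg2rad(2.0)
r0 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta), np.cos(theta)]])
film_2d = 4.0 * np.eye(2)
sub_2d = film_2d @ (r0 @ np.diag([1.025, 0.9875])).T

rotation, stretches, common = split_strain(film_2d, sub_2d)
print(np.rad2deg(np.arctan2(rotation[1, 0], rotation[0, 0])))  # rotation angle
print(stretches)                                               # principal stretches
```

With this choice the film is stretched by the square roots of the principal stretches and the substrate by their inverses, so both sides carry strain of similar magnitude and opposite sign. For a real interface you may prefer to weight the split by the elastic stiffness of each phase instead.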

OK - I was able to get it to work by backing out the transformation matrix between the returned orig cell and supercell vectors. It might be nice to have better documentation that explains the mathematical relationships between the various components of match.
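For anyone wanting to do the same, here is a minimal sketch of backing out an effective integer transformation from a pair of original and supercell vectors. It uses made-up arrays in place of the real match entries, and the helper name `back_out_transformation` is hypothetical. It assumes the supercell vectors are exact integer combinations of the original ones, and it raises an error if that does not hold.

```python
import numpy as np


def back_out_transformation(vecs, sl_vecs, tol=1e-6):
    """Find the integer matrix T such that T @ vecs == sl_vecs.

    vecs, sl_vecs: 2x3 arrays whose rows are in-plane vectors in 3D.
    """
    vecs = np.asarray(vecs, dtype=float)
    sl_vecs = np.asarray(sl_vecs, dtype=float)

    # T @ vecs = sl_vecs  <=>  vecs.T @ T.T = sl_vecs.T, solved by least squares.
    t_transposed, *_ = np.linalg.lstsq(vecs.T, sl_vecs.T, rcond=None)
    t_int = np.rint(t_transposed.T).astype(int)

    # Reject the result if rounding to integers does not reproduce the supercell.
    if not np.allclose(t_int @ vecs, sl_vecs, atol=tol):
        raise ValueError("supercell vectors are not integer combinations of vecs")
    return t_int


# Hypothetical stand-ins for the original and supercell vectors of one match.
vecs = np.array([[3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
sl_vecs = np.array([[6.0, 4.0, 0.0], [6.0, -8.0, 0.0]])

t_eff = back_out_transformation(vecs, sl_vecs)
print(t_eff)
print(abs(round(np.linalg.det(t_eff))))  # number of original cells per supercell
```

The absolute value of the determinant gives the area ratio between the supercell and the original cell, which is a quick way to confirm the recovered matrix is consistent with the reported match area.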