Hello everyone,
I have two molecules in two separate xyz files: benzene.xyz and methane.xyz .
I would like to create a methylbenzene (C7H8) molecule out of these two using pymatgen.
First I defined a function which uses pybel to read a pymatgen object and return a SMILES string out of it.
from IPython.core.display import SVG, display_svg
from pymatgen.io.babel import BabelMolAdaptor
import pybel as pb
def quick_print_pmg( pmg_obj ):
bma_obj = BabelMolAdaptor(pmg_obj)
pbm_obj = pb.Molecule(bma_obj.openbabel_mol)
smi_obj = pb.readstring('smi', pbm_obj.write("can"))
print("Canonical SMILES = {}".format(smi_obj.write("can")))
#lines below are to show the result
#smi_obj.make3D()
#svg_obj = SVG(smi_obj.write("svg"))
#display_svg(svg_obj)
Now, we can open the two text files and read the molecules
from pymatgen import Molecule
benz_pmg = Molecule.from_file("benzene.xyz")
meth_pmg = Molecule.from_file("methane.xyz")
Two hydrogen atoms have to be removed (and thrown away). I have tried to simply removing the two hydrogen atoms and make a for loop to append()
the atoms from methane to benzene, but this gives no guarantee that the result is correct, plus I have to make sure atoms do not overlap.
benz_pmg.pop(-1)
meth_pmg.pop(-1)
for atom in meth_pmg:
benz_pmg.append(atom.specie, atom.coords + [1,1,1])
quick_print_pmg( benz_pmg )
Canonical SMILES = [CH]=[C][CH]C=[CH].[CH2][CH2]
Another option that I have tried is the substitute()
method of the Molecule
class from pymatgen, which seems like exactly what I am looking for, but I cannot get it working straight.
# read coordinates from files again
benz_pmg = Molecule.from_file("benzene.xyz")
meth_pmg = Molecule.from_file("methane.xyz")
benz_pmg.substitute(0, meth_pmg)
quick_print_pmg( benz_pmg )
Canonical SMILES = [CH]/C=C\[CH].C.[H].[H]
The canonical smiles I am looking for is rather C1=CC=CC=C1C
. Could anyone explain me how to use this properly?
Thank you.
Marco