Reconstructing a lammpsdata file using topo tools

Hello lammps users,

I am trying to use topo tools to create a lammps data file for my simulation. To start with, I checked whether I could reproduce a lammpsdata file (say data.foo) that I had obtained previously using amb2lmp.

I proceeded as follows.

Start vmd

  1. Load data.foo into vmd (topo readlammpsdata data.foo)
  2. Create a psf file from data.foo using the command animate write psf data.psf
  3. Delete the molecule data.foo that was loaded in vmd and make sure no other molecule is loaded.
  4. Load data.psf, then load the pdb file of the molecule into it. (Loading the psf first prevents vmd from defining its own bonds based on distance; otherwise the pdb file alone would be enough.) Before loading the pdb file, I replaced the atom-name column in the pdb file with the atom types of the amber force field. This is because topo retypebonds (the command I use later) by default uses atom names to create atom types; i.e., if there are n atoms and one does not assign atom types, topo retypebonds creates n-1 bond types.
  5. Issue the following commands sequentially in the tkconsole of vmd:
    topo retypebonds, topo guessangles, topo guessdihedrals, topo writelammpsdata data.foo_topo

The results are:

data.foo (from amb2lmp)

75 atoms
74 bonds
138 angles
228 dihedrals
0 impropers

6 atom types
6 bond types
12 angle types
10 dihedral types

data.foo_topo (by using topo tools)

75 atoms
74 bonds
138 angles
174 dihedrals
0 impropers

6 atom types
6 bond types
12 angle types
15 dihedral types

Remarks:
I tried the same procedure for a small molecule (ethane) and was able to reproduce the data file using topo tools.
I am not concerned about the force field parameters, which I plan to take from gaff.lt (moltemplate) or gaff.dat (ambertools).
For larger molecules, only the number of dihedrals and the number of dihedral types fail to match.

Questions:

  1. Am I making a mistake in the above procedure for creating a data file using topo tools?
  2. Why is the data inconsistent only in the number of dihedrals and dihedral types?
  3. I have seen a similar post discussing this issue:
    http://sourceforge.net/p/lammps/mailman/message/26952647/
    Dr. Axel, you mentioned a template approach in this post. Could you please elaborate on it if you do not find fault in my approach listed above?

Questions:

1. Am I making a mistake in the above procedure for creating a data file using
topo tools?

this is not the right question to ask. what you need to ask yourself
are two other questions:
1. did topotools do what it promises to do (and i don't see any
indication it didn't)?
2. why do those two approaches result in different topology data?

2. Why is the data inconsistent only in the number of dihedrals and
dihedral types?

please note that topotools promises to find all "topological
dihedrals", i.e. all bonds that are connected to a central bond. there
is a reason the command is called "topo *guess*dihedrals". some
force fields require dihedrals to be defined twice with different
parameters to mimic a different style of dihedral potential (LAMMPS
can handle both cases, i.e. it has custom dihedral styles and supports
using more than one functional form in the same simulation, as well as
having a dihedral defined multiple times). some force fields also
define improper dihedrals as regular dihedrals. some force fields
support "wildcards" for dihedrals, i.e. multiple dihedrals that
topotools detects as different dihedral types are recognized by
the specific force field as the same dihedral type.
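
to make the notion of a "topological dihedral" concrete, here is a minimal python sketch of the idea described above. it is not the topotools implementation; the function name guess_dihedrals and the atom numbering are assumptions made only for illustration. for every bond j-k it pairs each other neighbor of j with each other neighbor of k.

```python
from collections import defaultdict


def guess_dihedrals(bonds):
    """Enumerate every chain i-j-k-l built around a central bond j-k.

    Illustrative sketch only, not the TopoTools code.
    `bonds` is a list of (atom_a, atom_b) pairs, each bond listed once.
    """
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    dihedrals = []
    for j, k in bonds:
        # i is any neighbor of j other than k;
        # l is any neighbor of k other than j (and not i, for 3-rings).
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j, i}):
                dihedrals.append((i, j, k, l))
    return dihedrals


# Ethane (hypothetical numbering): atoms 1 and 2 are carbons,
# 3-5 are hydrogens on C1, 6-8 are hydrogens on C2.
ethane_bonds = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8)]
ethane_dihedrals = guess_dihedrals(ethane_bonds)
print(len(ethane_dihedrals))
```

a purely topological count like this lists each i-j-k-l chain exactly once, so it cannot know about repeated definitions, impropers stored as dihedrals, or wildcard types that a particular force field may use.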

you will have to figure out yourself what is going on by comparing the
dihedral definitions in the two data files and reading up on the force
field definition that you want to use.
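
one possible way to do that comparison is sketched below in python. it assumes the standard "Dihedrals" section layout of a LAMMPS data file (ID, type, then four atom IDs per line); the helper names read_dihedrals and compare_dihedrals and the tiny sample data are made up for illustration and are not taken from the files discussed here.

```python
from collections import Counter


def read_dihedrals(text):
    """Return (type, atom quadruple) for each line of the Dihedrals section.

    Each quadruple is stored in a canonical direction so that
    1-2-3-4 and 4-3-2-1 compare as the same dihedral.
    """
    entries = []
    in_section = False
    for raw in text.splitlines():
        line = raw.split("#")[0].strip()
        if not line:
            continue
        if line[0].isalpha():
            # Section headers ("Atoms", "Bonds", "Dihedrals", ...) start
            # with a letter; data lines start with a number.
            in_section = (line == "Dihedrals")
            continue
        if in_section:
            fields = line.split()
            dtype = int(fields[1])
            atoms = tuple(int(x) for x in fields[2:6])
            entries.append((dtype, min(atoms, atoms[::-1])))
    return entries


def compare_dihedrals(text_a, text_b):
    """Report repeated and unmatched atom quadruples between two files."""
    a = Counter(atoms for _, atoms in read_dihedrals(text_a))
    b = Counter(atoms for _, atoms in read_dihedrals(text_b))
    return {
        "repeated_in_a": {q: n for q, n in a.items() if n > 1},
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
    }


# Made-up sample sections, only to show the output format.
sample_a = """Dihedrals

1 1 1 2 3 4
2 2 1 2 3 4
3 3 5 2 3 4
"""
sample_b = """Dihedrals

1 1 4 3 2 1
2 2 6 2 3 4
"""
report = compare_dihedrals(sample_a, sample_b)
print(report)
```

with the real files, one would pass the contents of data.foo and data.foo_topo (e.g. read with open(...).read()) instead of the sample strings. quadruples listed under "repeated_in_a" would point to dihedrals defined more than once, and quadruples present in only one file would be candidates for impropers written as regular dihedrals.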

topotools very deliberately does not aim to provide a complete
topology creation solution because there are so many subtle
differences between different force fields, and often there is no
unambiguous way to implement them. what topotools provides are
primitive operations to manipulate and build topologies that simplify
the process significantly. for simple cases like the tutorial examples
on my homepage, it works quite well, but even there, you see that
custom scripting is needed to properly assign the partial charges for
OPLS/AA hydrocarbons.

3. I have seen a similar post discussing this issue:
http://sourceforge.net/p/lammps/mailman/message/26952647/
Dr. Axel, you mentioned a template approach in this post. Could you
please elaborate on it if you do not find fault in my approach listed
above?

i was describing in broad strokes how a program like psfgen works.

your fault is not in your procedure but in expecting topotools to
magically do what it cannot know about. check out my suggestions above
and hopefully you'll get what you need (or at least closer to it).

axel.