anybody interested in support for reading CHARMM/Amber input directly into LAMMPS?

hi,

i am currently working on writing a package that
interfaces the molfile plugins from VMD to LAMMPS.
while the primary goal is to have more flexibility in
output file formats (.dcd already works as of last night)
and support a "rerun" option that will allow using
LAMMPS to analyze existing trajectories, there also
is the opportunity to go one step further and implement
something akin to the "read_data" command that would
allow reading CHARMM or Amber topology (or others)
and parameter files together with a .pdb or .crd file
directly instead of having to use the converter tools
that seem to be difficult to keep up-to-date and have
lost their maintainers.

but since that is a significant time commitment,
i'd first like to get some feedback from folks here
on the mailing list, how likely they would be to
use that kind of interface.

thanks in advance,
     axel.

I don't personally need converters for these formats, but I would be
thrilled if you are interested in writing a parser for them. If so, I
have a request (see below).

       (first a speech to keep your morale up!)
   I don't use AMBER/CHARMM any more. My loathing for them is one of
the reasons I work with LAMMPS. But the lack of converter tools is a
big problem that surprisingly few people seem to post questions about.
In fact I think the problem is so chronic that few computational
biology people even use LAMMPS, so I worry that you're probably not
going to get many responses. (My experience is that the people in
that community are very much "black box" users. They often have no
idea what the simulation program does, and they refer to force fields
"by name".) Don't let this discourage you. What you are doing is the
right thing, even if nobody is interested. Computational molecular
biology is still in the future, and LAMMPS is the best hope for that
field in my opinion.

   For the last year I've been working on a LAMMPS preprocessor. It
works and it is reasonably mature by now, but I haven't announced it
yet. (It's available at moltemplate.org.) Work on it is active and
ongoing.

   Although this is off the topic of your post, if you are writing new
software to generate LAMMPS input files, I have a request: Can you
include an option to output in moltemplate format?

This should be easy because the two formats are almost the same:

----------- LAMMPS DATA format: -----------

Atoms

  1038 7 1 0.000e+00 1.25175e+01 7.97312e+00 7.66818e+01
  1039 7 2 0.000e+00 8.12204e+00 6.32843e+00 9.18037e+00
    :

Masses

   1 14.0
   2 50.0
   :

Bonds

  872 1 1038 1039
   :

Bond Coeffs

  1 30.0 3.2

Pair Coeffs

  1 0.2 2.0
  2 1.50 5.0

----------- MOLTEMPLATE format: -----------

write("Data Atoms") {
  $atom:1038 $mol:7 @atom:CA 0.000e+00 1.25175e+01 7.97312e+00 7.66818e+01
  $atom:1039 $mol:7 @atom:R 0.000e+00 8.12204e+00 6.32843e+00 9.18037e+00
}
write_once("Data Masses") {
   @atom:CA 14.0
   @atom:R 50.0
}
write("Data Bonds") {
  $bond:872 @bond:1 $atom:1038 $atom:1039
}
write_once("Bond Coeffs") {
  @bond:1 30.0 3.2
}
write_once("Pair Coeffs") {
  @atom:CA 0.2 2.0
  @atom:R 1.50 5.0
}

That's the basic idea.

I actually began editing the topotools code to print out the data in
this format, but I was too lazy to figure out where the atom type is
stored. Instead I wrote a script that can convert DATA files to
moltemplate format, which works well. Unfortunately the atom type
names are lost in the process. I am hoping your scripts can extract
this name from the original source files. That's all I need. (Right
now, my conversion script calls them "@atom:1", "@atom:2", etc...
which is not very informative.)

There are many more details we could add to a moltemplate file, but
the atom type names are enough for now.
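
For illustration, here is a minimal Python sketch of the kind of conversion described above for the "Atoms" section. It is not the actual conversion script; the function name atoms_to_moltemplate and the type_names mapping are hypothetical. It assumes each line holds: atom ID, molecule ID, atom type, charge, x, y, z.

```python
def atoms_to_moltemplate(lines, type_names=None):
    """Rewrite LAMMPS 'Atoms' lines as a moltemplate write("Data Atoms") block.

    type_names is an optional (hypothetical) mapping from numeric type
    strings to readable names; without it the numeric type is kept,
    which is exactly the "@atom:1", "@atom:2" situation.
    """
    type_names = type_names or {}
    out = ['write("Data Atoms") {']
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or elided lines such as ":"
        atom_id, mol_id, atom_type = fields[0], fields[1], fields[2]
        name = type_names.get(atom_type, atom_type)
        rest = " ".join(fields[3:])  # charge and coordinates, unchanged
        out.append(f"  $atom:{atom_id} $mol:{mol_id} @atom:{name} {rest}")
    out.append("}")
    return "\n".join(out)


example = [
    "1038 7 1 0.000e+00 1.25175e+01 7.97312e+00 7.66818e+01",
    "1039 7 2 0.000e+00 8.12204e+00 6.32843e+00 9.18037e+00",
    "  :",
]
print(atoms_to_moltemplate(example, {"1": "CA", "2": "R"}))
```

The name mapping is the piece a DATA file cannot supply, which is why it would have to come from the original topology files.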

I am grateful you are doing this and I don't want to work at
cross-purposes. Please let me know if you are going to write the
conversion utilities and if there is anything I can do.

Cheers.

Andrew

---- Other moltemplate file format details we can worry about later ----

Don't worry about this yet.

You can define molecule or residue types and split the system up into
residues and chains.

Alanine {
  write("Data Angles") {...}
  write_once("Data Masses") {...}
  write("Data Bonds") {...}
  write("Bond Coeffs") {...}
  :
}

You can assign names to atom IDs (although they must be unique). You
can also give names to bond types and IDs, angle types and IDs,
etc.

Angles, dihedrals, and impropers have a similar format:
write_once("Data Angle Coeffs") {
  @angle:1 30.000 114
  @angle:2 30.000 123
}
write_once("Data Dihedral Coeffs") {
  @dihedral:1 -0.5 1 -180 0.0
  @dihedral:2 -1.5 1 -180 0.0
}
write("Data Angles") {
  $angle:436 @angle:1 $atom:1038 $atom:1040 $atom:1042
    :
}
write("Data Dihedrals") {
  $dihedral:318 @dihedral:1 $atom:1039 $atom:1038 $atom:1040 $atom:1041
    :
}
Alternatively, you can generate 3- and 4-body interactions this way:
write_once("Data Angles By Type") {
  @angle:1 @atom:CA @atom:CA @atom:CA @bond:* @bond:*
  @angle:2 @atom:CA @atom:CA @atom:*R @bond:* @bond:*
}
write_once("Data Dihedrals By Type") {
  @dihedral:1 @atom:CA @atom:CA @atom:CA @atom:CA @bond:* @bond:* @bond:*
  @dihedral:2 @atom:*R @atom:CA @atom:CA @atom:*R @bond:* @bond:* @bond:*
}

But, again, we don't have to worry about any of this stuff for now.
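
To make the "By Type" idea concrete, here is a rough Python sketch of how angles could be generated from a bond list and then typed by matching atom type names against wildcard patterns. It is only an illustration under stated assumptions, not how moltemplate itself works: the function names find_angles and angle_type are made up, the @bond:* fields are ignored, and shell-style matching via fnmatch is assumed for the "*".

```python
from fnmatch import fnmatchcase
from itertools import combinations


def find_angles(bonds):
    """Return every (i, j, k) where i-j and j-k are bonded; j is the center."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    angles = []
    for j in sorted(neighbors):
        for i, k in combinations(sorted(neighbors[j]), 2):
            angles.append((i, j, k))
    return angles


def angle_type(atom_types, angle, rules):
    """Return the first rule whose patterns match the angle in either direction."""
    names = [atom_types[a] for a in angle]
    for type_name, patterns in rules:
        for candidate in (names, names[::-1]):
            if all(fnmatchcase(n, p) for n, p in zip(candidate, patterns)):
                return type_name
    return None  # no rule applies


# Toy system: atom 2 is bonded to atoms 1, 3 and 4.
atom_types = {1: "CA", 2: "CA", 3: "CA", 4: "R"}
bonds = [(1, 2), (2, 3), (2, 4)]
rules = [
    ("angle:1", ("CA", "CA", "CA")),
    ("angle:2", ("CA", "CA", "*R")),
]
for angle in find_angles(bonds):
    print(angle, angle_type(atom_types, angle, rules))
```

Dihedrals would follow the same pattern with chains of three bonds and four type patterns per rule.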

Oh dear. I need to edit the emails I send before I cc them. Hi.

i am not writing software to "convert" formats to lammps
input. i am thinking about writing software that reads things
*natively*, though, using VMD molfile plugins.
http://www.ks.uiuc.edu/Research/vmd/plugins/molfile/

so rather than writing something that generates input
and data files, i am trying to *bypass* them.

the reasoning is that, for as long as i can maintain
the molfile plugin interface, i can maintain compatibility
with *all* formats that are supported by the plugin library.
only a few of them have the capability to provide the
information to support starting a simulation, but that
can/will grow over time; as will the capability of the
plugin interface and the library.

to give an example. in order to run a charmm force
field calculation, one needs initial coordinates,
box dimensions and shape, a psf file and
the parameter file. with a molfile plugin interface,
i only need to add a parser for the parameter file
(which is simple enough, i already wrote one in Tcl,
after i determined that charmm2lammps.pl has
been hacker proofed by the original author) and
a little bit of code that associates the parameters
with the topology and then generates the corresponding
LAMMPS information right away using code similar
to what read_data uses. so there would be a
new command called:

read_charmm <geofile> <gftype> <topofile> <tftype> <parmfile> [box <box parameters>]

similar things could be done for other systems.
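
as a rough illustration of the two missing pieces (parsing the parameter file and associating parameters with the topology), here is a small Python sketch for bonds only. it is not the Tcl parser mentioned above and not LAMMPS code. the function names and the sample types and numbers are made up, and the parser assumes a simplified layout: a BONDS keyword line, entries of the form "type1 type2 Kb b0", and "!" starting a comment.

```python
def parse_charmm_bonds(text):
    """Collect {(type1, type2): (Kb, b0)} from the BONDS section.

    Simplified: any non-comment line that is not a bond entry is
    treated as a section keyword, and only BONDS is read.
    """
    bonds = {}
    in_bonds = False
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if in_bonds and len(fields) >= 4:
            try:
                kb, b0 = float(fields[2]), float(fields[3])
            except ValueError:
                pass  # not a bond entry, fall through to keyword check
            else:
                bonds[tuple(sorted(fields[:2]))] = (kb, b0)
                continue
        in_bonds = fields[0].upper().startswith("BOND")
    return bonds


def assign_bond_types(bond_list, atom_types, params):
    """Give each bonded pair a 1-based bond type and collect its coefficients."""
    type_ids, coeffs, typed = {}, {}, []
    for i, j in bond_list:
        key = tuple(sorted((atom_types[i], atom_types[j])))
        if key not in params:
            raise KeyError(f"no bond parameters for {key}")
        if key not in type_ids:
            type_ids[key] = len(type_ids) + 1
            coeffs[type_ids[key]] = params[key]
        typed.append((type_ids[key], i, j))
    return typed, coeffs


# made-up parameter text and topology, for illustration only
PARAMS = """\
BONDS
! type1 type2  Kb     b0
CA   CB   300.0  1.5   ! hypothetical values
CB   CB   200.0  1.6

ANGLES
CA   CB   CA   50.0  110.0
"""
params = parse_charmm_bonds(PARAMS)
typed, coeffs = assign_bond_types(
    [(1, 2), (2, 3), (3, 4)], {1: "CA", 2: "CB", 3: "CB", 4: "CA"}, params
)
print(typed)
print(coeffs)
```

the typed bond list and the coefficient table correspond to what the "Bonds" and "Bond Coeffs" sections of a data file would otherwise carry. angles, dihedrals and nonbonded terms would need the same treatment.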

of course, this could be expanded to support other
simulation packages, for as long as suitably capable
plugins exist and the missing pieces (reading the
parameters and assigning them) can be programmed
with an acceptable effort.

so to support some other file format, the main task
would be to write a molfile plugin for that.

however, the data file format is an example of a
"difficult" format and anything that follows its style
will suffer from the same (IMO serious) flaws. the
main reason that the topotools plugin exists is
that it was impossible to write a molfile plugin
without passing additional, *essential* information
to the plugin (e.g. through environment variables).

hope that explains it a little better.

axel.

hi axel,

adding a direct Amber/CHARMM read_data command would be very useful for me, and i think also for many others here, as there are a lot of questions regarding Amber/CHARMM in the mailing list archive. i work a lot with amber and lammps, using them for interfacial simulations. setting up a system with the possibly obsolete converters is tricky, and it takes a lot of time to avoid the many pitfalls that occur when converting and when choosing the right pair_style, special_bonds command and so on.

so i would really appreciate having a direct read_data command for Amber and CHARMM topology and coordinate files.

greetings
robert