problems with restart2data and lj/charmm/coul/charmm/inter

Andrea_Benassi · March 28, 2013, 8:36am

Dear all,
I am currently using Andrew Jewett's pair style
lj/charmm/coul/charmm/inter and I noticed that, when I try to convert
a restart file to extract the velocities and positions, the
restart2data tool gives the error:

ERROR: Unknown pair style lj/charmm/coul/charmm/inter

I am not interested in getting the pair coefficients, so at first I
tried the -nc flag to exclude the pair coefficient extraction, but
the error still occurs.
Andrew, do you have an ad-hoc version of restart2data? Or can we make
a workaround in restart2data to extract only positions and velocities?
Thanks.
Andrea

akohlmey · March 28, 2013, 8:45am

yes. the reason is the way the (binary) file is parsed. if you look
into the source of restart2data you'll see lots of string comparisons.
most of the time adding a test for the missing string as an "or"
clause of a similar style in the suitable locations does the trick.

axel.

Andrew_Jewett · March 28, 2013, 10:51pm

Interesting. I tested "lj/charmm/coul/charmm/inter" with
read_restart, (and I think that works), but I did not think to modify
or test the "restart2data" tool. (It's not a tool I use. I use
read_dump or my own "dump2data.py" script, which I can share. See
below.)

I could modify the code for restart2data.cpp to make this work. But
I wonder if there are other pair styles (or bonded styles) which have
the same issue?

--- In general: ---

How is restart2data kept up to date with all of the new pair, bond,
angle, etc. styles which users are submitting to LAMMPS? Is this done
by hand? Is it safe for people who use non-standard (USER) packages
to use restart2data? There are several alternatives to
"restart2data"/"read_data" now. (Such as read_dump.) Does pizza.py
still include "dump2data"?

It's too bad that restart2data.cpp does not simply invoke the
pair_style's own internal read_restart() function. (Unfortunately,
"PairLJCharmmCoulCharmmInter::read_restart()" is sort of complicated,
but I can clean that up and copy that code into restart2data.cpp)

Let me know what I should do. If it makes any difference, I am
happy to submit "lj/charmm/coul/charmm/inter" for formal inclusion
with LAMMPS (USER-MISC) and to make modifications to restart2data if
necessary.

Thanks for posting the issue (Andrea), and for the reply (Axel).

Andrew_Jewett · March 29, 2013, 10:36pm

The nice thing about LAMMPS' DATA/DUMP files is that the file format
is relatively simple. (See links below. I wish data files included
metadata to indicate the column format, but that's for another post.)

To get you past your current obstacle, if you have created a data file
for this system earlier, you can copy the coordinates from the last
frame of a DUMP file, and paste them into the "Atoms" section of the
existing "DATA" file.

To make this process easier, it helps if, before you run the
simulation, you carefully format your dump file to ensure that the
column format matches the format used in the "Atoms" section of your
"DATA" file. For example, for atom_style "full", create your dump
files using "id mol type q x y z". For example:

dump 1 all custom 500 traj.lammpstrj id mol type q x y z ix iy iz

If the dump file has a different format, you may have to extract the
coordinates from the dump file, and the other atom data from the data
file, and combine the files using a spreadsheet. (If you are familiar
with awk, then save the two text fragments as temporary files,
"crd.txt" and "atoms.txt", paste them together into a single file using
"paste -d' ' crd.txt atoms.txt", and run the result through awk to
select the columns you want from either file. I'm sure there are other
ways.)
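
The same merge can also be done with a short script. The following is only a minimal sketch, not the attached "dump2data.py": the function names last_frame and update_atoms are made up for illustration, and it assumes a text dump written with unscaled "x y z" columns and a data file whose "Atoms" lines use atom_style "full" (id mol type q x y z).

```python
def last_frame(dump_lines):
    """Return (column_names, rows) for the last frame of a text dump file."""
    start = None
    for i, line in enumerate(dump_lines):
        if line.startswith("ITEM: ATOMS"):
            start = i  # keep overwriting, so we end on the last frame
    if start is None:
        raise ValueError("no 'ITEM: ATOMS' section found")
    columns = dump_lines[start].split()[2:]  # names after "ITEM: ATOMS"
    rows = []
    for line in dump_lines[start + 1:]:
        if line.startswith("ITEM:"):
            break
        if line.strip():
            rows.append(line.split())
    return columns, rows


def update_atoms(atom_lines, columns, rows, first_coord=4):
    """Replace coordinates in data-file "Atoms" lines with dump values.

    first_coord=4 assumes atom_style full (id mol type q x y z).
    Old image flags are dropped; new ones are appended only if the
    dump contains ix iy iz. Atoms are matched by id, because dump
    files are not necessarily sorted.
    """
    names = ["x", "y", "z"]
    if all(n in columns for n in ("ix", "iy", "iz")):
        names += ["ix", "iy", "iz"]
    idx = [columns.index(n) for n in names]
    cid = columns.index("id")
    new = {row[cid]: [row[i] for i in idx] for row in rows}
    updated = []
    for line in atom_lines:
        fields = line.split()
        updated.append(" ".join(fields[:first_coord] + new[fields[0]]))
    return updated
```

Typical use would be to read the dump file with readlines(), call last_frame(), and pass the result together with the lines of the old "Atoms" section to update_atoms(). Velocities are not handled here.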

Also, if you are desperate, I wrote a funky (and possibly buggy)
script "dump2data.py" for extracting coordinates from dump files. By
default, it extracts the coordinates from the last frame of a dump
file. However it is not a general script. It does not work with dump
files using scaled coordinates (xs ys zs). These are default in
LAMMPS, so you must explicitly tell LAMMPS to generate the dump file
using unscaled coordinates "x y z". (See the example "dump" command
above.)
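
If you only have a dump with scaled coordinates, they can be converted by hand. The sketch below assumes an orthogonal (non-triclinic) box, where the scaled values are fractions of the box length along each axis; the function name unscale is hypothetical, and the bounds are the lo/hi pairs from the "ITEM: BOX BOUNDS" lines of the dump file.

```python
def unscale(xs, ys, zs, bounds):
    """Convert scaled (0..1) coordinates to box coordinates.

    bounds is ((xlo, xhi), (ylo, yhi), (zlo, zhi)).
    Orthogonal boxes only; triclinic boxes also need the tilt factors.
    """
    (xlo, xhi), (ylo, yhi), (zlo, zhi) = bounds
    return (xlo + xs * (xhi - xlo),
            ylo + ys * (yhi - ylo),
            zlo + zs * (zhi - zlo))
```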

The "dump2data.py" script also cannot extract velocity data (although you
can do this with a text editor). It does not understand exotic atom
degrees of freedom (dipole or ellipsoid orientations, etc). This is
not code I am particularly proud of. If you are desperate, you can
try it. I attached it to this message. Instructions on how to use it
were posted here:

http://sourceforge.net/mailarchive/message.php?msg_id=29864484

If I remember correctly, pizza.py used to come with another script
named "dump2data". Presumably, that script was much more robust than
this one.

General LAMMPS documentation for the dump and data files is located here:
http://lammps.sandia.gov/doc/dump.html
http://lammps.sandia.gov/doc/read_data.html
http://lammps.sandia.gov/doc/atom_style.html

I hope this helps.
Andrew

dump2data.py (42.9 KB)

sjplimp · April 2, 2013, 4:39pm

It’s too bad that restart2data.cpp does not simply invoke the
pair_style’s own internal read_restart() function. (Unfortunately,

tools/restart2data.cpp is a single standalone file. I don’t think
it makes sense to compile/link it to all of LAMMPS as a library.
Besides the pair style read_restart() methods store the params internal
to the pair style, which restart2data could not then access, and
even if it could it would have to know how to format it for output
to a data file.

So we keep restart2data up-to-date with LAMMPS, for any pair
style that is added. If you want to add lj/c/coul/c/inter to LAMMPS
officially, then that would be great (need a doc page), and it would
also need to be added to restart2data.

A better long-term solution might be to have a write_data command
in LAMMPS. This would require every atom style and every pair/bond/etc
style have a method that would write its info into a data file format.
The bits of code for that are now in restart2data.

Steve

Andrew_Jewett · April 2, 2013, 5:29pm

I like the idea of a "write_data" command, although I wonder if it
would replace the functionality of writing/reading restart files.
Perhaps it would, but perhaps that would be okay.

I am under the impression that most people use restart2data to extract
coordinates from the restart file, and use lammps input script
commands to specify the force-field settings. (I'm changing the
moltemplate docs to encourage users to do that as well.)

I was not under the impression that restart2data writes the
force-field parameters to the data file. (How could it handle hybrid
styles, for example?) Perhaps I am wrong. But this seems like a nice
way to avoid problems. (Support data files which contain coordinate
and topology data only, and leave the force-field parameters out.)

Whether to require new pair_style contributing authors to implement
their own "write_data" methods is up to you. (One could generalize it to
enable printing input script commands instead of data sections.)

Honestly, if possible I'd love to see some kind of data-file like
(meta-data containing) text format eventually replacing restart files
in the long-term future, but that's a lot of work on your part and I
was not asking for this.

For now, I'll get lj/charmm/coul/charmm/inter ready for submission in
the next couple weeks, and tweak the code for restart2data.cpp too.

Thanks for your reply.

Andrew

akohlmey · April 3, 2013, 8:55am

I like the idea of a "write_data" command, although I wonder if it
would replace the functionality of writing/reading restart files.

no. it is only a stop-gap measure anyway. a better solution would be a
portable, structured and *self-descriptive* file format, e.g.
something based on hdf5 and then a few custom tools to pre/post
process them and the option to tell the read_restart which section of
the file to read. in fact, with this move one could combine topology,
restart and trajectory data in one common format. this would also make
writing GUIs and topology construction tools like moltemplate easier.
the current restart file format has some of that already included.

Perhaps it would, but perhaps that would be okay.

I am under the impression that most people use restart2data to extract
coordinates from the restart file, and use lammps input script
commands to specify the force-field settings. (I'm changing the
moltemplate docs to encourage users to do that as well.)

that is the smart way to do things, yes.

I was not under the impression that restart2data writes the
force-field parameters to the data file. (How could it handle hybrid
styles, for example?) Perhaps I am wrong. But this seems like a nice

it only does it for cases that it can handle. the pair coefficient
format only allows storing the "self-type" parameters and the rest
has to be inferred through mixing rules. only a small number of simple
force fields support this. for the rest the writing out of parameters
is skipped, or - in case of the lj/sdk (cg/cmm) style potentials -
written to a separate file that can be read in via an "include"
statement.
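
To illustrate what "inferred through mixing rules" means: for Lennard-Jones style potentials, the cross-type parameters can be computed from the per-type ones. The sketch below follows the "geometric" and "arithmetic" rules as described for the pair_modify mix option; the function name mix is made up for this example, and it is not code from restart2data.

```python
import math


def mix(eps_i, sig_i, eps_j, sig_j, rule="geometric"):
    """Return (epsilon_ij, sigma_ij) computed from per-type LJ parameters."""
    eps = math.sqrt(eps_i * eps_j)      # same for both rules
    if rule == "geometric":
        sig = math.sqrt(sig_i * sig_j)
    elif rule == "arithmetic":
        sig = 0.5 * (sig_i + sig_j)
    else:
        raise ValueError("unknown mixing rule: " + rule)
    return eps, sig
```

A pair style whose i-j parameters do not follow such a rule cannot be represented by "self-type" coefficients alone, which is why the parameters are skipped in those cases.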

way to avoid problems. (Support data files which contain coordinate
and topology data only, and leave the force-field parameters out.)

yes.

Whether to require new pair_style contributing authors to implement
their own "write_data" methods is up to you. (One could generalize it to
enable printing input script commands instead of data sections.)

Honestly, if possible I'd love to see some kind of data-file like
(meta-data containing) text format eventually replacing restart files
in the long-term future, but that's a lot of work on your part and I
was not asking for this.

text format is *bad*. it doesn't scale. it is wasteful and ugly. what
you need is something that is more like a file system, where you can
have a tree of different types of information and then have tools to
extract and update them in pieces (via script or through
generating/encoding text files). ultimately, you need to be able to
handle multiple instances of LAMMPS classes (for replica calculations)
with topologies, per-atom properties (coordinates, velocities, ...)
parameters, fixes/computes for each replica and potentially for
multiple steps (with skipping over coordinates/properties that don't
change and other compression tricks).

with an object oriented code like LAMMPS, an alternative would be to
implement a complete serialization/deserialization, which would allow
for getting as close to a perfect transparent restart as you could
get.

axel.

sjplimp · April 3, 2013, 2:32pm

In addition to Axel’s comments, I don’t see combining
data and restart files, for 2 other reasons:

a) restart is binary and thus can be exact, data is
text and cannot be

b) restart files have lots of auxiliary info that
don’t make sense in data files - the “state” of
fixes like Nose/Hoover thermostats, origins of
MSD calculations, etc
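
Point (a) can be seen with a small experiment. This is only an illustration: the "%.6g" format is an assumed output precision, not necessarily what any LAMMPS tool writes. A double stored as 8 raw bytes reads back bit-for-bit, while the same value written as text with a limited number of digits does not.

```python
import struct

x = 1.0 / 3.0

# Binary round trip: 8 raw bytes, read back unchanged.
from_binary = struct.unpack("<d", struct.pack("<d", x))[0]

# Text round trip with a limited number of digits (assumed format).
from_text = float("%.6g" % x)

print(from_binary == x)   # True
print(from_text == x)     # False
```

Text can only reproduce a double exactly if enough digits are written (17 significant digits are sufficient for IEEE doubles), which makes the files larger.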

I’ll create a simple write_data command. Shouldn’t
be hard and it can eventually make restart2data obsolete.

Steve

sjplimp · April 3, 2013, 6:03pm

Added a first version of a write_data command
in the 3 Apr patch. It doesn’t do everything yet
(see the doc page), but it will be easy to add
support for various pair, bond, etc styles.

When all the hooks are added, the restart2data
tool can go away.

Steve

akohlmey · April 3, 2013, 6:05pm

you are "da man", steve.

axel.

Andrew_Jewett · April 3, 2013, 11:16pm

oh wow
that's great.

(off the hook. awesome.
...Just kidding. well, sort of)

-Andrew