[lammps-users] Question about Pizza.py commands/mailing list

Hi all,

I have a question about some of the tools in the Pizza.py package, but am hesitant to sign up for the mailing list because it appears to have been overrun by spambots. So I will post it here and hope someone can answer it:

I am working with some fairly large dump files, and want to convert some parts of the dump file (i.e. particular snapshots) to Ensight format. As far as I can tell, the docs say basically to read each step in sequence with next(), delete it if it is not needed, and process it if it is a desired step. I was hoping there might be a more streamlined way to do this (such as a method to read a particular timestep without needing to read the others first), as this will be rather inefficient if I wish to, say, visualize a particular step which is not early in the dump file.

My hack around this is to do some fancy grepping, and pre-process the dump file, but I was hoping there is a way to do this within Pizza - is that possible?
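For reference, the grep-style pre-processing described above could be sketched in Python as follows. This is only a sketch under stated assumptions: it assumes the standard LAMMPS text dump layout, in which every snapshot begins with an "ITEM: TIMESTEP" line followed by the step number on the next line. The function name extract_snapshots is made up for illustration and is not part of Pizza.py.

```python
def extract_snapshots(src, dst, wanted):
    """Copy only the snapshots whose timestep is in `wanted` from src to dst.

    src is any iterable of text lines (e.g. an open file), dst is a writable
    text file object. Lines are streamed one at a time, so memory use stays
    small regardless of the dump size. Returns the timesteps that were copied.
    """
    wanted = set(wanted)
    copied = []
    keep = False
    lines = iter(src)
    for line in lines:
        if line.startswith("ITEM: TIMESTEP"):
            if len(copied) == len(wanted):
                break  # everything requested has been copied; stop reading
            step_line = next(lines, None)
            if step_line is None:
                break  # truncated file: marker without a step value
            keep = int(step_line) in wanted
            if keep:
                copied.append(int(step_line))
                dst.write(line)
                dst.write(step_line)
            continue
        if keep:
            dst.write(line)
    return copied
```

Typical use would be something like `extract_snapshots(open("file.dump"), open("part.dump", "w"), [50000])`, after which the much smaller part.dump can be read by Pizza.py. The scan is still sequential, but it avoids parsing the unwanted snapshots and stops as soon as the requested steps have been written.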

Thanks,

Dave

David E. Farrell

Graduate Student

Mechanical Engineering

Northwestern University

email: d-farrell2@…435…

Dave,

I think using a combination of the command-line tools 'head' and 'tail' might be your quickest bet. They are very quick at extracting part of a large file, although you might run into memory problems when it is a very large file. Say you want lines 20000 to 30000 of file.dump. I would use something like 'head -n 30000 file.dump | tail -n 10001 > part.dump' (10001 lines, since both end points are included). Alternatively, you can write a short C program that extracts what you need. I think Pizza.py will always try to read your dump file sequentially.
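The same line-range extraction can also be sketched in Python with itertools.islice, for anyone who would rather stay in one language. The helper name copy_line_range is an assumption made for this example; first and last are 1-based and inclusive, so the range 20000 to 30000 copies 10001 lines.

```python
from itertools import islice


def copy_line_range(src_path, dst_path, first, last):
    """Copy lines first..last (1-based, inclusive) from src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice skips the first (first - 1) lines and stops after line
        # `last`, so nothing beyond that point is read from the file.
        dst.writelines(islice(src, first - 1, last))
```

For example, `copy_line_range("file.dump", "part.dump", 20000, 30000)`. The file is streamed line by line, so only one line is held in memory at a time.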

Pieter

Hi David,

As far as my experience goes, the answer is no: the header for each
timestep does not specify how many bytes that snapshot takes (the
precision of each coordinate is not fixed, so a value can occupy any number
of ASCII characters). So there is no random access into a text trajectory.
You can get around this by using a binary dump format, which would let you
calculate the size of each timestep based on what you dump (8 bytes per
value plus some more for the header; you'll have to check the source).
Random access is one benefit of binary dump files, besides the smaller
trajectory size. However, I don't use Pizza.py, so I don't know if it can
read binary dump files, but this capability shouldn't be too hard to add
using the struct module in Python (which lets you read and write binary
data).
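To illustrate the idea of seeking directly to a snapshot with the struct module, here is a minimal sketch. The record layout is hypothetical and is NOT the actual LAMMPS binary dump format (which, as noted, has to be checked in the source): it assumes a 16-byte header holding the timestep and atom count as 64-bit integers, followed by three 8-byte doubles per atom, with the same atom count in every snapshot.

```python
import struct

HEADER = struct.Struct("<qq")  # hypothetical header: timestep, atom count
VALUES_PER_ATOM = 3            # hypothetical: x, y, z stored as doubles


def snapshot_size(natoms):
    """Bytes taken by one snapshot in this made-up layout."""
    return HEADER.size + 8 * VALUES_PER_ATOM * natoms


def write_snapshot(f, step, coords):
    """Append one snapshot (list of (x, y, z) tuples) to binary file f."""
    f.write(HEADER.pack(step, len(coords)))
    for xyz in coords:
        f.write(struct.pack("<%dd" % VALUES_PER_ATOM, *xyz))


def read_snapshot(f, index, natoms):
    """Jump straight to snapshot number `index` (0-based) and decode it."""
    f.seek(index * snapshot_size(natoms))
    step, n = HEADER.unpack(f.read(HEADER.size))
    count = VALUES_PER_ATOM * n
    flat = struct.unpack("<%dd" % count, f.read(8 * count))
    coords = [flat[i:i + VALUES_PER_ATOM]
              for i in range(0, count, VALUES_PER_ATOM)]
    return step, coords
```

Because every snapshot has the same size, the byte offset of snapshot k is simply k times that size, and no earlier snapshot has to be read.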

Naveen

It wouldn't be too hard to write a post-processor which would convert the dump file to a strict format with a constant number of bytes per timestep (either truncate or pad floats and ints to a given precision). The bytes per timestep would be written at the head of the dump. Then it would be easy to keep all timesteps in a single file, but still be able to read-in a particular timestep (after some small changes to dump.py).
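A rough sketch of such a post-processor is given below. Everything in it is an assumption made for illustration: the field widths, the function names, and the choice to store only id, x, y and z per atom. It also assumes the atom count is the same in every snapshot. The first line of the output holds the bytes per snapshot, as suggested above.

```python
FLOAT = "%16.8e"  # hypothetical fixed-width float field (16 characters)


def format_snapshot(step, atoms):
    """Render one snapshot; its length depends only on the atom count."""
    lines = ["%20d" % step]
    for atom_id, x, y, z in atoms:
        lines.append(" ".join(["%10d" % atom_id,
                               FLOAT % x, FLOAT % y, FLOAT % z]))
    return "\n".join(lines) + "\n"


def write_fixed(f, snapshots):
    """Write (step, atoms) pairs to binary file f in the fixed-width layout."""
    size = None
    for step, atoms in snapshots:
        block = format_snapshot(step, atoms).encode("ascii")
        if size is None:
            size = len(block)
            # Head of the file: bytes per snapshot.
            f.write(("%20d\n" % size).encode("ascii"))
        elif len(block) != size:
            raise ValueError("atom count changed between snapshots")
        f.write(block)


def read_fixed(f, index):
    """Seek directly to snapshot number `index` (0-based) and parse it."""
    f.seek(0)
    size = int(f.readline().decode("ascii"))
    header_len = f.tell()
    f.seek(header_len + index * size)
    rows = f.read(size).decode("ascii").splitlines()
    atoms = []
    for row in rows[1:]:
        atom_id, x, y, z = row.split()
        atoms.append((int(atom_id), float(x), float(y), float(z)))
    return int(rows[0]), atoms
```

Note that the floats are truncated to nine significant digits by this format, which is the precision trade-off mentioned above.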

Of course, it might be easier to just write the different timesteps to different dump files.

--Craig

Hi Craig,

It wouldn't be too difficult to write a preprocessor, but why bother when
LAMMPS can already output a pretty structured binary file (now if only the
header had information about what exactly is in the dump file - maybe some
kind of code to specify each of fields included). I'm not really sure why
people use the text format - is it just easy parseability with text based
tools? Or maybe the lack of portability between different endian
architectures (although the solution would be to use network endianness,
which I believe is big-endian, and all modern OSes have functions to
read and write data in that format).
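As a small illustration of the byte-order point, Python's struct module lets the byte order be chosen explicitly with a format prefix: ">" (or "!", network order) for big-endian and "<" for little-endian. The value 1.5 is just an example.

```python
import struct

value = 1.5
big = struct.pack(">d", value)      # big-endian, same as network order "!"
little = struct.pack("<d", value)   # little-endian

# The two encodings hold the same 8 bytes in reverse order, and each one
# decodes to the original value only when read with its own prefix.
same_bytes_reversed = big == little[::-1]
decoded_big = struct.unpack(">d", big)[0]
decoded_little = struct.unpack("<d", little)[0]
```

Writing with a fixed, explicit byte order means the file reads back identically on any architecture.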

I find binary data much easier to work with (it's so much faster to read
and write and you don't lose any precision that you do when converting
from ASCII). I guess people find parsing binary data much harder, since
without knowing the format it's just a bunch of garbage, but I believe
that LAMMPS could have a proper format that includes a lot of the
simulation parameters at the beginning of the trajectory (at least for the
binary format - you could just basically dump a copy of the binary restart
data in the header of the trajectory). One problem I sometimes have with
older simulations is associating particular trajectories with the input
file I used to generate them.

Naveen

Hi Naveen.

Dave,

We had quite a bit of spam last year to both the LAMMPS and Pizza.py
mailing lists, but we've made it so that non-members can't post unless
hand-approved, so looks like there has been zero spam so far in 2007.
So you might want to go ahead and sign up; the e-mail traffic to
that list is low.

I also use Pizza.py to convert dump files to Ensight format. But
unfortunately, I don't know of a good way to randomly access a given
snapshot within a dump file using Pizza.py.

Paul

Thanks all for the input, and Paul - you know, I will have to take a look at that list again if the spam has slowed.

So far, I have been able to do some head/tail -ing before putting things through pizza, which seems to work fine so far. If I end up needing some more fancy stuff, I will figure out another way.

Dave