[lammps-users] large dump files

Hi all.

I'm using dump.py from the Pizza.py toolkit to parse LAMMPS dump files. My dump files are fairly big (2 GB), and I was wondering if there is an easy way to have dump.py initially parse only every n-th snapshot (every 10th would be convenient for me), or only some range of the dump file. It seems like the dump.tselect() time-select function only applies to iterating over snapshots *after* they have initially been parsed.

Otherwise, it wouldn't be hard for me to write my own parser, but that seems to defeat the point of having a nice standard analysis tool like Pizza.py.



We could probably modify the dump() tool in Pizza.py to take an additional
argument when invoked, like 10, to mean read every 10th snapshot.
Probably a good feature to have.

But I think you could get the same effect by invoking the dump tool with
the 2nd "0" arg, which means don't read anything until
dump.next() is used. Then you could write a little Python loop that
reads the snapshots one at a time via next() and throws out N-1 of
every N. When the loop was done, you'd have every Nth snapshot and
would never be storing all the snapshots at once (low memory). The only
drawback is that it would be slightly slower to read and parse all N
snapshots than to skip over N-1 of them without parsing.
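The loop described above might look like the following sketch. It doesn't use Pizza.py itself: read_snapshot() and every_nth() are stand-in names of mine, with read_snapshot() playing the role that dump.next() would play. It assumes the standard LAMMPS text dump layout (ITEM: TIMESTEP, ITEM: NUMBER OF ATOMS, ITEM: BOX BOUNDS with three bound lines for an orthogonal box, then ITEM: ATOMS).

```python
def read_snapshot(f):
    """Parse one snapshot from an open dump file.
    Return (timestep, atom_lines) or None at end of file."""
    if not f.readline():                 # "ITEM: TIMESTEP" header, or EOF
        return None
    timestep = int(f.readline())         # value under ITEM: TIMESTEP
    f.readline()                         # "ITEM: NUMBER OF ATOMS"
    natoms = int(f.readline())
    f.readline()                         # "ITEM: BOX BOUNDS ..."
    for _ in range(3):
        f.readline()                     # one bounds line per dimension
    f.readline()                         # "ITEM: ATOMS ..."
    atoms = [f.readline().split() for _ in range(natoms)]
    return timestep, atoms

def every_nth(filename, n):
    """Read all snapshots one at a time, keeping only every n-th.
    Only the kept snapshots are ever stored (low memory)."""
    kept = []
    with open(filename) as f:
        i = 0
        while True:
            snap = read_snapshot(f)
            if snap is None:
                break
            if i % n == 0:               # keep snapshots 0, n, 2n, ...
                kept.append(snap)        # the other n-1 are discarded
            i += 1
    return kept
```

As the reply notes, this still pays the cost of parsing every snapshot; it just avoids holding them all in memory at once.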


I wrote a simple routine to skip over, rather than parse, N snapshots in the file. It seems to work, but I'm not sure how robust it is.

---------------------------------------- diff for dump.py ------------------------


def pass_snaps(self, num_to_pass):
    """Skip over num_to_pass snapshots without parsing them."""
    f = open(self.flist[self.nextfile], 'r')
    f.seek(self.eof)                  # resume where the last read stopped
    for i in xrange(num_to_pass):
        snap = self.pass_snapshot(f)  # advances f past one snapshot
    self.eof = f.tell()               # record position for the next read
    f.close()


Yes, something like that should work.
It looks like you have 2 calls to next(), with 9 skips in each pass of the
loop, so that would read 2 out of every 11 snapshots?