Is there any way of outputting a separate trajectory file for each molecule?

That's exactly my question: say I have 1000 molecules (with 10 particles each, for example), is there any way of getting the trajectories of these molecules into separate trajectory files without having to code a program to split them?

I have coded a program in bash but if my system is really big it freezes.

Regards!

> That's exactly my question: say I have 1000 molecules (with 10 particles
> each, for example), is there any way of getting the trajectories of these
> molecules into separate trajectory files without having to code a program
> to split them?

sure thing. define 1000 dumps, one for each molecule, and use
dump_modify thresh to select the individual molecule you want to
output, e.g. via its molecule id.
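
for illustration, a minimal sketch of what that could look like (the dump
IDs, filenames, output interval, and atom columns below are just
placeholders, and molecule ids are assumed to run 1..1000):

dump        d1 all custom 100 mol1.lammpstrj id mol type xu yu zu
dump_modify d1 thresh mol == 1
dump        d2 all custom 100 mol2.lammpstrj id mol type xu yu zu
dump_modify d2 thresh mol == 2
# ...and so on up to d1000 / mol == 1000, or generate the pairs with a loop variable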

> I have coded a program in bash but if my system is really big it freezes.

that would be an indication that your program is badly written and, for
example, could have a memory leak.

splitting data like this and reading/writing (large) formatted text files
is usually highly inefficient. you didn't say what the whole purpose of
this exercise is, but i would venture a guess that it could be done much
more effectively with a proper scripting language that can read/process
LAMMPS format trajectories, e.g. python and pizza.py or Tcl and VMD.

axel.

Hi Axel,

thanks for your reply! I want to work with the properties of each molecule separately, to later take a Boltzmann average, for example. My program usually works with trajectory files of about 500 MB or less, but when I get into the gigabyte range I get messages such as "connection timed out" from my local network. I'm going to try what you're suggesting about the dumps. Thanks!

> Hi Axel,
>
> thanks for your reply! I want to work with the properties of each molecule
> separately, to later take a Boltzmann average, for example. My program
> usually works with trajectory files of about 500 MB or less, but when I get
> into the gigabyte range I get messages such as "connection timed out" from
> my local network. I'm going to try what you're suggesting about the dumps.
> Thanks!

this doesn't make much sense. are you postprocessing data from a networked
file system? but even then, transferring a few gigabytes of data should not
be a big problem, unless you are doing the processing in a very, very
inefficient way, e.g. by having to read the entire file multiple times,
writing out parts to new files, and then reading those back, and so on.

generally, efficient postprocessing is done a) using a copy on a local file
system, and b) by reading the data only once. in other words, you could
write your postprocessing tool to handle all your 1000 molecules at the
same time: read one trajectory frame into memory, do the required
computations for each molecule, update any accumulators or whatever else
needs to be done, discard the data for that frame, and read the next. if
this exhausts your memory capacity, you can do it in chunks. on current
hardware, it should not be a big problem to keep the entire trajectory in
RAM and then do the individual processing by looping over the data.
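
for illustration only, a minimal python sketch of that one-pass idea; the
filename, the assumption that the dump was written with "dump custom" using
columns id mol x y z, and the choice of accumulating a per-molecule average
position are all just placeholders for whatever you actually need:

# single-pass processing of a LAMMPS text dump: one frame in memory at a time.
# assumes "dump custom" output with an ITEM: ATOMS header containing id, mol, x, y, z.
from collections import defaultdict

def frames(filename):
    """yield one frame at a time as (column names, list of per-atom fields)."""
    with open(filename) as f:
        while True:
            line = f.readline()
            if not line:
                return                       # end of file
            if not line.startswith("ITEM: TIMESTEP"):
                continue
            f.readline()                     # timestep value
            f.readline()                     # ITEM: NUMBER OF ATOMS
            natoms = int(f.readline())
            f.readline()                     # ITEM: BOX BOUNDS ...
            for _ in range(3):
                f.readline()                 # box bound lines
            header = f.readline().split()[2:]   # ITEM: ATOMS id mol x y z ...
            atoms = [f.readline().split() for _ in range(natoms)]
            yield header, atoms

# accumulate, per molecule id, the trajectory-averaged position of its atoms
sums = defaultdict(lambda: [0.0, 0.0, 0.0])
counts = defaultdict(int)

for header, atoms in frames("traj.lammpstrj"):     # filename is a placeholder
    col = {name: i for i, name in enumerate(header)}
    for a in atoms:
        mol = int(a[col["mol"]])
        for k, axis in enumerate(("x", "y", "z")):
            sums[mol][k] += float(a[col[axis]])
        counts[mol] += 1

for mol in sorted(sums):
    print(mol, [s / counts[mol] for s in sums[mol]])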

axel.

Great! I'm kind of new to coding and was actually doing it in the most inefficient way possible: taking the file, splitting it by snapshot, splitting by molecule, appending the snapshots of each molecule to a new file, etc.

If you define 1000 dump commands in a LAMMPS input script, it will likely
run much slower if you dump very frequently. Each of those masked dumps
(for one molecule) will loop over all the atoms to identify the masked ones.

I don't see why a post-processing script can't operate on one large dump
file and spit one line at a time (based on mol ID) into one of 1000 new
files. It requires no memory and processes the dump file in linear time
(one pass through the original file).
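
For instance, a minimal sketch of such a one-pass splitter in Python; the
input filename, the output file naming, and the presence of a "mol" column
in the ITEM: ATOMS header are assumptions to adapt to your own dump:

# one-pass split of a LAMMPS text dump into per-molecule files.
# only the atom lines (prefixed with their timestep) are written out here;
# adapt the output format if you need the full per-frame headers as well.
# note: ~1000 simultaneously open files can bump into the OS open-file limit.
outfiles = {}          # mol id -> open file handle
timestep = None
mol_col = None         # column index of "mol" within the ITEM: ATOMS section
in_atoms = False

with open("traj.lammpstrj") as f:              # input filename is a placeholder
    for line in f:
        if line.startswith("ITEM: TIMESTEP"):
            timestep = next(f).strip()         # the line after this header is the step number
            in_atoms = False
        elif line.startswith("ITEM: ATOMS"):
            mol_col = line.split()[2:].index("mol")
            in_atoms = True
        elif line.startswith("ITEM:"):
            in_atoms = False                   # NUMBER OF ATOMS / BOX BOUNDS sections
        elif in_atoms:
            fields = line.split()
            mol = fields[mol_col]
            if mol not in outfiles:
                outfiles[mol] = open("mol%s.txt" % mol, "w")   # output naming is a placeholder
            outfiles[mol].write(timestep + " " + line)

for fh in outfiles.values():
    fh.close()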

Steve