Performing frame-by-frame energy minimization using the "rerun" command

Hi all, I am currently trying to figure out how to use LAMMPS as an efficient postprocessing tool for computing inherent structures from an existing dump file. I have an existing dump file for a bead-spring polymer melt generated in LAMMPS, and would like to read it in and perform an energy minimization on each frame using the built-in FIRE algorithm to quench to the inherent structure. The way I am currently doing this is by using a loop over the number of frames where I call the read_dump and write_dump command on each frame, like so (I have already used read_data to get my topology/bonds and set up the correct potentials):

variable frame loop 0 10000 # create the frame variable

label loop
print "Analyzing frame ${frame}"
variable frame_timestep equal "50 * v_frame" # the frames of the file being read are spaced 50 timesteps apart
print "Timestep is ${frame_timestep}"
read_dump training.lammpstrj ${frame_timestep} x y z format native # load up the correct timestep of the trajectory
minimize 1.0e-4 1.0e-6 100 1000 # perform minimization with the FIRE algorithm
reset_timestep ${frame_timestep} # set the timestep back to the same value as in the original lammpstrj
write_dump all custom fire.lammpstrj id mol x y z modify append yes # write frame to a single dump file
next frame # iterate frame
jump input loop

This appears to work, but it is very slow (for the 10,000 frame and 10,000 particle system, it took 10 hours to run in parallel on 6 processors). The time it takes for each minimization is, on average, only about 0.15 seconds based on the log.lammps file, so there is a lot of extra time coming from somewhere else. My guess is that calling read_dump on every iteration over 10,000 frames is the expensive part and that using the rerun command would be a lot more efficient. However, the documentation and examples I’ve been able to find for the rerun command make it seem like it cannot be used for things like minimize, which involve forward time integration.

Does anyone who is more experienced in LAMMPS know whether there is a way I could simply call rerun at the end of script and just perform the loop on each individual frame that way? Or is using read_dump repeatedly the only way to do something like this?

Thanks so much for the help!

Sam

As the name rerun indicates, the rerun command tries to repeat what a run command does with an existing trajectory. Your use case is different and thus you cannot use rerun. You would have to write a custom command (e.g. reminimize) that would execute a minimization on each frame of an existing trajectory.

Before going that route, however, I would suggest doing some profiling to determine exactly where the “lost” time is spent, to avoid trying to optimize something that does not need optimizing.

Axel,

Thanks so much for the helpful reply! I followed your suggestion and did some in-depth timing tests, which confirmed that most of the time was in fact being spent in the read_dump command, which took progressively longer the further the requested frame was into the large trajectory file.
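A rough way to see why this adds up is sketched below in Python. It assumes that each read_dump call scans the file from the beginning until it reaches the requested timestep. That assumption is consistent with the timings I saw, but I have not checked it against the LAMMPS source. The function name frames_scanned is just made up for this illustration.

```python
def frames_scanned(n_frames, restart_each_call=True):
    """Count how many frames are passed over in total while visiting every frame once.

    restart_each_call=True models the assumed behaviour: reaching frame k
    means scanning k frames from the start of the file, for every k.
    restart_each_call=False models one file per frame (or one sequential pass).
    """
    if restart_each_call:
        # 1 + 2 + ... + n_frames
        return n_frames * (n_frames + 1) // 2
    return n_frames


if __name__ == "__main__":
    n = 10_001  # "variable frame loop 0 10000" visits frames 0..10000
    print("single large file:", frames_scanned(n))
    print("one file per frame:", frames_scanned(n, restart_each_call=False))
```

Under that assumption the amount of file scanning grows quadratically with the number of frames, while the minimization cost only grows linearly.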

The simple (though not elegant) solution to this was to use the bash split command to break the trajectory up into temporary, individually numbered files (one per frame), have the LAMMPS script iterate over those files instead of over the frames of the complete trajectory, and then delete the temporary files. For reference, this reduced the total wall time of the run on the same architecture from 10:07:00 to 00:35:00.
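For anyone who wants to try the same idea, here is a minimal Python sketch of the splitting step. It is not the bash split call I actually used. It assumes a native text dump in which every frame starts with an "ITEM: TIMESTEP" line followed by the timestep on the next line. The function name split_dump, the output directory, and the file naming pattern are all made up for this example.

```python
from pathlib import Path


def split_dump(dump_path, out_dir, prefix="frame"):
    """Split a LAMMPS native text dump into one file per frame.

    Each output file is named <prefix>.<timestep>.lammpstrj.
    Returns the list of timesteps in the order they appear in the dump.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    timesteps = []
    out = None
    try:
        with open(dump_path) as src:
            for line in src:
                if line.startswith("ITEM: TIMESTEP"):
                    # A new frame begins; the next line holds its timestep.
                    step_line = next(src, None)
                    if step_line is None:
                        break  # truncated file: header without a timestep
                    if out is not None:
                        out.close()
                    step = int(step_line.split()[0])
                    timesteps.append(step)
                    out = open(out_dir / f"{prefix}.{step}.lammpstrj", "w")
                    out.write(line)
                    out.write(step_line)
                elif out is not None:
                    # Copy every other line into the current frame's file.
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return timesteps


if __name__ == "__main__":
    steps = split_dump("training.lammpstrj", "frames")
    print(f"wrote {len(steps)} frame files")
```

With files named this way, the read_dump line in the loop could point at something like frames/frame.${frame_timestep}.lammpstrj instead of training.lammpstrj, and the frames directory can be deleted once the run is finished.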

As for the reminimize command, perhaps this will be a future side project…

All the best,

Sam