syncing multiple partitions before continuing


I’m running a simulation using “fix deposit,” but I’d like to find an optimal deposition location. That is, rather than depositing at a random position within a region, I’d like to deposit at the location that gives the lowest-energy configuration.

One way I figured I could do this is to start with a given configuration and run ${set_parallelruns} parallel simulations from it. In each one, 1 particle is deposited, the system is run for a bit, and then it is minimized. I then select the parallel simulation that resulted in the lowest energy and repeat for the next deposition.

This is somewhat similar to the “prd” command, except that there is no “event” that determines which of the ${set_parallelruns} simulations the rest should be synced to. Instead, all ${set_parallelruns} simulations run for a given amount of time, and then they all sync to the one with the lowest energy.
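To make the selection logic concrete, here is a minimal sketch in plain Python (not LAMMPS). It assumes each trial can be treated as a black-box function that returns a new configuration and its energy; `best_of_n_deposition` and `toy_trial` are hypothetical names, and the toy energy is made up purely for illustration.

```python
import random
from typing import Callable, List, Tuple

Config = List[float]  # toy stand-in for a full atomic configuration


def best_of_n_deposition(
    start: Config,
    n_deposits: int,
    n_trials: int,
    trial: Callable[[Config, random.Random], Tuple[Config, float]],
    seed: int = 65348,
) -> Tuple[Config, List[float]]:
    """Deposit n_deposits particles, keeping the lowest-energy trial each time."""
    rng = random.Random(seed)
    config = list(start)
    history = []
    for _ in range(n_deposits):
        # Every trial starts from the same configuration but gets its own RNG,
        # so the trials can explore different deposition sites.
        results = [
            trial(config, random.Random(rng.getrandbits(32)))
            for _ in range(n_trials)
        ]
        # "Sync" step: continue from the trial with the lowest energy.
        config, energy = min(results, key=lambda r: r[1])
        history.append(energy)
    return config, history


def toy_trial(config: Config, rng: random.Random) -> Tuple[Config, float]:
    """Hypothetical trial: drop one particle at a random x in [0, 10).

    The energy is the sum of squared distances from x = 5 (illustration only).
    """
    new_config = config + [rng.uniform(0.0, 10.0)]
    energy = sum((x - 5.0) ** 2 for x in new_config)
    return new_config, energy
```

In the real workflow, `trial` would correspond to one deposit + run + minimize cycle, and the `min(...)` line is the step where all trials have to have finished before anything continues.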

I can do this using 1 partition with the following code:

# assumes "variable pe equal pe" and the set_* variables are defined earlier
write_dump all custom traj.restart.0 id type x y z vx vy vz ix iy iz
variable outer_loop_counter loop ${set_depparts}
label outer_loop
if "${outer_loop_counter} > 1" then &
  "variable min_energy delete"
variable min_energy equal 1.0E20
variable inner_loop_counter loop ${set_parallelruns}
label inner_loop
read_dump traj.restart.$(v_outer_loop_counter-1) $(v_set_deptime*v_outer_loop_counter) x y z vx vy vz ix iy iz replace no purge yes add yes
fix 1 all langevin 1000 1000 100 65348
fix 2 all deposit 1 1 1 65348 region insertionspace near 3.0 attempt 100
run ${set_deptime}
unfix 1
unfix 2
minimize 1.0e-4 0.0 10000 10000
# keep this attempt only if it beats the best energy seen so far
if "${pe} < ${min_energy}" then &
  "variable min_energy delete" &
  "variable min_energy equal ${pe}" &
  "write_dump all custom traj.restart.${outer_loop_counter} id type x y z vx vy vz ix iy iz"
next inner_loop_counter
jump SELF inner_loop
next outer_loop_counter
jump SELF outer_loop

I’d now like to parallelize the inner loop over ${set_parallelruns} processors, since running the ${set_parallelruns} attempts side by side on 1 processor each gives me 100% parallel efficiency, whereas running each inner-loop iteration across ${set_parallelruns} processors gives me much lower parallel efficiency. Naively, I’d just change the variable inner_loop_counter from a “loop” type variable to a “uloop” type variable and keep things as is, invoking LAMMPS with “-partition ${set_parallelruns}x1”. However, the different inner-loop partitions would finish at different times, so the first partition that stops would begin the outer loop using a different restart file than a later partition. Thus, I need to make the partitions wait for each other to finish the inner loop before going to the next outer loop cycle.

Does anyone know a way to do this? It’d be nice if there were a native LAMMPS command I could use, since the only solution I can think of is to use the LAMMPS library interface and do the syncing in my own code.

This is an interesting conceptual Q.

For your N deposition attempts, I don’t see how you are getting a different deposition location each time, e.g. thru a different random number. But the loop structure you’ve written seems like a reasonable way to perform N deposition attempts, then continue on.

Re: parallelize over inner loop:

I don’t see how parallelizing over partitions could work in a single script. You could run your N independent simulations, but at the end you want to choose one of them and then presumably run a single long simulation on all the procs. Then presumably rinse and repeat. That isn’t the way partitions in LAMMPS work; they are statically set up at launch time via the command line.

You might be able to write a new PRD-like command that would have logic for using the partitions for N runs, then morphing them into 1 partition for the long run, then morphing back to N partitions for the next iteration. But I think that is better done external to LAMMPS thru the lib interface. You could instantiate and re-instantiate LAMMPS with different partition counts as many times as you like.

Or just instantiate N different LAMMPS instances to run your N independent simulations. You could communicate across LAMMPS runs via files or info you extract and re-populate from your caller.
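As a rough illustration of the file-based route, the caller could read back one energy per run and promote the winner's dump to the next restart file. This is only a sketch under assumptions: it supposes run i wrote its minimized potential energy to `energy.<i>` (for instance via the print command's file keyword) and its final dump to `traj.trial.<i>`. Both file names and both function names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Tuple


def pick_lowest_energy(run_dir: Path, n_runs: int) -> Tuple[int, float]:
    """Return (run index, energy) of the lowest-energy run.

    Assumes run i wrote its minimized potential energy as a single number
    to run_dir / f"energy.{i}" (a hypothetical naming scheme).
    """
    energies = {}
    for i in range(n_runs):
        energy_file = run_dir / f"energy.{i}"
        energies[i] = float(energy_file.read_text().split()[0])
    best = min(energies, key=energies.get)
    return best, energies[best]


def promote_winner(run_dir: Path, n_runs: int, step: int) -> Path:
    """Copy the winning run's dump to the restart file for the next step."""
    best, _ = pick_lowest_energy(run_dir, n_runs)
    source = run_dir / f"traj.trial.{best}"    # hypothetical per-run dump name
    target = run_dir / f"traj.restart.{step}"  # restart naming from the original script
    shutil.copyfile(source, target)
    return target
```

Because the caller only looks at the files after every run has returned, the "wait for everyone" step comes for free.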

PRD itself could also be implemented that way, as an external program that has the logic in the prd.cpp file and uses LAMMPS as a callable engine for individual MD runs. Ditto for parallel tempering, etc.

Thanks Steve, that makes sense. I think it’d actually be pretty easy to put the outer loop into my bash script that invokes the LAMMPS executable, which would allow me to use a multi-partition call to LAMMPS to do each part of the inner loop, and have the bash script invoke that call ${set_depparts} times. So I’ll go forward with that.
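For reference, the same outer-loop driver could be sketched in Python instead of bash. This is a sketch, not a tested setup: `lmp`, `in.deposit`, the `step` variable name, and the `after_step` callback (which would pick the lowest-energy partition and write the next restart file) are all placeholders, while -partition, -in and -var are standard LAMMPS command-line switches.

```python
import subprocess
from typing import Callable, List, Sequence


def build_command(n_parallel: int, step: int,
                  launcher: Sequence[str] = ("mpirun", "-np"),
                  exe: str = "lmp",
                  script: str = "in.deposit") -> List[str]:
    """Build one multi-partition LAMMPS call for deposition step `step`.

    `lmp` and `in.deposit` are placeholder names for the executable and
    the inner-loop input script.
    """
    return [
        *launcher, str(n_parallel),
        exe,
        "-partition", f"{n_parallel}x1",  # n_parallel partitions, 1 proc each
        "-in", script,
        "-var", "step", str(step),        # available in the script as ${step}
    ]


def run_outer_loop(n_deposits: int, n_parallel: int,
                   after_step: Callable[[int], None],
                   runner: Callable[..., object] = subprocess.run) -> None:
    """Call LAMMPS once per deposition, then post-process that step."""
    for step in range(1, n_deposits + 1):
        # The launcher returns only after every partition has finished,
        # which provides the synchronization between outer iterations.
        runner(build_command(n_parallel, step), check=True)
        after_step(step)
```

A call such as `run_outer_loop(10, 8, after_step=my_selection_function)` would then launch 10 successive 8-partition runs, where `my_selection_function` stands for whatever selection step is used.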