Dear all,
I am running multiple partition runs. Whenever a criterion is satisfied in one of the partitions, I would like to save the data and shut down all the other partitions. I tried to use the partition command, which is supposed to act on the specified partitions, but the simulation continues to run afterwards. This is my command: run 1000000 every 10 "if '$(v_disp) >= 2' then ' write_data event.data ' 'partition yes * quit'"
Update: I did find a (quite ugly) workaround, which is to set a variable that halts my simulation with "error hard", which causes all partitions to crash. I would prefer a cleaner solution that does not artificially create an error.
That would require some C++ programming. It is not so simple to communicate with MPI between multiple partitions, since they would have to "listen" to each other but, by construction, they are independent.
Without looking into details, I doubt that the quit command can be set up for that, since what it does currently is not very different from triggering an error. It would be possible to tell it to use MPI_Abort() on the "universe" communicator instead of the "world" communicator, but that would have the exact same result as your "trigger an error" hack.
Instead, I would favor a modification of fix halt. It would be invoked regularly during a run and, in case of a global exit, it could poll for a message on the inter-world communicator that would then tell every "world" to stop its run after the next iteration.
[Update]
@tomasfbouvier I went ahead and added such a feature to fix halt.
It works for me using the following simple test input. Partition 2 has a shorter wall time limit (10s instead of 100s) and thus fix halt will trigger for that first and it sends messages to the other partitions to stop as well:
# we have a different CPU time limit on partition 2 to trigger fix halt
variable maxtime universe 100 10 100 100 100 100 100 100 100 100
variable curtime equal cpu
units lj
atom_style atomic
lattice fcc 0.8442
region box block 0 20 0 20 0 20
create_box 1 box
create_atoms 1 box
mass 1 1.0
velocity all create 1.44 87287 loop geom
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0 2.5
neighbor 0.3 bin
neigh_modify delay 0 every 20 check no
fix 1 all nve
fix 2 all halt 100 v_curtime > ${maxtime} error soft message yes universe yes
thermo 100
run 10000
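For the use case in the original question, the same fix could presumably replace the run ... every construct. The following is only a sketch under assumptions: v_disp is the variable from the question (its definition is not shown in this thread), the fix ID "stop" and the variable name "p" are arbitrary, and the per-partition file name is my own choice to keep partitions from overwriting each other's output.

```
# v_disp is assumed to be the equal-style variable from the original question
# check the criterion every 10 steps; with "universe yes" the partition that
# triggers tells all other partitions to stop as well
fix stop all halt 10 v_disp >= 2 error soft message yes universe yes
run 1000000

# with "error soft" the script continues after the halted run,
# so every partition writes its own data file here
variable p equal part
write_data event.${p}.data
```

Since every partition continues past the run command, all of them write a file, not only the one where the criterion was met. The "message yes" output in each partition's log should show which partition triggered the halt.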
Thank you so much for your answer and for implementing the solution. I will try it myself for my use case. Otherwise I guess I will proceed with the "raising an error" solution.