Restore system state

_Vishnu_V_Krishnan · June 8, 2019, 8:31am

Hi!

I'm running a simulation with the `nvt` fix, and I need to periodically sample
configurations and `minimize` them. How can I best do this without affecting
the system?

Can I use the `store/state` fix to save the state of the system at a point,
minimize it, then revert to the stored state, and continue? If so, how do I
restore the state to that point?

Or do I have to `dump` just before the minimisation, and read it back in
after?

Is there any information internal to `nvt` or anything else that might be lost
in either of these methods?

Thanks!
Vishnu

akohlmey · June 8, 2019, 10:55am

> Hi!
>
> I'm running a simulation with the nvt fix, and I need to periodically sample
> configurations and minimize them. How can I best do this without affecting
> the system?

the simplest approach is to do this with separate calculations. just write out individual restart files for each configuration you want to minimize. and while you keep running the regular simulations, you can process each restart file and do a minimization as they become available. this has the benefit of running concurrently and thus not slowing down your primary simulation.
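A minimal sketch of that workflow in Python, not a definitive implementation. It assumes the production run writes files named `snap.<timestep>.restart` into one directory and that a LAMMPS executable called `lmp` is on the PATH. The file pattern, the executable name, the tolerances and the dump file names are all assumptions made for illustration.

```python
import re
import subprocess
import tempfile
from pathlib import Path

# Hypothetical naming scheme: the production run is assumed to write
# snap.<timestep>.restart files into the watched directory.
RESTART_RE = re.compile(r"^snap\.(\d+)\.restart$")


def pending_restarts(directory, done):
    """Return restart files that are not in `done`, sorted by timestep."""
    found = []
    for path in Path(directory).iterdir():
        match = RESTART_RE.match(path.name)
        if match and path.name not in done:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def minimization_input(restart_path, dump_path):
    """Build a LAMMPS input that minimizes one stored configuration."""
    return "\n".join([
        f"read_restart {restart_path}",
        "# re-specify force field settings here if the restart does not hold them",
        "minimize 1.0e-6 1.0e-8 1000 10000",  # assumed tolerances and limits
        f"write_dump all custom {dump_path} id type x y z",
        "",
    ])


def process(directory, done, lmp="lmp", keep=False):
    """Minimize each pending restart once; delete it afterwards unless keep=True.

    Caveat: a real watcher should first make sure the file is fully written.
    """
    for restart in pending_restarts(directory, done):
        dump = restart.with_name(restart.name.replace(".restart", ".min.dump"))
        with tempfile.NamedTemporaryFile("w", suffix=".in", delete=False) as handle:
            handle.write(minimization_input(restart, dump))
        subprocess.run([lmp, "-in", handle.name], check=True)
        Path(handle.name).unlink()
        done.add(restart.name)
        if not keep:
            restart.unlink()
```

Calling `process(".", done)` periodically with a shared `done` set, while the production job keeps running, handles each configuration as it appears. With `keep=False` the restart files do not accumulate on disk.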

axel.

_Vishnu_V_Krishnan · June 10, 2019, 5:38am

I've implemented it that way for now, and it works, but it adds one more step
to the process, and more importantly, generates lots of binary restart files
that I have to store.

Would've been nice to have a simple way to handle it within one run.

A complicated way would be to replicate the system, 'exclude' the replica from
the original's neighbour list, and then minimise, store and discard the
replica. After which, I continue running the original.

akohlmey · June 10, 2019, 9:27am

> > the simplest approach is to do this with separate calculations. just write
> > out individual restart files for each configuration you want to minimize.
> > and while you keep running the regular simulations, you can process each
> > restart file and do a minimization as they become available. this has the
> > benefit of running concurrently and thus not slowing down your primary
> > simulation.
>
> I've implemented it that way for now, and it works, but it adds one more step
> to the process, and more importantly, generates lots of binary restart files
> that I have to store.

you don't have to keep those files around and you can process them concurrently as they are written.
the benefit of this approach is simplicity and speed. simplicity means fewer chances for errors. that must not be underestimated.

> Would've been nice to have a simple way to handle it within one run.

if you insist, you can set this up with a loop.
run for a chunk of steps. write out a restart, then do the minimization.
then use the “clear” command to reset LAMMPS and re-initialize your simulation with “read_restart” and whatever else is needed to continue.
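The loop described here could be sketched as follows, in Python, by generating the LAMMPS commands for each cycle. This is only an illustration under stated assumptions: the function names, the scratch file `tmp.restart`, the `min.<index>.dump` names, the tolerances and the example `fix` line are all hypothetical, and the initial system setup is assumed to happen before these commands.

```python
def chunk_commands(index, steps, restart_file, setup_lines):
    """Commands for one run / minimize / reset cycle."""
    return [
        f"run {steps}",                       # advance the production dynamics
        f"write_restart {restart_file}",      # checkpoint before minimizing
        "minimize 1.0e-6 1.0e-8 1000 10000",  # assumed tolerances and limits
        f"write_dump all custom min.{index}.dump id type x y z",
        "clear",                              # discard the minimized system
        f"read_restart {restart_file}",       # return to the checkpoint
        # Whatever else is needed to continue. Re-defining a fix with the
        # same ID and style is assumed to let it pick up its stored state.
        *setup_lines,
    ]


def build_script(n_chunks, steps, setup_lines, restart_file="tmp.restart"):
    """Concatenate the cycles into one block of input script text."""
    lines = []
    for index in range(n_chunks):
        lines.extend(chunk_commands(index, steps, restart_file, setup_lines))
    return "\n".join(lines) + "\n"
```

For example, `build_script(10, 10000, ["fix 1 all nvt temp 1.0 1.0 0.1"])` gives ten cycles that reuse a single restart file, so only one such file exists on disk at any time. The thermostat parameters in that call are placeholders.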

it is a common thing for people with limited experience to want to do everything in just one run. while this sounds compelling at first, in my experience the additional complexity is not worth the risk of things going wrong. you may end up having to repeat or continue the calculation from some intermediate point and this makes things needlessly complicated. the same is true for system setup and equilibration. i always found it better in the long run to do those as separate steps from production simulations.

axel.

_Vishnu_V_Krishnan · June 10, 2019, 10:05am

> > > the simplest approach is to do this with separate calculations. just
> > > write out individual restart files for each configuration you want to
> > > minimize. and while you keep running the regular simulations, you can
> > > process each restart file and do a minimization as they become
> > > available. this has the benefit of running concurrently and thus not
> > > slowing down your primary simulation.
> >
> > I've implemented it that way for now, and it works, but it adds one more
> > step to the process, and more importantly, generates lots of binary
> > restart files that I have to store.
>
> you don't have to keep those files around and you can process them
> concurrently as they are written.
> the benefit of this approach is simplicity and speed. simplicity means fewer
> chances for errors. that must not be underestimated.
>
> it is a common thing for people with limited experience to want to do
> everything in just one run. while this sounds compelling at first, in my
> experience the additional complexity is not worth the risk of things going
> wrong. you may end up having to repeat or continue the calculation from
> some intermediate point and this makes things needlessly complicated. the
> same is true for system setup and equilibration. i always found it better
> in the long run to do those as separate steps from production simulations.

I completely agree, which is why I finally did it that way.

> > Would've been nice to have a simple way to handle it within one run.
>
> if you insist, you can set this up with a loop.
> run for a chunk of steps. write out a restart, then do the minimization.
> then use the "clear" command to reset LAMMPS and re-initialize your
> simulation with "read_restart" and whatever else is needed to continue.

I'd meant some way that does not involve writing out restart files, because
currently that carries a speed penalty for large systems. If things happened
internally, it would actually be faster.

When I first came across the 'store/state' command, I thought I would be able
to keep a "check-point" of the full system state (whatever is written to a
restart file) in memory, that I could return to at any point in the future.
Wouldn't something like that be useful?

akohlmey · June 10, 2019, 10:18am

> > > Would've been nice to have a simple way to handle it within one run.
> >
> > if you insist, you can set this up with a loop.
> > run for a chunk of steps. write out a restart, then do the minimization.
> > then use the "clear" command to reset LAMMPS and re-initialize your
> > simulation with "read_restart" and whatever else is needed to continue.
>
> I'd meant some way that does not involve writing out restart files, because
> currently that is a penalty to speed for large systems. If things happened
> internally, it would actually be faster.

not automatically. the problem is that whenever you do a "reset to a previous state", you have to go through the reneighbor and setup phases of the simulation loop, regardless of whether you store the state internally or not, since atoms may migrate between processors during minimization. and the setup phase includes a full evaluation of the forces. for large systems (and expensive force fields), this can take much longer than writing out a restart.

and you are losing the ability to do the minimizations concurrently, which improves your time-to-solution much more than avoiding the restart files would.

axel.

_Vishnu_V_Krishnan · June 10, 2019, 10:42am

Okay, but what if the 'reset' procedure involved storing exactly the same info
as a restart file, but in memory, and then loading it when necessary?

I agree that this will not allow concurrent minimisations, but it will be
simpler. Also, assuming finite resources, all dedicated to running this
process, it will mean a shorter time-to-solution.

akohlmey · June 10, 2019, 10:50am

> Okay, but what if the 'reset' procedure involved storing exactly the same info
> as a restart file, but in memory, and then loading it when necessary?

that would require a massive rewrite of LAMMPS, as most data is distributed and attached to atoms and thus migrates with them when they pass between subdomains. you would have to keep track of that and reset everything in a reliable fashion. that is almost impossible to do correctly. what LAMMPS does is conceptually simpler and keeps the actual production simulation fast and efficient.

> I agree that this will not allow concurrent minimisations, but will be simpler,

i think i have pointed out quite convincingly that it is not simpler from a technical point of view. you only think it is simpler since you have not looked at the technical challenges of what it means to implement and run a distributed data parallel application and thought them through.

> Also, assuming finite resources, all dedicated to running this process, will
> also mean shorter time-to-solution.

that is ignoring the realities of current computing resources. if the minimizations are a significant use of resources compared to the initial run, you are doing them far too often. and it is in general quite simple these days to get additional resources (or a larger share of existing resources) if you can occupy more of them concurrently.

axel.