I am trying to monitor the efficiency of a parallel tempering simulation in LAMMPS by looking at the swap acceptance rate during the simulation. The LAMMPS documentation seems silent about this, although the output shows which temperature replicas are exchanged at every attempt.
The swap acceptance rate will enable me to optimize the temperature schedule for improved results/efficiency. Is there any way to print it out without modifying the LAMMPS code itself?
how about writing a little awk/perl/python/matlab script that parses swap
events listed in the output and then calculates the acceptance ratio from that?
Thanks. I realized the same thing after sending the e-mail for help. The acceptance ratio seems to be the number of changes in consecutive attempts divided by the total number of replicas. So, a post-processing awk script should do the trick.
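In case it helps, here is a minimal post-processing sketch in python rather than awk. It is an assumption on my part that the swap table at the end of log.lammps has rows of the form "step t0 t1 t2 ...", where column r holds the temperature index currently assigned to replica r; adjust the parsing if your version prints something different.

```python
def read_swap_table(path):
    """Collect the all-integer rows of log.lammps, dropping the step column."""
    rows = []
    for line in open(path):
        parts = line.split()
        if parts and all(p.isdigit() for p in parts):
            rows.append([int(p) for p in parts[1:]])
    return rows

def overall_acceptance(rows):
    """Accepted swaps per neighbor pair per recorded attempt, in the rough
    'changes between consecutive attempts / number of pairs' spirit."""
    n_pairs = len(rows[0]) - 1
    changed = sum(
        sum(a != b for a, b in zip(prev, cur)) // 2  # one swap flips two entries
        for prev, cur in zip(rows, rows[1:]))
    return changed / (n_pairs * (len(rows) - 1))
```

Usage would be overall_acceptance(read_swap_table('log.lammps')).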
I want to use this opportunity to apologize for any offensive words/statements that I might have made in yesterday's testy exchange. I believe we are both passionate about what we do and have the best intentions to do quality research.
Thanks. I realized the same thing after sending the e-mail for help.
yes, describing your problem accurately is often the most
important step to figure out a solution for a problem. i like
to call this "verbal debugging" and am occasionally (ab)using
my colleagues for that purpose. it is like going to a shrink,
but considerably cheaper.
The acceptance ratio seems to be the number of changes in consecutive attempts divided by the total number of replicas. So, a post-processing awk script should do the trick.
i would rather do this on a per temperature/replica basis,
at least this is what we did a while ago for this paper: http://dx.doi.org/10.1021/nl802645d
and it helped significantly to improve the computational
efficiency of the study (which required quite massive
computational resources at that time).
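The per-pair bookkeeping suggested here can be sketched as below. It assumes the same log format as before (each row lists the temperature index held by every replica at one swap attempt). Note one caveat: LAMMPS alternates which neighbor pairs it tries, so treating every interval as an attempt for every pair undercounts the true acceptance by roughly a factor of two; use the numbers as relative measures between pairs.

```python
from collections import defaultdict

def acceptance_per_pair(rows):
    """Estimate swap acceptance for each neighboring temperature pair."""
    attempts = defaultdict(int)
    accepted = defaultdict(int)
    for prev, cur in zip(rows, rows[1:]):
        # invert each row: temperature index -> replica currently holding it
        hold_prev = {t: r for r, t in enumerate(prev)}
        hold_cur = {t: r for r, t in enumerate(cur)}
        for t in range(len(prev) - 1):
            attempts[(t, t + 1)] += 1
            # pair (t, t+1) swapped if the two temperatures changed owners
            if (hold_prev[t] == hold_cur[t + 1]
                    and hold_prev[t + 1] == hold_cur[t]):
                accepted[(t, t + 1)] += 1
    return {pair: accepted[pair] / attempts[pair] for pair in attempts}
```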
I want to use this opportunity to apologize for any offensive words/statements that I might have made in yesterday's testy exchange.
no problem. i like the smell of a good flame war in the morning.
there are certain triggers that get my blood pressure up quickly.
I believe we are both passionate about what we do and have the best intentions to do quality research.
the good thing about saying (or rather writing) what you believe
is right, is that you won't build up frustration and hold a long
lasting grudge afterwards.
The paper you linked is quite interesting, and the work must have cost a lot! The per-temperature or per-replica approach will still require trial-and-error scheduling.
I saw a study done with parallel tempering Monte Carlo showing that the optimal acceptance probability should be 38.74% for minimal computational cost in the canonical ensemble (J. Phys. Chem. B 2005, 109 (9), 4189-4196), although their temperature scheduling is not very clear.
Some other studies have suggested using a geometric progression in temperature, but even with rigorous mathematical backing one still needs to play around with some parameters to get things right. This can be frustrating for large systems with high computational cost.
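For what it's worth, the geometric progression mentioned here is easy to generate; the helper below is hypothetical, not part of LAMMPS, and just spaces n temperatures with a constant ratio between neighbors:

```python
def geometric_ladder(t_min, t_max, n):
    """n temperatures with constant ratio r, so T_i = t_min * r**i
    and the last rung lands on t_max."""
    r = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * r ** i for i in range(n)]
```

One would then feed the result into the "variable t world ..." command by hand or via a generated input file.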
Glad to know that you are not keeping any grudge, and I do understand your tough stance with some people. I believe it's necessary sometimes to get them to think harder!
that is what the grad student who did the study and i started from.
we had a few long discussions and made some experiments with
toy systems to come up with something that worked for us. due to
the high energy required to desorb DNA bases, we had to cover a
very large temperature range and thus wanted to keep the number
of replicas to a minimum. the student wrote a program to optimize
the distribution, by looking at the temperature fluctuations of each
replica, fitting them to a model distribution function and then computing
the overlap. for the size of the calculation, this effort was justified.
from a statistical mechanical perspective, it doesn't matter so much,
as replicas with a lower transfer ratio have a higher statistical weight.
it only means that you have to run a little bit longer to have sufficiently
converged results for the affected replicas. in some cases, e.g.
if you have a phase transition between two temperatures, it can
be beneficial to have several replicas close together, to get better
sampling of the transition itself. it always depends on which
quantity's convergence is rate determining.
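The overlap idea can be illustrated with a toy calculation. This sketch is my own, not the student's actual program, and it assumes each replica's fluctuation histogram is well approximated by a Gaussian with a mean and width estimated from the run:

```python
import math

def gaussian_overlap(mu1, s1, mu2, s2, n=4000):
    """Numerically integrate min(p1, p2) for two normalized Gaussians:
    1.0 means identical distributions, 0.0 means no overlap."""
    lo = min(mu1 - 5 * s1, mu2 - 5 * s2)
    hi = max(mu1 + 5 * s1, mu2 + 5 * s2)
    dx = (hi - lo) / n
    def g(x, mu, s):
        return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return sum(min(g(lo + i * dx, mu1, s1), g(lo + i * dx, mu2, s2))
               for i in range(n)) * dx
```

One could then move replicas around until the overlap between every neighboring pair stays above some target value.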
That was a clever methodology for such a problem. I had neglected the issue of a first-order phase transition temperature, which might account for my low swap probabilities at higher swap indexes. Your explanation just brought this to my attention. I will also look into that. Thanks for the heads up.
Since tempering is a multi-replica simulation, each replica
reads the dump command and produces its own output.
So you get a dump file per replica. Presumably you
want each file to have a different name, which can
be done by using a replica-variable in the filename.
With the command "dump 1 all atom 1000 dump.%", will all the configurations
in a dump file have the same temperature, as opposed to the command
"dump 1 all atom 1000 dump.$t", which has different temperatures, so that I
have to match the temperature from the log.lammps.index file?
dump.% is actually a custom style and will dump according to the processor partition which should match up with your partition log files numbering.
For some reason, my LAMMPS version is not getting past partition 0.
My advice is to stick with your dump.$t style, except you need to put the dump command after the variable command and before temper, as shown below.
You can also name the partition log files with the same variable so you will be able to match them up with the dump files. With this style, the temperature label on each dump file will be the same.
variable t world 300.0 310.0 320.0 330.0
fix myfix all nvt temp $t $t 100.0
dump 1 all atom 1000 dump.$t
log log.$t
temper 100000 100 $t myfix 3847 58382
No - you can name the files however you like,
but each file is tied to a replica (that is the
only way to do it - you can't have different
processors write to a file handle opened by
one proc). And the temperature of that
replica's atoms will vary over time as swaps take
place. Temps are swapped, not configs.
If you want a dump file with different configs, all
at one temp (not sure why you would want that),
then you'd have to shuffle the dump files and
pick the right snapshot from each to assemble that
file. Easy to do with a Python script and the
Pizza.py tool.
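If one really does want per-temperature trajectory files, the shuffling Steve describes can be sketched in plain Python without Pizza.py. The file names (dump.0 ... dump.N-1) and the one-to-one alignment of dump snapshots with swap-table rows are assumptions here, so check them against your own run:

```python
def snapshots(lines):
    """Split a text-mode LAMMPS dump (an iterable of lines) into
    per-timestep chunks, each starting at an 'ITEM: TIMESTEP' header."""
    chunk = []
    for line in lines:
        if line.startswith('ITEM: TIMESTEP') and chunk:
            yield ''.join(chunk)
            chunk = []
        chunk.append(line)
    if chunk:
        yield ''.join(chunk)

def assemble(swap_rows, n_replicas, target_temp, out_path):
    """Write every snapshot that was at temperature index target_temp,
    assuming per-replica dump files whose snapshots line up one-to-one
    with the swap-table rows."""
    readers = [snapshots(open('dump.%d' % r)) for r in range(n_replicas)]
    with open(out_path, 'w') as f:
        for row in swap_rows:                     # row[r] = temp index of replica r
            snaps = [next(rd) for rd in readers]  # keep all replicas in step
            f.write(snaps[row.index(target_temp)])
```

The same bookkeeping is of course what the Pizza.py dump tool would do for you more robustly.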