Partition the VACF

Roger_Nadler · June 21, 2016, 2:39pm

Dear all

I am using compute vacf successfully but have a question regarding the way it is computed. As far as I can tell only one correlation function is evaluated, i.e. only one v(0) exists. Is this true? If so, how can I obtain several correlation functions from the same run?

Say I have a simulation that takes 10000 timesteps, and I want to start a VACF every 200 timesteps for 4000 timesteps each. This results in 30 VAC functions, which I could average by summing them and dividing by 30.
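As a minimal sketch of the averaging described above (not a LAMMPS feature), the following Python computes a VACF averaged over several time origins from velocities already held in memory. The function names and the `velocities[frame][atom]` layout are assumptions made for illustration. Under the inclusive convention used here, a 10000-frame run with a start every 200 frames and 4000 lags per window yields starts at 0, 200, ..., 6000, which is 31 windows rather than 30; the count depends on whether the last start is included.

```python
def window_starts(n_frames, stride, window):
    """Return every start frame whose full window fits inside the trajectory."""
    return list(range(0, n_frames - window + 1, stride))


def averaged_vacf(velocities, stride, window):
    """Average <v(0).v(t)> over time origins spaced `stride` frames apart.

    velocities[frame][atom] is a (vx, vy, vz) tuple (assumed layout);
    `window` is the number of lags kept per correlation function.
    """
    n_atoms = len(velocities[0])
    starts = window_starts(len(velocities), stride, window)
    vacf = [0.0] * window
    for t0 in starts:
        v0 = velocities[t0]
        for lag in range(window):
            vt = velocities[t0 + lag]
            # Dot product v_i(t0) . v_i(t0 + lag), summed over atoms
            dot = sum(a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
                      for a, b in zip(v0, vt))
            vacf[lag] += dot / n_atoms
    # Divide by the number of time origins to get the averaged curve
    return [c / len(starts) for c in vacf]
```

Note that this keeps the whole trajectory in memory, which is exactly the cost that makes the approach unattractive for large systems.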

I know that I can achieve this by writing the velocities to disk and writing a small Python program, but it results in enormous file sizes, even with custom/gz. What fix should I use if I want to get such an averaged VACF?

Thank you very much!
Regards,
Roger

Stefan_Paquay · June 22, 2016, 8:32am

How big is really big? If you really need to, I think you could achieve this with some loop construct in LAMMPS. If you could create a loop that creates a unique name for each compute and fix that outputs it to file, it should work.

A neater idea might be to just generate a dump file with the frequency you want, i.e., 200. Then you can use the rerun command to generate the VACF, and you can pass to your rerun script an offset. In pseudo-code something like this:

variable rest equal "v_total - v_offset"

run ${offset}

compute 1 all vacf …

fix 1 all ave/time …

run ${rest}

akohlmey · June 22, 2016, 12:38pm

i don't think there is much gain to compute autocorrelation functions
for overlapping time windows, as the data is highly correlated and
thus will only have limited statistical relevance. when i was a grad
student, i wrote a code that was able to do such a thing and it turned
out to cost quite some extra computation time for rather little gain.
it still paid off, since CPU time was much more scarce in those days.
however, now it'll be more convenient to simply continue the
simulation for a longer period and then just break it into chunks and
collect non-overlapping VACF data and average over that.
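A minimal Python sketch of the non-overlapping approach described above, assuming one VACF curve has already been collected per chunk. The helper names are hypothetical, and the standard error is only meaningful if the chunks are treated as independent samples.

```python
import math


def chunk_bounds(total_steps, chunk_steps):
    """Return (start, stop) step pairs for consecutive, non-overlapping chunks."""
    n_chunks = total_steps // chunk_steps
    return [(i * chunk_steps, (i + 1) * chunk_steps) for i in range(n_chunks)]


def average_curves(curves):
    """Return per-lag mean and standard error over equally long VACF curves."""
    n = len(curves)
    means, errors = [], []
    for lag in range(len(curves[0])):
        values = [curve[lag] for curve in curves]
        mean = sum(values) / n
        if n > 1:
            # Sample variance across chunks, then standard error of the mean
            var = sum((x - mean) ** 2 for x in values) / (n - 1)
            err = math.sqrt(var / n)
        else:
            err = float("nan")  # no error estimate from a single chunk
        means.append(mean)
        errors.append(err)
    return means, errors
```

For instance, `chunk_bounds(50000, 5000)` gives ten chunks of 5000 steps each, and `average_curves` then reduces the ten curves to one mean curve with an error bar per lag.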

axel.

Roger_Nadler · June 22, 2016, 1:10pm

Dear Axel and Stefan

Thanks for your input!

Yes, I wrote such a program as well, although it was for post-analysis to get
the diffusion coefficient and the vibrational spectrum.
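For reference, the diffusion coefficient follows from the VACF through the Green-Kubo relation, D = (1/3) ∫ <v(0)·v(t)> dt in three dimensions. A minimal sketch, assuming the VACF is already averaged per atom and sampled at a fixed interval `dt` (the function name is made up for illustration):

```python
def diffusion_coefficient(vacf, dt, dim=3):
    """Integrate a sampled VACF with the trapezoidal rule and divide by `dim`.

    vacf[i] is <v(0).v(i*dt)> averaged per atom; the result is in
    (velocity units)^2 * (time units).
    """
    integral = sum(0.5 * (vacf[i] + vacf[i + 1]) * dt
                   for i in range(len(vacf) - 1))
    return integral / dim
```

The result is only as good as the decay of the VACF within the window: if the curve has not dropped to zero by the last lag, the integral is truncated.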

OK, you mean that if the correlation time should be something like 5 ps with
dt = 1 fs, then I would repeat the compute vacf, for example, 10 times for 5000
time steps each (50000 steps in total) and average over the 10 VAC functions.
Something like that.

I will still try to get Stefan's suggestion working, even if it means that I
have to write several GBs of data for the velocities. Just for fun and for
educational reasons. Wasn't aware of the rerun option.

Cheers,
Roger

sjplimp · June 22, 2016, 3:12pm

This has come up from time to time on the mail
list. We have resisted writing code that does
multiple window averaging within LAMMPS (for
VACF or MSD) b/c it would be easy to abuse
and consume huge amounts of memory. If
you want N averages (e.g. from N starting points)
active at the same time, that means the code
has to store N sets of coords (or velocities, etc).
It also means those N coords need to move with
each atom when they migrate. So what is N for
your scenario? Sometimes people want N = 5,
but sometimes they might want N = 100 or 1000.
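To put rough numbers on the memory argument above, here is a back-of-the-envelope sketch. It assumes three double-precision velocity components per atom per window and ignores any per-atom bookkeeping or communication buffers; the atom count used in the example is purely illustrative.

```python
def window_storage_bytes(n_atoms, n_windows, components=3, bytes_per_value=8):
    """Bytes needed to keep one reference vector per atom for each active window."""
    return n_atoms * n_windows * components * bytes_per_value


# Illustrative system of 40000 atoms with different numbers of active windows
for n in (5, 100, 1000):
    megabytes = window_storage_bytes(40000, n) / 1e6
    print(f"N = {n:5d}: {megabytes:8.1f} MB")
```

The storage grows linearly with N, and in a parallel run each stored vector also has to travel with its atom whenever the atom migrates between processors.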

Steve

Roger_Nadler · June 23, 2016, 8:16am

I see your point. For me the whole idea of averaging over N starting points comes from the time when I performed MD with DFT, being limited to about 20-25 ps per trajectory. I used a correlation length of 2 to 3 ps, and N was between 100 and 200. Of course, as Axel mentioned already, it is not really necessary to have that many starting points, so now I would probably take ~20 starting points.

I'm already writing a Python program to do what I wanted, but I thought that it might be much more elegant, efficient and less memory-consuming (than storing the velocities of 40000+ atoms) if one could do it from within LAMMPS directly, which is why I brought the subject up.

Thanks for your input!

Roger