IR intensity difference with different number of cores

Dear Lammps-users,

Thanks for the help from Axel and Steve. I searched for some papers about the statistical error of molecular dynamics. When I increase the simulation length from 150 ps to 1 ns and 2 ns, the divergence caused by using different random number seeds is reduced, but the slight difference caused by using a different number of cores is not. Still, it seems fair enough to treat these slight differences as statistical error. Anyway, thanks for the help!!!

jiasen Guo

i don't think you found a paper with good advice to assess statistical
uncertainty, and i don't think that your rationalization is good.

first off, you seem to be overlooking three major things:
1) how much are your results impacted by the equilibration process?
are there any artifacts (e.g. phonons) that remain in your system
after equilibration, that may affect your production data?
2) how much are your results impacted by your method of postprocessing the data?
3) how much are your results impacted by other simulation settings,
e.g. the method and settings to adjust temperature? length of time
step? cutoffs? neighbor list updates?

then, rather than just making your simulation longer and taking what
you see as remaining divergence on good faith (which is not a good
idea in science), you should look for a more systematic approach. one
of them is to analyze your simulation data in chunks. the smaller the
chunks, the better you can assess impact of equilibration (or lack of
it) and the better you can see, if there are low frequency effects. in
the most rigorous of these approaches, you first divide your
simulation data in halves and analyze each half separately and look at
the differences. then you take quarters, eighths and so on until each
chunk is too small and the result too noisy. with this approach, you
can actually quantify the statistical variation and extrapolate to an
actual statistical
uncertainty, and with those numbers, you can apply a dependable error
bar to each data point of your determined IR spectra.
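The halving procedure described above can be sketched roughly as follows. This is only an illustration, not a LAMMPS feature: the function name `chunk_analysis`, its parameters and the `min_chunk` cutoff are made up for this example, and the standard error it reports assumes the chunks are statistically independent, which short chunks of correlated MD data generally are not.

```python
import numpy as np

def chunk_analysis(data, analyze=np.mean, min_chunk=8):
    """Analyze a time series in 2, 4, 8, ... equal chunks (hypothetical helper).

    `analyze` is any function mapping one chunk to a number or an array
    (for example a spectrum); np.mean is only a placeholder.
    """
    x = np.asarray(data, dtype=float)
    levels = []
    n_chunks = 2
    # keep halving until a chunk would be shorter than min_chunk samples
    while len(x) // n_chunks >= min_chunk:
        size = len(x) // n_chunks
        # trailing samples that do not fill a whole chunk are ignored
        per_chunk = np.array(
            [analyze(x[i * size:(i + 1) * size]) for i in range(n_chunks)]
        )
        spread = per_chunk.std(axis=0, ddof=1)  # chunk-to-chunk variation
        levels.append({
            "n_chunks": n_chunks,
            "chunk_size": size,
            "mean": per_chunk.mean(axis=0),
            "spread": spread,
            # assumption: chunks are independent samples
            "stderr": spread / np.sqrt(n_chunks),
        })
        n_chunks *= 2
    return levels
```

Comparing "mean" and "spread" across the levels shows whether the chunks agree with each other. A systematic drift from the first chunk to the last hints at incomplete equilibration rather than at statistical noise.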

furthermore, you seem to be ignoring that the intensities you are
looking at have no real meaning in the first place. they do not take
into account that for either IR or Raman spectroscopy not all
potential vibrational modes are allowed and that the intensity is
determined by the transition moment integrals, i.e. the reaction of
the wavefunction to the dipole operator (IR) or electronic
polarizability (Raman).

...and it doesn't stop there, there are also empirical corrections
that take into account the current temperature of your material and
technically, you should be looking at the (total?) dipole moment
instead of velocities. all of that affects your intensities, but not
the frequencies. those are primarily affected by the accuracy and
applicability of the empirical potential that you are using.
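As a rough illustration of working from the dipole moment instead of velocities, the sketch below computes an unnormalized power spectrum of a total dipole time series. It makes several assumptions: the dipole has already been extracted as an array of shape (n_steps, 3), the name `dipole_power_spectrum` is invented for this example, and no temperature-dependent or quantum correction factor is applied, so the intensities are in arbitrary units only.

```python
import numpy as np

def dipole_power_spectrum(dipole, dt):
    """Unnormalized power spectrum of a total dipole time series.

    dipole: array of shape (n_steps, 3), one total dipole vector per sample
    dt:     sampling interval (e.g. in ps, which gives frequencies in THz)
    """
    m = np.asarray(dipole, dtype=float)
    m = m - m.mean(axis=0)                # remove the static dipole
    window = np.hanning(len(m))[:, None]  # taper to reduce spectral leakage
    ft = np.fft.rfft(m * window, axis=0)
    freqs = np.fft.rfftfreq(len(m), d=dt)
    # sum over x, y, z; by the Wiener-Khinchin theorem this corresponds to
    # the Fourier transform of the dipole autocorrelation function
    power = (np.abs(ft) ** 2).sum(axis=1)
    return freqs, power
```

Running this on each chunk of the trajectory separately gives one spectrum per chunk, so the chunk-to-chunk spread at every frequency can serve as the basis for an error bar.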

so, if you want to sort this out properly (and i recommend to do and
practice this, as this kind of statistical error assessment should
be done for almost all simulation data where a quantitative result is
desired), you still have some work and some thinking ahead of you.

axel.