Minimization results differ for different numbers of CPUs

Hendrik_Heenen · September 20, 2013, 5:57pm

Dear All,

I have encountered a problem in LAMMPS-17Jun13 (Linux + Open MPI 1.4.5) when running minimizations on different numbers of CPU cores. The problem also persists with LAMMPS-10Sep13 (Mac + Open MPI 1.7.2):

For different systems/geometries I obtain minimization results that vary with the number of CPUs used. I have attached a minimal example input and the corresponding output, based on examples/dreiding.

In the examples I combined harmonic bonds and pair styles (Lennard-Jones and also Buckingham) with long-range Coulomb forces (Ewald or PPPM).

When the systems are minimized, different final energies are reached depending on the number of processors. The energy differences go up to 600 meV for the LJ-based calculations.

I have deliberately suppressed reneighbouring during the minimization (via neigh_modify once yes) to exclude neighbor list effects. Different minimizers also yielded similarly different minima.

I found two ways to work around this problem:

- I can either shorten my pair cutoff to unphysically short distances (= 2 Angstrom),
- or I can set the step size of the minimization to values around 1e-4 Angstrom (min_modify dmax 0.0001), which also results in the same final energies.

Could the latter in particular point to numerical instabilities in the line search?

Thank you very much in advance for any ideas you have on that matter.

Yours sincerely

Hendrik Heenen

Attachement.zip (104 KB)

akohlmey · September 23, 2013, 9:31pm

first off, your system has a very large number of degrees of freedom,
so there are going to be many local minima. since you seem to be
fairly far from the nearest local minimum, any source of small differences
(like different algorithms, different parameters, or a different order of
summing up forces) has the potential to drop your system into a
different local minimum.
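
To illustrate the "different order of summing up forces" point, here is a minimal Python sketch (not LAMMPS code). The list of contributions is hypothetical and deliberately exaggerated; in a real run the differences between processor counts would typically sit near machine precision, but the mechanism is the same: floating-point addition is not associative, so a different accumulation order can give a different total.

```python
import math

# Hypothetical force contributions on one atom (arbitrary units).
# The magnitudes are exaggerated on purpose so the effect is visible.
contributions = [1.0e17, 1.0, -1.0e17, 1.0]


def sum_in_order(values):
    """Accumulate values left to right, as a simple force loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Order A: contributions arrive as listed.
order_a = sum_in_order(contributions)

# Order B: same numbers, but the two large terms cancel first, as could
# happen when the work is split differently across processors.
order_b = sum_in_order([1.0e17, -1.0e17, 1.0, 1.0])

# math.fsum tracks partial sums exactly and serves as the reference.
reference = math.fsum(contributions)

print(order_a, order_b, reference)
```

The two orderings disagree even though the inputs are identical. A minimizer started far from a minimum can amplify such a discrepancy over many iterations.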

that being said, there are a few things that you can do to reduce the
influx of "noise".

- your ewald energy criterion is very loose. if you want accurate
  forces, you should rather use something like 1e-8 or even smaller.
- you are using tabulated forces/energies in real space. try using:
  pair_modify table 0
- you may want to check out lj/long/coul/long and ewald/disp instead.

> I found two possibilities, to work around this problem:
>
> - I can either shorten my pair cut-off to unphysically short
>   distances (= 2 Angstrom)

which cutoff?

> - I can set the stepsize of the minimization to values around 1e-4
>   Angstrom which will also result in same final energies (min_modify dmax
>   0.0001).
>
> Could particularly the latter point to numerical instabilities in the
> linesearch?

i think before jumping to conclusions, it is necessary to do the same
test with a reasonably well pre-minimized structure and check if that
still would lead to different minima.

that should help to determine whether you are dealing with a rugged
potential hypersurface or a potential bug.
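
A toy example of why the starting point matters, assuming a one-dimensional double-well energy E(x) = (x^2 - 1)^2 and a plain fixed-step steepest descent (both are illustrative choices and not the LAMMPS minimizers): two starting points that differ by only 2e-10 end up in different minima when they sit on the barrier, whereas starts inside one basin agree.

```python
def gradient(x):
    """Derivative of the toy double-well energy E(x) = (x**2 - 1)**2."""
    return 4.0 * x * (x * x - 1.0)


def steepest_descent(x0, step=0.05, iterations=2000):
    """Fixed-step steepest descent; returns the final position."""
    x = x0
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


# Two starts on either side of the barrier top at x = 0.
from_right = steepest_descent(+1.0e-10)
from_left = steepest_descent(-1.0e-10)

# Two starts inside the same basin, i.e. a "pre-minimized" structure.
near_min_a = steepest_descent(0.9)
near_min_b = steepest_descent(0.9 + 1.0e-10)

print(from_right, from_left)    # opposite wells
print(near_min_a, near_min_b)   # same well
```

If the pre-minimized test still produced different minima on different processor counts, that would argue against the rugged-surface explanation.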

axel.

sjplimp · September 24, 2013, 1:51pm

Another question is whether you have ensured that your potentials
go to zero energy at the cutoff by using the pair_modify shift
keyword, i.e. for the LJ part. If they do not, then you have an
energy function that is ill-defined for minimization.

The minimize doc page warns about that.
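
To see why an unshifted cutoff is a problem, here is a small sketch in reduced LJ units. The function names and the cutoff of 2.5 sigma are illustrative assumptions, not values from the attached input. As far as I know, the corresponding LAMMPS setting is `pair_modify shift yes`, but please check the pair_modify doc page for your version.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Plain 12-6 Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_truncated(r, r_cut, shift):
    """LJ energy truncated at r_cut, optionally shifted to zero there."""
    if r >= r_cut:
        return 0.0
    energy = lj_energy(r)
    if shift:
        energy -= lj_energy(r_cut)
    return energy


r_cut = 2.5            # assumed cutoff in units of sigma
just_inside = r_cut - 1.0e-9

# Energy just inside the cutoff; just outside it is exactly zero.
jump_unshifted = lj_truncated(just_inside, r_cut, shift=False)
jump_shifted = lj_truncated(just_inside, r_cut, shift=True)

print(jump_unshifted)  # finite jump when a pair crosses the cutoff
print(jump_shifted)    # essentially zero
```

Without the shift, the total energy changes discontinuously whenever a pair crosses the cutoff during a line search, so energy and forces are no longer consistent at that point.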

Steve