Parallel and Serial computation

_Vaidyanathan_M.S · July 15, 2013, 8:40pm

Dear All,

I have been trying to run LAMMPS in parallel for the first time. To be sure, I cross-checked the results against a serial run by comparing the diffusivity of the ion in water. The system I am using is a Na ion in water.
I am getting different answers from the serial and parallel runs.

Version of LAMMPS - lammps-8Jul-2013

Number of parallel processors - 12 node 24 processors

Parallel Computer used - Lonestar (TACC facility)

For the serial runs I used two systems: my personal desktop and the TACC facility.

Inputs were divided into 3 scripts. 1st to create an equilibrated system. 2nd and 3rd to generate equilibrated outputs.

I am attaching the data file and the input files. This was just a test run before running anything else. I have read that, even though in theory both should give the same result, the final results may diverge after restarting. But the discrepancy seems rather large. The diffusivity values for the ion that I got from the serial and parallel runs are the following.

Serial from Desktop: 4.7974 * 10^(-10) m^2/s

Serial from TACC: 1.6785 * 10^(-9) m^2/s

Parallel: 1.2024 * 10^(-9) m^2/s

Sorry for such a long email and input scripts. My question is: whatever the system is, should I not expect the same answers (even though physically they may be bogus)? Also, when compiling on TACC, the build used -DLAMMPS_INT64_MAX, whereas my desktop did not accept it, so there I compiled with -DLAMMPS_SMALLSMALL. Any help would be greatly appreciated.

Regards

Vaidyanathan M S

Dept of Chemical Engg

The University of Texas at Austin

data_polwater.txt (58.7 KB)

input_polwater.txt (2.46 KB)

input_polwater2.txt (1.8 KB)

input_polwater3.txt (1.11 KB)

akohlmey · July 15, 2013, 9:42pm

> Dear All,
>
> I have been trying to run Lammps in parallel for the first time. To be sure
> I cross checked the results with that of serial (by comparing diffusivities
> of the ion in water). System I am using is Na ion in water.
> I am getting different answers from the serial and parallel computing.

i think before even comparing serial and parallel runs, you should
look at your input as a whole.

your equilibration procedure looks very strange. what is the point of
using fix adapt and fix deform?

what are the particles (hydrogens?) with charge -0.5 good for?
if you are using shake, you should be able to crank up the timestep to
2fs (that is why people use it).
but why use ewald with 1.0e-2 convergence? this is almost as good as
not using it?
why fix recenter? if your system is properly set up and equilibrated,
you should not need it.

finally, you compute MD over 250ps. that is rather little.
getting a converged MSD in water can be a bit tricky. check this out:

http://klein-group.icms.temple.edu/akohlmey/files/talk-trieste2004-water.pdf
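
For readers following along: the thread does not say how the diffusivity was extracted, so the following is only a minimal sketch under the assumption that the usual 3D Einstein relation MSD(t) ≈ 6 D t + c is fitted over the late, linear part of the curve. The function name `diffusivity_from_msd` and the `skip_fraction` parameter are hypothetical, chosen for illustration.

```python
def diffusivity_from_msd(times, msd, skip_fraction=0.2):
    """Estimate D from a least-squares line through MSD(t) = 6*D*t + c.

    times and msd are equal-length lists in consistent units.
    skip_fraction drops the early part of the curve, which is usually
    not yet in the linear (diffusive) regime. This value is an assumption.
    """
    start = int(len(times) * skip_fraction)
    t = times[start:]
    m = msd[start:]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    t_mean = sum(t) / n
    m_mean = sum(m) / n
    covariance = sum((ti - t_mean) * (mi - m_mean) for ti, mi in zip(t, m))
    variance = sum((ti - t_mean) ** 2 for ti in t)
    slope = covariance / variance
    return slope / 6.0  # factor 6 = 2 * (3 dimensions)


# Synthetic check: build a perfectly linear MSD and recover D from it.
true_d = 1.5e-9
times = [float(i) for i in range(100)]
msd = [6.0 * true_d * t + 2.0e-9 for t in times]
estimate = diffusivity_from_msd(times, msd)
print(estimate)
```

With a short trajectory and a single ion, the slope of a real MSD curve is noisy, so estimates from different runs can differ considerably even when nothing is wrong with the code.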

axel.

_Vaidyanathan_M.S · July 15, 2013, 10:39pm

Dear Axel,

Thanks a lot for the response and the reference. I went through it. It gives 10 ns or more as a nominal run time! I ran for only about 2% of that, so that could well be a reason.

I used fix adapt so that I could change the charges on the SPC/E hydrogen and oxygen atoms. I was studying the effect of those charges, and instead of changing them in the data file I thought of doing it in the input file.

Fix deform was used because I initially started with a sparse, lower-density system and then deformed the box to bring it to the required density.

I agree that using Ewald with 1e-2 is unphysical. I was just trying to see whether the serial and parallel runs agree with each other (at least in magnitude) so that I could be confident of the parallel runs, since this is my first time. Fix recenter was used because otherwise the system as a whole was moving. So either I had to subtract the COM coordinate at every instant, or I thought this would be an easier way. Thermal annealing was done three times so that I could be sure the system is in equilibrium.

Finally, as you pointed out, 250 ps is probably too short a time for the system to equilibrate. My basic doubt is this: since the process (MD) is deterministic and not stochastic, if the initial conditions are the same, should the system not reach the same configuration after X time steps, irrespective of whether it runs in parallel or in serial? Or am I missing something basic here?

I shall also try other examples given in LAMMPS, such as PRD. That would be a much better starting point, I guess.

Thanks a lot for the reply.

Regards

Vaidyanathan M S

Dept of Chem Engg

UT Austin

akohlmey · July 16, 2013, 6:01am

> Dear Axel,
>
> Thanks alot for the response and the reference. I went through it. A time of
> 10 ns or more was given as a nominal time !!!. I ran for like 2% of it.. So
> probably that could be a reason
>
> I used fix adapt so that I could change the values of the charge on the
> SPC/E hydrogen/oxygen atoms. I was just studying the effect of charge on
> those. Instead of changing in the data file, I thought of doing it in the
> input file.

this is abuse of fix adapt. just use the set command.

> Fix deform was used, because, initially i started off with a sparse system
> with a lower density and then deformed to bring to the required density.

same issue. just use change_box followed by a minimization.

> I agree that using ewald of 1e-2 is unphysical. I was just trying to see
> whether the serial and parallel runs comply with each other (just in
> magnitude) so that I could be confident of the parallel runs (since this is

a gazillion of people use LAMMPS in parallel. if there was something
fundamentally wrong, it would have been noticed. in fact, even running
in serial, LAMMPS acts as if you are running parallel. if there are
any problems in parallel, that don't show up in serial runs, then they
are due to bad physics or bad choice of parameters and thus an
indication that the serial calculation was silently not working
properly.

> my first time). Fix recenter was used, because otherwise, the system as a
> whole was moving . So either I had to subtract the COM coordinate at every
> instant or thought this would be a easier way. Thermal annealing was done

this suppresses the symptom, but does not fix the problem. you still
may be subject to the flying icecube syndrome. you just won't see it.
your system should not have a drift. it may build up during equilibration
when starting from a less than perfect initial condition. you could
just remove the center of mass velocity during/after the equilibration
steps and then the system should not drift; at least not
significantly. if it still does, it is a sign of some bad physics
happening somewhere that needs to be resolved, not suppressed. with fix
recenter, either your temperature will be wrong or your diffusion
measurement.
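
To make "remove the center of mass velocity" concrete, here is a minimal sketch of the arithmetic only, not of how LAMMPS implements it. The helper name `remove_com_velocity` and the masses and velocities are made up for illustration. In LAMMPS itself the `velocity` command with the `zero linear` option is intended for this purpose.

```python
def remove_com_velocity(masses, velocities):
    """Return velocities with the mass-weighted mean velocity subtracted.

    masses is a list of floats; velocities is a list of [vx, vy, vz].
    After the subtraction the total linear momentum is zero (up to round-off),
    so the system as a whole no longer drifts.
    """
    total_mass = sum(masses)
    com_velocity = [
        sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
        for k in range(3)
    ]
    return [[v[k] - com_velocity[k] for k in range(3)] for v in velocities]


# Illustrative values only (one heavy ion plus three lighter atoms).
masses = [22.99, 15.9994, 1.008, 1.008]
velocities = [
    [0.1, 0.0, -0.2],
    [0.3, 0.1, 0.0],
    [-0.5, 0.2, 0.4],
    [0.2, -0.3, 0.1],
]
corrected = remove_com_velocity(masses, velocities)
momentum = [sum(m * v[k] for m, v in zip(masses, corrected)) for k in range(3)]
print(momentum)  # each component is zero up to floating-point round-off
```

Unlike repositioning the atoms at every step, this is a one-time correction of the velocities, so the subsequent dynamics and the measured displacements are not altered.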

> thrice so that I am sure that the system is in equilibrium.
>
> Finally as you pointed out, probably 250 ps is so less a time for the system
> to equilibrate,. My basic doubt is, since the process is deterministic (MD)
> and not stochastic, and if the initial conditions are the same, should the

MD using numerical integration and a thermostat is deterministic only
"locally". after a bit it decorrelates exponentially, since it is
essentially a chaotic(!) system (ever heard of the butterfly effect?).

> system not go to the same configuration at the end of X time steps
> irrespective of whether it is in parallel or serial or am I missing here
> something basic?

yes.
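
One piece that is not spelled out in the thread, and is added here as an assumption about where the first tiny difference typically comes from: floating-point addition is not associative, so summing the same contributions in a different order (as happens when the work is split across processors) can change the last bits of the result. A minimal sketch, with `chunked_sum` as a hypothetical and very loose stand-in for per-processor partial sums followed by a reduction:

```python
import random


def naive_sum(values):
    """Plain left-to-right accumulation, like a single serial loop."""
    total = 0.0
    for value in values:
        total += value
    return total


def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then add up the partial sums."""
    chunk = (len(values) + n_chunks - 1) // n_chunks
    partials = [naive_sum(values[i:i + chunk]) for i in range(0, len(values), chunk)]
    return naive_sum(partials)


# Smallest possible demonstration: regrouping three numbers changes the result.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)
print(left_to_right == regrouped)  # False

# Same idea with many terms; the two totals agree only to round-off level.
random.seed(2013)
contributions = [random.uniform(-1.0, 1.0) for _ in range(10000)]
serial_total = naive_sum(contributions)
parallel_like_total = chunked_sum(contributions, 24)
print(serial_total - parallel_like_total)
```

A difference of this size is physically meaningless on its own, but it is enough of a perturbation for a chaotic system to amplify.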

> I shall also try other examples given in lammps such as PRD. That would be a
> much better starting point I guess.

i am confused. what has parallel replica dynamics to do with self-diffusion?

axel.

p.s.: you forgot to explain those two weird "hydrogen half-ions"

Daniel_Casimir1 · July 16, 2013, 3:15pm

>> Finally as you pointed out, probably 250 ps is so less a time for the system
>> to equilibrate,. My basic doubt is, since the process is deterministic (MD)
>> and not stochastic, and if the initial conditions are the same, should the
>
> MD using numerical integration and a thermostat is deterministic only
> “locally”. after a bit it decorrelates exponentially, since it is
> essentially a chaotic(!) system (ever heard of the butterfly effect?).

Is this Lyapunov Instability?

akohlmey · July 16, 2013, 3:16pm

>>> Finally as you pointed out, probably 250 ps is so less a time for the system
>>> to equilibrate,. My basic doubt is, since the process is deterministic (MD)
>>> and not stochastic, and if the initial conditions are the same, should the
>>
>> MD using numerical integration and a thermostat is deterministic only
>> "locally". after a bit it decorrelates exponentially, since it is
>> essentially a chaotic(!) system (ever heard of the butterfly effect?).
>
> Is this Lyapunov Instability?

yes, it is.
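
As an illustration of what Lyapunov instability looks like in practice, here is a toy sketch. It uses the logistic map x -> 4x(1-x) as a stand-in chaotic system, which is an assumption made purely for brevity and is not an MD integrator. Two trajectories start 1e-15 apart, roughly the size of a round-off difference, and their separation is tracked.

```python
def logistic_step(x):
    """One iteration of the fully chaotic logistic map."""
    return 4.0 * x * (1.0 - x)


def separation_history(x0, perturbation, n_steps):
    """Track |a - b| for two trajectories that start `perturbation` apart."""
    a = x0
    b = x0 + perturbation
    history = []
    for _ in range(n_steps):
        history.append(abs(a - b))
        a = logistic_step(a)
        b = logistic_step(b)
    return history


separations = separation_history(0.3, 1.0e-15, 100)
for step in range(0, 100, 10):
    print(step, separations[step])
```

The separation grows roughly exponentially until it is as large as the values themselves, after which the two trajectories are effectively unrelated. The same mechanism is why serial and parallel MD trajectories stop matching atom by atom, while well-converged averages should still agree within their statistical error.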