[lammps-users] some questions on mpirun

Dear all,

     I have a couple of questions about running lammps with mpi. I ran a
16-processor test job and received the following output:

orthogonal box = (-30 -30 -10) to (30 30 2015)
1 by 1 by 16 processor grid
26048 atoms
.
.
.
Loop time of 242.377 on 16 procs for 5000 steps with 26048 atoms

Pair time (%) = 205.759 (84.8923)
Neigh time (%) = 0.076967 (0.0317551)
Comm time (%) = 4.6173 (1.90501)
Outpt time (%) = 0.0226319 (0.00933751)
Other time (%) = 31.9006 (13.1616)

I have a couple of questions based on this output:

1. I have also read a few log files from the lammps examples. I noticed that
most of the log files indicate a 2 by 2 by 1 processor grid. I assume my
simulation box is long in the z direction; therefore,
I have 1 by 1 by 16.

2. The Comm time (%) = 4.6173 (1.90501) is less than 2%. Is that too good to
be true?

3. I used the output option
fix OUTPUT_1 all ave/time 1 50 50 c_TW1 c_TR1 &
I am wondering whether this command will slow down the calculation, since it
has to calculate the temperature every step and then average.

4. Is there any optimization available to improve the speed?

I have attached a copy of the input files which can be run as
mpirun -np 16 lmp_xx -var data.cnt < in.nvt

thanks
haibin

Haibin Chen Ph.D.
Mechanical Engineering Dept
Carnegie Mellon University
Pittsburgh, Pa, 15213

in.nvt (4.32 KB)

data.cnt (1020 KB)

Comments below.

Steve

Dear all,

I have a couple of questions about running lammps with mpi. I ran a
16-processor test job and received the following output:

orthogonal box = (-30 -30 -10) to (30 30 2015)
1 by 1 by 16 processor grid
26048 atoms
.
.
.
Loop time of 242.377 on 16 procs for 5000 steps with 26048 atoms

Pair time (%) = 205.759 (84.8923)
Neigh time (%) = 0.076967 (0.0317551)
Comm time (%) = 4.6173 (1.90501)
Outpt time (%) = 0.0226319 (0.00933751)
Other time (%) = 31.9006 (13.1616)

I have a couple of questions based on this output:

1. I have also read a few log files from the lammps examples. I noticed that
most of the log files indicate a 2 by 2 by 1 processor grid. I assume my
simulation box is long in the z direction; therefore,
I have 1 by 1 by 16.

The log file tells you this, so not sure what your question is.

2. The Comm time (%) = 4.6173 (1.90501) is less than 2%. Is that too good to
be true?

Apparently not.
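
The number in parentheses on each timing line appears to be that section's share of the total loop time, which can be checked by hand. A minimal Python sketch, using only the timings quoted in this thread (the variable and function names are illustrative, not part of lammps):

```python
# Timings in seconds, copied from the log excerpt quoted in this thread.
loop_time = 242.377
sections = {
    "Pair": 205.759,
    "Neigh": 0.076967,
    "Comm": 4.6173,
    "Outpt": 0.0226319,
    "Other": 31.9006,
}


def percent_of_loop(section_time, total):
    """Return a section's share of the total loop time, in percent."""
    return 100.0 * section_time / total


percentages = {name: percent_of_loop(t, loop_time) for name, t in sections.items()}

for name, pct in percentages.items():
    print(f"{name:5s} time (%) = {sections[name]:<10g} ({pct:.4f})")
```

The five section times add up to the reported loop time to within rounding, so a Comm share under 2% is consistent with the rest of the breakdown.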

3. I used the output option
fix OUTPUT_1 all ave/time 1 50 50 c_TW1 c_TR1 &
I am wondering whether this command will slow down the calculation, since it
has to calculate the temperature every step and then average.

Why don't you try turning it off and see if the simulation time changes?
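
For context, the three numbers after ave/time are Nevery, Nrepeat and Nfreq. As I understand the fix ave/time documentation, each output on a multiple of Nfreq averages the Nrepeat samples taken every Nevery steps leading up to it. A small Python sketch of that rule (sample_steps is a hypothetical helper, not a lammps function):

```python
def sample_steps(nevery, nrepeat, nfreq, output_step):
    """List the timesteps whose values feed the average written at output_step.

    Assumes the fix ave/time convention: outputs fall on multiples of nfreq,
    and each one averages the nrepeat preceding samples spaced nevery apart.
    """
    if output_step % nfreq != 0:
        raise ValueError("output_step must be a multiple of nfreq")
    return [output_step - i * nevery for i in reversed(range(nrepeat))]


# The setting from the question: ave/time 1 50 50
steps = sample_steps(1, 50, 50, 50)
print(steps[0], "...", steps[-1], f"({len(steps)} samples)")
```

With 1 50 50 every timestep from 1 to 50 contributes, so the two computes are evaluated on every step. Larger Nevery with smaller Nrepeat would sample less often, at the cost of a noisier average.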

4. Is there any optimization available to improve the speed?

Try running on 1 proc, then 2, 4, etc. for a couple hundred timesteps.
That will tell you if you are getting good speed-up. If all the
time is in pair, then you are doing about the best you can.
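
One way to read the results of such a scaling test is to turn the loop times into speed-up and parallel efficiency. A rough Python sketch follows; the loop times in example_times are made-up placeholders, not measurements from this simulation, and scaling_table is a hypothetical helper:

```python
def scaling_table(loop_times):
    """Compute speed-up and parallel efficiency relative to the smallest run.

    loop_times maps processor count -> loop time in seconds for the same
    number of timesteps. Returns a list of (procs, speedup, efficiency).
    """
    base_procs = min(loop_times)
    base_time = loop_times[base_procs]
    rows = []
    for procs in sorted(loop_times):
        speedup = base_time / loop_times[procs]
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


# Placeholder loop times (seconds); replace with the "Loop time" values
# from your own log files.
example_times = {1: 100.0, 2: 52.0, 4: 28.0}

for procs, speedup, efficiency in scaling_table(example_times):
    print(f"{procs:3d} procs  speed-up {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

An efficiency that stays close to 100% as processors are added suggests the run is scaling well; a steady drop suggests communication or load imbalance is starting to cost time.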