Scaling with PMMM

Hi LAMMPS users,

I am pretty new to LAMMPS and I am trying to benchmark it on a simple case.

I have the attached input file (it's an LJ case with different PMMM parameter values) and ran it on different numbers of cores. Here are the results:

-----------------1 core

Performance: 4565.958 tau/day, 10.569 timesteps/s

Performance: 2182.186 tau/day, 5.051 timesteps/s

Performance: 1774.433 tau/day, 4.107 timesteps/s

Performance: 2543.834 tau/day, 5.889 timesteps/s

Performance: 772.765 tau/day, 6.203 timesteps/s

-----------------2 cores

Performance: 4225.783 tau/day, 9.782 timesteps/s

Performance: 2094.889 tau/day, 4.849 timesteps/s

Performance: 1816.108 tau/day, 4.204 timesteps/s

Performance: 2326.414 tau/day, 5.385 timesteps/s

Performance: 703.687 tau/day, 5.593 timesteps/s

-----------------4 cores

Performance: 3504.945 tau/day, 8.113 timesteps/s

Performance: 1787.126 tau/day, 4.137 timesteps/s

Performance: 1635.585 tau/day, 3.786 timesteps/s

Performance: 1892.753 tau/day, 4.381 timesteps/s

Performance: 546.831 tau/day, 4.529 timesteps/s

-----------------8 cores

Performance: 2593.201 tau/day, 6.003 timesteps/s

Performance: 1404.663 tau/day, 3.252 timesteps/s

Performance: 1284.628 tau/day, 2.974 timesteps/s

Performance: 1394.353 tau/day, 3.228 timesteps/s

Performance: 433.404 tau/day, 3.308 timesteps/s

-----------------16 cores

Performance: 1172.032 tau/day, 2.713 timesteps/s

Performance: 926.618 tau/day, 2.145 timesteps/s

Performance: 892.738 tau/day, 2.067 timesteps/s

Performance: 931.408 tau/day, 2.156 timesteps/s

Performance: 270.925 tau/day, 2.157 timesteps/s

-----------------32 cores

Performance: 652.883 tau/day, 1.511 timesteps/s

Performance: 601.613 tau/day, 1.393 timesteps/s

Performance: 598.401 tau/day, 1.385 timesteps/s

Performance: 618.035 tau/day, 1.431 timesteps/s

Performance: 167.245 tau/day, 1.451 timesteps/s

The results are still odd: I would expect more simulated time per day (higher tau/day) when using more cores, not less! In any case, the scaling is poor from 1 to 8 cores and better from 8 to 32.
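For reference, here is a minimal sketch of how the speedup and parallel efficiency can be computed from the numbers above. It uses only the timesteps/s of the first run in each block and assumes the problem size is the same for every core count (strong scaling); the function name `scaling_table` is just an illustrative helper, not anything from LAMMPS.

```python
# timesteps/s of the first run in each block above, keyed by core count
steps_per_s = {1: 10.569, 2: 9.782, 4: 8.113, 8: 6.003, 16: 2.713, 32: 1.511}


def scaling_table(rates):
    """Return (cores, speedup, efficiency) tuples relative to the 1-core rate.

    speedup    = rate on N cores / rate on 1 core
    efficiency = speedup / N (1.0 would be ideal strong scaling)
    """
    base = rates[1]
    rows = []
    for cores in sorted(rates):
        speedup = rates[cores] / base
        rows.append((cores, speedup, speedup / cores))
    return rows


for cores, speedup, eff in scaling_table(steps_per_s):
    print(f"{cores:2d} cores: speedup {speedup:.3f}, efficiency {eff:.1%}")
```

A speedup below 1 means the run on that many cores is slower than the single-core run.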

Any suggestions? Thanks.

Best,

Jony

Dr Jony Castagna

Sci-Tech Daresbury

Keckwick Lane

Daresbury

Warrington

WA4 4AD

Tel.: +44 (0)1925 603682

in.lj (2.58 KB)

How many atoms are in your system?

If you have 1000 atoms or fewer, more time is wasted on interprocess communication than you gain from parallelization (see Amdahl’s law).
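To illustrate the point, here is a rough sketch. Amdahl's law gives the ideal speedup on N cores when a fraction p of the work can be parallelized, S(N) = 1 / ((1 - p) + p / N). On its own it never predicts a slowdown, so the second function adds a toy communication term that grows linearly with the number of extra cores. That term and the values of `parallel_fraction` and `comm_cost` are assumptions chosen purely for illustration; they are not measured from LAMMPS.

```python
def amdahl_speedup(n_cores, parallel_fraction):
    """Ideal Amdahl speedup: 1 / ((1 - p) + p / N)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)


def speedup_with_comm(n_cores, parallel_fraction, comm_cost):
    """Toy model: Amdahl runtime plus a linear communication penalty.

    comm_cost is a hypothetical overhead per additional core, expressed
    as a fraction of the single-core runtime.
    """
    runtime = (
        (1.0 - parallel_fraction)
        + parallel_fraction / n_cores
        + comm_cost * (n_cores - 1)
    )
    return 1.0 / runtime


# Illustrative parameters only (assumed, not fitted to any benchmark).
for n in (1, 2, 4, 8, 16, 32):
    ideal = amdahl_speedup(n, 0.9)
    with_comm = speedup_with_comm(n, 0.9, 0.05)
    print(f"{n:2d} cores: Amdahl {ideal:.2f}, with communication {with_comm:.2f}")
```

When the per-core work is small, as with a small atom count, the communication term dominates and the modelled speedup drops below 1.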

Yours,

Vasily