Parallel computing speed

Dear LAMMPS users,

I’m simulating a system of 11188 atoms, 8962 bonds, and 9622 angles in a 180 x 34 x 34 angstrom box.

I’m running on a KNL node with an Intel Xeon Phi 7250 processor (68 cores) and 96 GB of DDR4-2400 RAM (6 x 16 GB).

My goal is to increase the simulation speed to more than 20 ns/day.

Here are some results from my trial runs of 150,000 timesteps.

1 node, 64 cores, 64 MPI processes, 1 OpenMP thread (4x4x4 MPI grid):

Performance: 13.228 ns/day, 1.814 hours/ns, 153.100 timesteps/s
94.2% CPU use with 64 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
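
For reference, the gap between the reported rate and the 20 ns/day target can be worked out from the log line above. This is only a rough sketch: the function names are made up for illustration, and the 1 fs timestep is not stated in the post but inferred from the ratio of the reported ns/day and timesteps/s.

```python
# Convert between LAMMPS "timesteps/s" and "ns/day" for a fixed timestep.
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def ns_per_day(steps_per_s, dt_fs):
    """Simulated nanoseconds per wall-clock day."""
    return steps_per_s * dt_fs * SECONDS_PER_DAY / FS_PER_NS


def steps_per_s_needed(target_ns_day, dt_fs):
    """Timesteps per second required to reach a target ns/day."""
    return target_ns_day * FS_PER_NS / (dt_fs * SECONDS_PER_DAY)


# Values taken from the "Performance:" line of the 64-rank run.
reported_ns_day = 13.228
reported_steps_per_s = 153.100

# Timestep implied by the two reported numbers (assumption: it is constant).
implied_dt_fs = reported_ns_day * FS_PER_NS / (reported_steps_per_s * SECONDS_PER_DAY)

required_rate = steps_per_s_needed(20.0, 1.0)
required_speedup = required_rate / reported_steps_per_s

print(f"implied timestep: {implied_dt_fs:.3f} fs")
print(f"rate needed for 20 ns/day: {required_rate:.1f} timesteps/s")
print(f"speedup needed over this run: {required_speedup:.2f}x")
```

With these numbers the implied timestep is about 1 fs, so 20 ns/day corresponds to roughly 231 timesteps/s, or about 1.5 times the rate of this run.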

in.intel2.gz (828 Bytes)

in.intel3.gz (829 Bytes)

190313md.dat.gz (318 KB)

in.intel1.gz (827 Bytes)

I don’t think there is much you can do to improve performance, as the time spent in Kspace is growing massively with the number of MPI ranks. That means you are already beyond the scale-out point for PPPM. The straightforward option would be to use real Xeon CPUs, which have better performance per core than the Xeon Phi cores.

You could also try using verlet/split to run Kspace on a separate partition of MPI ranks, or try compiling the USER-INTEL package with the “long-range threads” feature enabled for the same purpose.
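
As a rough illustration of the verlet/split suggestion: the run is launched with two partitions via the `-partition` command-line switch, and, as far as I recall from the documentation, the number of ranks in the first (pair/bond) partition must be an integer multiple of the number in the second (Kspace) partition. The sketch below only enumerates the splits that satisfy that rule and builds a launch line. The helper names, the executable name `lmp`, the launcher `mpirun`, and the script name `in.script` are assumptions, not taken from the thread.

```python
def verlet_split_options(total_ranks):
    """List (pair_ranks, kspace_ranks) splits of total_ranks where
    pair_ranks is an integer multiple of kspace_ranks."""
    options = []
    for kspace in range(1, total_ranks // 2 + 1):
        pair = total_ranks - kspace
        if pair % kspace == 0:
            options.append((pair, kspace))
    return options


def launch_command(pair, kspace, exe="lmp", script="in.script"):
    """Build a two-partition launch line (exe and script are placeholders)."""
    total = pair + kspace
    return f"mpirun -np {total} {exe} -partition {pair} {kspace} -in {script}"


if __name__ == "__main__":
    # Candidate splits for the 64 ranks used in the run above.
    for pair, kspace in verlet_split_options(64):
        print(launch_command(pair, kspace))
```

The input script itself would also need `run_style verlet/split`. Which split performs best has to be found by benchmarking, since it depends on how the pair and Kspace costs balance for the particular system.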


Dear Axel,

Thank you for your reply.

There is also an SKL node with Xeon Gold 6148 processors. So you mean I should use the Xeon Gold rather than the Xeon Phi, right?

I’ll try your suggestions!

Thanks again

Dongwoo Kang

On Friday, March 22, 2019, Axel Kohlmeyer <[email protected]> wrote: