I have been wrestling with excessively slow computation time for my system. A few key pieces of information are below:
57675 atoms total
25 polymer chains (2307 atoms per chain)
Box dimensions:
0 50 xlo xhi
0 50 ylo yhi
-10 1200 zlo zhi
Init file:
units real
atom_style full
bond_style harmonic
angle_style harmonic
dihedral_style opls
improper_style harmonic
pair_style hybrid/overlay lj/smooth 8.25 11.25 coul/long 11.25
kspace_style pppm 1.0e-3
kspace_modify gewald 0.1
special_bonds lj/coul 0.0 0.0 0.5
neigh_modify one 50000
neigh_modify page 500000
The pair_style cutoffs were set to 3*sigma for the largest sigma appearing in pair_coeff among the atom types used.
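(For reference, assuming it is the 11.25 outer cutoff that was set to 3*sigma, the largest sigma in the pair_coeff list works out to about 11.25/3 = 3.75 Angstrom.)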
The simulation box starts out quite large because the polymers are initially in a fully extended conformation. As I understand it, this means processors get assigned to regions that become empty space once the chains collapse onto themselves, which causes load imbalance and poor performance.
To address this, I ran a brief NPT run using coul/cut instead of coul/long, and while doing so added fix balance rcb with comm_style tiled to see if that would speed things up, which it did, a little. Once that simulation ended (150k timesteps or so at dt = 1 fs), I took the data.restarter file from the end of that run (which described a smaller, more compact box around the contracted "polymer clump") and used it as the input for a second simulation, this time with kspace_style pppm, fix balance shift, and comm_style brick.

That second run is quicker than previous iterations (3.609 timesteps/s vs. 0.63 timesteps/s), but this is still quite slow for what is a relatively small system, and the MPI timing breakdown shows inordinately large %varavg values for most categories (see out.682639), which suggests load imbalance. CPU usage is decent (86%) on 32 cores, and communication overhead is at least acceptable (5.8% of the total MPI time). For both the faster simulation and the slower iteration, I set the PPPM accuracy to 1.0e-3.
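For concreteness, the balancing setup in the two stages looked roughly like the lines below. This is a sketch, not a copy of my inputs: the fix IDs, the rebalance frequency of 1000 steps, the 1.1 imbalance thresholds, and the NPT targets are placeholder values.

# stage 1: collapse run with coul/cut and recursive-bisection balancing
comm_style tiled
pair_style hybrid/overlay lj/smooth 8.25 11.25 coul/cut 11.25
fix bal all balance 1000 1.1 rcb
fix 1 all npt temp 300.0 300.0 100.0 iso 1.0 1.0 1000.0

# stage 2: restart from the compact box with PPPM and shift balancing
comm_style brick
pair_style hybrid/overlay lj/smooth 8.25 11.25 coul/long 11.25
kspace_style pppm 1.0e-3
kspace_modify gewald 0.1
fix bal all balance 1000 1.1 shift xyz 20 1.1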
Aside from not using long-range electrostatics and relying on fix balance, what options are available to me for speeding this up? Attachments are for the faster, second simulation with kspace. (I should note that I removed neigh_modify one and page from the second, faster simulation, which should help.)
Not sure if this matters, but I do have to set kspace_modify gewald 0.1 to get the simulation to run. I am also using suffix intel.
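For reference, the Intel acceleration is enabled in the usual way, along the lines of the following (my exact package arguments may differ; the values shown are just the defaults):

package intel 0
suffix intel

or equivalently with -sf intel -pk intel 0 on the command line.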
Any help appreciated.
Kind regards,
Sean
out.682639 (3.0 MB)
polysystem45.in (1.1 KB)
polysystemNEW45_kspace.in.init (311 Bytes)