NB3B and MPI

Hi,

We’ve been having some problems with nb3b and MPI: the system sometimes fails to initialise correctly as the thread count increases (and, frustratingly, sometimes it works perfectly). We can reproduce this with the nb3b example included in the latest stable build (7 Dec 15), but not with an older version (9 Dec 14). When it fails, the energies for the system become extremely large (or nan). For this example, 8 threads are needed to trigger the strange behaviour. I have put the output of a working job and a failed job below.

Any help would be appreciated.

James Reid

(scroll to the *** for example 2)

mpirun -N 2 lmp_mpi_7Dec15 < in.nb3b

LAMMPS (7 Dec 2015)
Reading data file …
orthogonal box = (0 0 0) to (22.5907 22.359 23.4708)
1 by 1 by 2 MPI processor grid
reading atoms …
1400 atoms
scanning bonds …
1 = max bonds/atom
reading bonds …
560 bonds
Finding 1-2 1-3 1-4 neighbors …
Special bond factors lj: 0 0 0
Special bond factors coul: 0 0 0
1 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
0 = max # of 1-4 neighbors
1 = max # of special neighbors
Reading potential file MOH.nb3b.harmonic with DATE: 2013-06-28
Finding 1-2 1-3 1-4 neighbors …
Special bond factors lj: 0 0 1
Special bond factors coul: 0 0 1
1 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
1 = max # of special neighbors
Respa levels:
1 = bond angle dihedral improper pair
2 = kspace
WARNING: Resetting reneighboring criteria during minimization (…/min.cpp:168)
EwaldDisp initialization …
WARNING: Using a manybody potential with bonds/angles/dihedrals and special_bond exclusions (…/pair.cpp:220)
G vector = 0.269426
WARNING: Using a manybody potential with bonds/angles/dihedrals and special_bond exclusions (…/pair.cpp:220)
Neighbor list info …
4 neighbor list requests
update every 1 steps, delay 0 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6, bins = 4 4 4
Setting up cg style minimization …
Unit style: real
vectors: nbox = 6, nkvec = 478
Memory usage per processor = 16.4896 Mbytes
Step TotEng KinEng Temp PotEng E_bond E_angle E_dihed E_impro E_vdwl E_coul E_long Press Lx Ly Lz Xy Xz Yz Volume
0 -61505.983 0 0 -61505.983 198.11978 0 0 0 5426.6842 -20935.868 -46194.919 979.72809 22.5907 22.359 23.4708 0 0 0 11855.229
4 -61506.604 0 0 -61506.604 198.69671 0 0 0 5460.0893 -20970.348 -46195.042 1657.4299 22.5907 22.359 23.4708 0 0 0 11855.229
Loop time of 0.196258 on 2 procs for 4 steps with 1400 atoms

99.6% CPU use with 2 MPI tasks x no OpenMP threads

Minimization stats:
Stopping criterion = energy tolerance
Energy initial, next-to-last, final =
-61505.9829 -61506.5882212 -61506.6041431
Force two-norm initial, final = 17.893 3.40908
Force max component initial, final = 0.757547 0.131738
Final line search alpha, max atom move = 0.394559 0.0519786
Iterations, force evaluations = 4 8

MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
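
Since the failure is intermittent, it may help to check the thermo output of many runs automatically rather than by eye. The following is only a minimal sketch in Python, assuming the thermo block starts with a "Step ..." header line and ends at the "Loop time" line, as in the output above. The function names and the 1.0e10 threshold for "extremely large" are hypothetical choices for illustration, not part of LAMMPS.

```python
import math


def parse_thermo(log_text):
    """Return thermo rows as dicts keyed by the column names of the 'Step ...' header."""
    rows = []
    header = None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":
            header = fields          # start of a thermo block
            continue
        if fields[0] == "Loop":
            header = None            # "Loop time of ..." ends the block
            continue
        if header is None or len(fields) != len(header):
            continue                 # not a thermo row (e.g. a WARNING line)
        try:
            values = [float(f) for f in fields]   # float() also accepts "nan" and "inf"
        except ValueError:
            continue
        rows.append(dict(zip(header, values)))
    return rows


def suspicious_rows(rows, column="PotEng", limit=1.0e10):
    """Return rows whose value in `column` is nan/inf or exceeds `limit` in magnitude.

    `limit` is an assumed threshold; pick one that suits the system.
    """
    return [r for r in rows
            if not math.isfinite(r[column]) or abs(r[column]) > limit]
```

Calling `suspicious_rows(parse_thermo(open("log.lammps").read()))` on the log of each run should return an empty list for a working job and a non-empty list for a failed one, assuming the log is written to the default log.lammps file.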

james,
i cannot reproduce this on my desktop machine with the latest
patchlevel (3Feb2016). (i am using LAMMPS-ICMS, but the differences to
the sandia version are very, very small these days.) can you provide
some more information about the machine that you are running on, as
well as the compiler version, mpi library and other makefile settings
you use? perhaps you should try on a different machine or check out
some precompiled binaries, if there is a machine on which you are able
to install those.

thanks,
     axel.
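
One way to gather the machine, compiler and MPI details requested above is a small script like the one below. This is only a sketch: the command names (g++, mpicxx, mpirun) and the --version flag are assumptions about a typical GCC plus Open MPI or MPICH setup and should be replaced with whatever the LAMMPS makefile actually uses. The helper names are hypothetical.

```python
import platform
import shutil
import subprocess

# Assumed command names; adjust to the compiler and MPI wrappers actually used.
COMMANDS = {
    "compiler": ["g++", "--version"],
    "mpi wrapper": ["mpicxx", "--version"],
    "mpi launcher": ["mpirun", "--version"],
}


def first_line(cmd):
    """Run `cmd` and return the first line of its output, or a short status string."""
    if shutil.which(cmd[0]) is None:
        return "not found"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed: {exc}"
    out = (result.stdout or result.stderr).strip()
    return out.splitlines()[0] if out else "no output"


def build_report(commands=COMMANDS):
    """Collect the platform string plus one version line per command."""
    report = {"platform": platform.platform()}
    for label, cmd in commands.items():
        report[label] = first_line(cmd)
    return report


if __name__ == "__main__":
    for key, value in build_report().items():
        print(f"{key}: {value}")
```

The makefile settings still have to be copied by hand from the makefile used for the build.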