Makefile for building LAMMPS on Blue Gene/Q

The compiler flag -qlanglvl=redefmac allows you to re-define macros with
a warning instead of an error (I want to say providing a non-trivial
redefinition of a macro without #undef-ing it first is undefined behavior
in the ANSI C and C++ standards, but I don't actually know that for sure).
Add that flag to the CFLAGS definition for XLC (i.e., the bgxlc++ compiler
flags) and it should work.

I can confirm that it compiles and links for me (after that fix) on a
Blue Gene/Q system (in this case, Mira at the ALCF).

Good luck!

Karl D. Hammond
University of Tennessee, Knoxville

"You can never know everything, and part of what you know is always
   wrong. A portion of wisdom lies in knowing that. A portion of courage
   lies in going on anyway."
"Nothing ever goes as you expect. Expect nothing, and you will not be
   surprised."

Dear Dr. Hammond,

It does seem the flag worked! Thank you, and thanks to everyone for chiming in! I have attached the makefile that I used to compile successfully; hopefully someone will find it useful in the future.

Best Regards,
Jingjie Yeo
Ph.D. Student
School of Mechanical and Aerospace Engineering
Nanyang Technological University, Singapore

Makefile.bgq (3.59 KB)

I've encountered a very strange problem and I'm wondering if anyone else
has encountered it and/or whether I might be doing something wrong.

I am running LAMMPS built by these instructions:
   https://www.alcf.anl.gov/user-guides/lammps
for a Blue Gene/Q system; note that you have to add the two "vpath" lines
that exist in the standard makefiles to make this work with recent LAMMPS
make scripts. Packages included are: MANYBODY, REPLICA, USER-MISC,
USER-OMP. It's built with the IBM XL C++ compiler for Blue Gene, version
12.1.

The problem can be reproduced with the standard "in.lj" benchmark, with
the following modifications:
  * Lines 7-9 should multiply by 100, not 20
  * Add the line "dump 1 all atom 1000 dump.lj" right before "run 100"
Then run with 512 nodes with 16 MPI ranks per node and 4 threads per rank.
It might be reproducible with fewer ranks, but that's the smallest job I
can run on this system. Anyway....
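
For a sense of scale, the job size implied by the steps above can be worked out with a few lines of Python. This is only a rough sketch: it assumes the stock "in.lj" fcc lattice with 4 atoms per unit cell, and the function name job_size is made up for illustration.

```python
def job_size(cells_per_side, nodes, ranks_per_node, threads_per_rank,
             atoms_per_cell=4):
    """Estimate atom and rank counts for a cubic fcc benchmark box.

    atoms_per_cell=4 assumes an fcc lattice, as in the stock in.lj input.
    """
    atoms = atoms_per_cell * cells_per_side ** 3
    ranks = nodes * ranks_per_node
    return {
        "atoms": atoms,
        "ranks": ranks,
        "threads": ranks * threads_per_rank,
        "atoms_per_rank": atoms / ranks,
    }


# The failing case from this report: 100x100x100 cells on 512 nodes.
print(job_size(100, nodes=512, ranks_per_node=16, threads_per_rank=4))
```

Under those assumptions the 100x100x100 box holds 4,000,000 atoms spread over 8192 MPI ranks, i.e. fewer than 500 atoms per rank on average.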

Now look at the dump file. If you see the same problem I do, it will have
normal-looking numbers at the top, but after a while all the lines will
look like
   0 0 0 0 0
OR
   0 0 1.0E-320 1.0E-320 1.0E-320
or some other garbage numbers. Then there will be a block of normal ones,
then more garbage. I should point out that the block of "normal" values
is almost always 128 entries long, which is very suspicious; the block of
zeros varies in length. Some entries also look like
   5 1260512 1 1 61.3053
which is equally nonsensical.
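
For anyone who wants to check a dump for this pattern without reading it by eye, here is a minimal Python sketch. It assumes the default "dump atom" text layout (per-atom lines of "id type xs ys zs" following an "ITEM: ATOMS" header); the function names and the max_type parameter are made up for illustration and are not part of LAMMPS.

```python
import math
import sys


def is_suspect(line, max_type=None):
    """Return True if one per-atom line ("id type xs ys zs") looks corrupted."""
    fields = line.split()
    try:
        atom_id, atom_type = int(fields[0]), int(fields[1])
        coords = [float(x) for x in fields[2:5]]
    except (ValueError, IndexError):
        return True  # cannot be parsed as id, type, coordinates
    if len(coords) != 3:
        return True  # fewer than three coordinate columns
    if atom_id <= 0 or atom_type <= 0:
        return True  # atom IDs and types start at 1
    if max_type is not None and atom_type > max_type:
        return True  # type larger than the input script defines
    for c in coords:
        if not math.isfinite(c):
            return True
        if c != 0.0 and abs(c) < sys.float_info.min:
            return True  # subnormal value such as 1.0E-320
    return False


def scan_dump(lines, max_type=None):
    """Return one boolean per atom line, True where the line is suspect."""
    flags = []
    in_atoms = False
    for line in lines:
        if line.startswith("ITEM:"):
            # Only the lines after an "ITEM: ATOMS" header are per-atom data.
            in_atoms = line.startswith("ITEM: ATOMS")
            continue
        if in_atoms and line.strip():
            flags.append(is_suspect(line, max_type))
    return flags


def good_run_lengths(flags):
    """Lengths of consecutive runs of normal-looking atom lines."""
    runs, current = [], 0
    for bad in flags:
        if bad:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs
```

To use it, pass an open file, e.g. flags = scan_dump(open("dump.lj"), max_type=1), then compare sum(flags) with len(flags). good_run_lengths(flags) shows whether the normal blocks really do come in runs of 128.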

It's worth noting that I can't reproduce this problem (at least not with
this particular benchmark file) with 20x20x20 or even 50x50x50 boxes.
100x100x100 does it, as does 200x200x200. As, of course, does the REAL
system I was trying to simulate....

Any ideas what might be going wrong or checks I can perform? It may be a
compiler issue, but at this point it's not 100% clear to me how to
determine that or how to fix it if it is.

Karl D. Hammond
University of Tennessee, Knoxville

hi karl,

this is a bit of a longshot, but we recently noticed that there are
some issues with compiler optimization for certain compilers on the
atom_vec*.cpp files. can you please try commenting out the lines
containing the following code in all the AtomVec classes

buf[m] = 0.0; // for valgrind

for example using the command:

sed -i -e '/\/\/ for valgrind/s,^,//,' atom_vec_*.cpp

and then recompile.

thanks,
      axel.

You should only need to try that for the one
atom style you are using.

You could also try using a different compiler,
like g++ on BGQ. I've seen issues with XLC myself, and heard of
others, so see if the problem goes away with g++.

Steve

You could also modify the Makefile to turn off all optimizations:

CCFLAGS = -g -O3 -qarch=qp -qtune=qp -qsmp=omp -qsimd=auto \
          -qhot=level=2 -qprefetch -qunroll=yes

change to:

CCFLAGS = -g -O0

I assume you are not using the OMP package.

Aidan

I actually am using the USER-OMP package, so the "-qsmp=omp" flag is
necessary. I'm running several tests now with different optimization
flags; there are a few more I can try as well if none of these work.

Karl D. Hammond
University of Tennessee, Knoxville

I’d also try it w/out OMP to see if that
is affecting the dumps.

Steve

I ran several tests with different compiler flags. The culprit is
   -qhot=level=2
(I suspect any -qhot setting may reproduce the issue), which turns on
several aggressive optimizations that evidently create Heisenbugs in this
instance. With that on, even -O0 replicates the problem. With it off,
higher levels of optimization will produce correct dumps. I did verify
that it is not a problem with the initialization lines in
atom_vec_atomic.cpp (marked "for valgrind").
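
For anyone repeating this kind of search, a one-flag-at-a-time approach can be sketched in Python as follows. This is not the exact procedure I followed. The baseline is the CCFLAGS line quoted earlier in this thread; the set of flags kept in every variant is my assumption, and the function name is made up for illustration.

```python
# Baseline flags as quoted earlier in this thread.
BASELINE = ("-g -O3 -qarch=qp -qtune=qp -qsmp=omp -qsimd=auto "
            "-qhot=level=2 -qprefetch -qunroll=yes")

# Flags kept in every variant. Assumption: -qsmp=omp is needed for USER-OMP,
# and -g and -qarch=qp are left alone so the builds stay comparable.
KEEP = {"-g", "-qarch=qp", "-qsmp=omp"}


def drop_one_variants(baseline, keep=KEEP):
    """Return (removed_flag, makefile_line) pairs, one per optional flag."""
    flags = baseline.split()
    variants = []
    for i, flag in enumerate(flags):
        if flag in keep:
            continue
        remaining = flags[:i] + flags[i + 1:]
        variants.append((flag, "CCFLAGS = " + " ".join(remaining)))
    return variants


for removed, line in drop_one_variants(BASELINE):
    print("# without", removed)
    print(line)
```

Each printed CCFLAGS line can be pasted into the makefile in turn. Rebuild, rerun the benchmark, and check the dump; the variant that produces a clean dump points at the flag that was removed.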

Thank you all for your suggestions! I suspected it was an optimizing
problem, but having never encountered anything like this before (in my
admittedly short programming life), I'm glad of all the help I can get.

Karl D. Hammond
University of Tennessee, Knoxville

Good to know. Did you identify which particular "HOT" option is
causing the bad behavior? Once you figure out which optimization
options are both safe and beneficial, perhaps you could send us the
makefile.

Aidan