GCMC for water

Dear LAMMPS users, dear Paul Crozier,

I have extended fix_gcmc to include the deletion and insertion of water molecules. The adapted code works fine (molecules are properly inserted and deleted over several hundred thousand simulation steps) on one core, but on multiple cores it throws a segmentation fault as soon as fix_gcmc is called. Unfortunately I have no experience with MPI coding, so I cannot tackle the problem without help. May I ask you, Paul, to have a look at my class? It should not be much effort for you, since my code is based entirely on your fix_gcmc routine; it only lacks communication among the procs at certain points. I would be grateful for any help from your side.

Thank you in advance.
Best regards
Sabine

I'm not very good at MPI. But in the absence of an MPI debugger
(I did not locate a free one), you can always try commenting out
code until the seg fault goes away.

--details--
Comment out the contents of all of your functions in the cpp file
(using #if 0...#endif). If that causes your program to stop crashing,
then gradually start uncommenting each function one at a time until
you figure out which function caused the seg fault. You can then
selectively comment out portions of that function until you find the
target. In my case I think I was passing a "bool" type variable to
MPI_Bcast() using MPI_INT. (I guess MPI does not like C++ bools.)
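
In case it helps, the pattern looks roughly like this (class and function
names here are just placeholders, not from the actual fix):

    void SomeFix::some_function()
    {
    #if 0
      // entire original body disabled for the bisection test;
      // re-enable pieces one at a time to locate the crash
    #endif
    }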

Check all of your MPI_Bcast() calls and make sure the argument matches
the MPI type, and avoid newer C++-specific data types.
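
The bool/MPI_INT mismatch looks something like this (a made-up snippet,
not the actual code): since sizeof(bool) is usually 1 byte but MPI_INT
describes 4, the broadcast writes past the variable.

    #include <mpi.h>

    // somewhere after MPI_Init(), on all procs:
    bool success = false;        // set to the real value on proc 0
    MPI_Bcast(&success, 1, MPI_INT, 0, MPI_COMM_WORLD);   // WRONG: overruns the 1-byte bool

    // safer: broadcast a plain int and convert back
    int flag = success ? 1 : 0;
    MPI_Bcast(&flag, 1, MPI_INT, 0, MPI_COMM_WORLD);
    success = (flag != 0);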

Good luck. Feel free to post a link to a free mpi debugger if you run
across one.


here is the poor man's parallel debugger:

xterm, gdb and some command line magic.

step 1) set up a file (gdb-run) with all gdb commands that you want to
    pass to all parallel debugger sessions. for starters you
    want:

    dir /path/to/my/lammps/src
    run

step 2) make sure you compile a version of LAMMPS with
  debug info included and optimization turned off

step 3) run:
   mpirun -np 2 xterm -e gdb -x gdb-run --args ../../src/lmp_openmpi -in myinput

you'll see two xterms pop up with gdb running in them and the executable
running until it fails. you can use gdb just like normal, only that
you have to type everything multiple times. i've used this even from
remote machines and with up to 8 processes in parallel
(it helps if you can instruct your window manager to place the
xterms in a smart way).
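
a small optional addition to the gdb-run file from step 1 (just a
suggestion, not part of the original recipe): if you append a backtrace
command, each gdb session prints where it stopped once the crash happens:

    dir /path/to/my/lammps/src
    run
    bt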

i just yesterday wrote a little piece of code that makes this
more usable for MPI errors, too (i.e. when MPI throws an error,
the default behavior is to call exit(), but then there is no more
stack frame left to find out where the problem originated).
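
a rough sketch of what such a handler can look like (this is only an
illustration, not the actual code): print the MPI error string and then
call abort() instead of exit(), so the stack frame survives for gdb:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    // print the MPI error message, then abort() so the debugger
    // still sees the stack at the point of failure
    static void mpi_abort_handler(MPI_Comm *comm, int *err, ...)
    {
      char msg[MPI_MAX_ERROR_STRING];
      int len;
      MPI_Error_string(*err, msg, &len);
      fprintf(stderr, "MPI error: %s\n", msg);
      abort();
    }

    // call once right after MPI_Init():
    static void install_mpi_handler()
    {
      MPI_Errhandler handler;
      MPI_Comm_create_errhandler(mpi_abort_handler, &handler);
      MPI_Comm_set_errhandler(MPI_COMM_WORLD, handler);
    }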

HTH,
    axel.

You can actually run ddd in parallel:
mpirun -np p ddd ./exec
should open p windows, one for each process.

Enrique

HTH,

Indeed, it helps quite a bit. Thanks!


here is the poor man's parallel debugger:

xterm, gdb and some command line magic.

step 1) set up a file (gdb-run) with all gdb commands that you want to
pass to all parallel debugger sessions. for starters you
want:

dir /path/to/my/lammps/src
run

step 2) make sure you compile a version of LAMMPS with
debug info included and optimization turned off

Sabine, if it helps: to do this, I've been adding "-g -O0" to the
CCFLAGS and LINKFLAGS variables in my makefile:
CCFLAGS = -g -O0 ...
LINKFLAGS = -g -O0 ...
Be sure to get rid of the "-O" or "-O2" or "-O3" that may be present
in those lines already.
You probably did not need this help.

step 3) run:
mpirun -np 2 xterm -e gdb -x gdb-run --args ../../src/lmp_openmpi -in myinput

you'll see two xterms pop up with gdb running in them and the executable
running until it fails. you can use gdb just like normal, only that
you have to type everything multiple times. i've used this even from
remote machines and with up to 8 processes in parallel
(it helps if you can instruct your window manager to place the
xterms in a smart way).

i just yesterday wrote a little piece of code that makes this
more usable for MPI errors, too (i.e. when MPI throws an error,
the default behavior is to call exit(), but then there is no more
stack frame left to find out where the problem originated).

I can see how that would be useful.
Thanks again.

This sounds promising - Paul can comment.

But there are several issues with doing GCMC in parallel
that are more fundamental than finding a bug in serial
code that is running in parallel. Paul will have to figure
out whether those kinds of issues are covered by your
code, or can be added.

One debugging note. Valgrind (which finds all kinds
of bugs including memory issues) can be run in parallel.

E.g.

mpirun -np 2 valgrind lmp_g++ < in.foo

Steve

One debugging note. Valgrind (which finds all kinds
of bugs including memory issues) can be run in parallel.

E.g.

mpirun -np 2 valgrind lmp_g++ < in.foo

Steve

which becomes even more useful when doing:

mpirun -np 2 valgrind --log-file=foo-%p.log lmp_g++ < in.foo

so that you get one separate output per process. ;-)
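
and if you want valgrind to be a bit more talkative about where bad
reads come from, something like this also works (just standard valgrind
options, nothing lammps-specific):

    mpirun -np 2 valgrind --leak-check=full --track-origins=yes \
        --log-file=foo-%p.log lmp_g++ < in.foo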

axel.

mpirun -np 2 valgrind --log-file=foo-%p.log lmp_g++ < in.foo
so that you get one separate output per process. ;-)

that's a useful tip I didn't know.

I guess I had always considered it a test of my manhood
to deparse the interleaved valgrind output from
8 procs as it flies by on the screen ...

Steve

It is a challenging problem to get fix GCMC working in parallel for molecules in a general way. The problems come in dealing with molecules that straddle processor boundaries, and it quickly becomes a communications mess. For starters, I'd be interested in a serial-only capability for doing molecule insertions in a general way. If you have such a code and are willing to share it, I'd like to take a look.

Thanks,

Paul

We developed what we call a variance-constrained semigrand canonical algorithm to treat decompositions in systems with a miscibility gap. It works in parallel. We divide each processor's domain into independent regions, which are sampled independently (see the attached paper). I am not sure whether the same scheme would work for molecules, but in principle I don't see why not.

Hope this helps.
Enrique

Scalable parallel Monte Carlo algorithm for atomistic simulations of precipitation in alloys_Sadigh_PRB85_2012.pdf (537 KB)

We developed what we call a variance-constrained semigrand canonical algorithm to treat decompositions in systems with a miscibility gap. It works in parallel. We divide each processor's domain into independent regions, which are sampled independently (see the attached paper). I am not sure whether the same scheme would work for molecules, but in principle I don't see why not.

enrique,

the problem is not the insertion algorithm; it is how
information about bonded interactions is stored and communicated.
most of lammps was written under the assumption that this
doesn't change, and thus the data is stored in a way that makes
it efficient to access in parallel runs. but when you insert a
molecule in a parallel run, this information needs to be updated or - worst
case - a complete system init would be needed.
this would be like reading in a modified restart, which is a
very non-parallel procedure (and thus not very efficient).

cheers,
    axel.