[lammps-users] Problems with pair_coeff: Too long of a cut?

Brian_Giera · November 13, 2009, 9:38pm

I’ve recompiled LAMMPS with OpenMPI and now my input files do run with a large cutoff! I did not yet adjust the stack size limit.

However, I get an output message (copied at the end of this e-mail) that indicates “[these problems] may result in lower performance,” which appears to be the case. When comparing the exact same input files run with MPICH-1 and OpenMPI, I see with OpenMPI my code is running an order of magnitude slower.

How do I interpret the error message and what changes can I make to fix the problem?

I can see the light at the end of the tunnel!

Many thanks,

Brian

akohlmey · November 13, 2009, 9:47pm

> I've recompiled LAMMPS with OpenMPI and now my input files do run with
> a large cutoff! I did not yet adjust the stack size limit.

if the jobs work without it, then you don't need to mess with it.
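
For readers who do want to inspect the stack size limit before deciding, here is a minimal Python sketch. It assumes a Unix-like system (the `resource` module is not available on Windows), and the helper names `stack_limits` and `describe_limit` are made up for illustration.

```python
import resource

def describe_limit(value):
    """Render an rlimit value as text; RLIM_INFINITY means no limit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"

def stack_limits():
    """Return the (soft, hard) stack size limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_STACK)

if __name__ == "__main__":
    soft, hard = stack_limits()
    print("soft stack limit:", describe_limit(soft))
    print("hard stack limit:", describe_limit(hard))
```

The soft value is the one that `ulimit -s` reports in bash. Processes started from that shell inherit it.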

> However, I get an output message (copied at the end of this e-mail)
> that indicates "[these problems] may result in lower performance,"
> which appears to be the case. When comparing the exact same input
> files run with MPICH-1 and OpenMPI, I see with OpenMPI my code is
> running an order of magnitude slower.
>
> How do I interpret the error message and what changes can I make to
> fix the problem?

the error message indicates that your OpenMPI is compiled for use
with infiniband, but cannot read the available infiniband libraries
and hence drops down to the next available data transport layer,
which is most likely TCP/IP and that is very slow for a code like
LAMMPS that very much depends on low latency communication.
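
To illustrate why the transport layer matters for small, frequent messages, here is a rough sketch of the common latency-plus-bandwidth cost model. The function names are hypothetical, and the numbers in the usage lines are illustrative assumptions only. They are not measurements from this thread or from any particular cluster.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

def comm_time_per_step(n_messages, size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated communication time for one timestep of n equal-sized messages."""
    return n_messages * message_time(size_bytes, latency_s, bandwidth_bytes_per_s)

if __name__ == "__main__":
    # Hypothetical link parameters, chosen only to show the trend.
    low_latency = comm_time_per_step(6, 8_000, latency_s=2e-6,
                                     bandwidth_bytes_per_s=1e9)
    high_latency = comm_time_per_step(6, 8_000, latency_s=50e-6,
                                      bandwidth_bytes_per_s=1e8)
    print(f"low-latency link:  {low_latency * 1e6:.1f} us per step")
    print(f"high-latency link: {high_latency * 1e6:.1f} us per step")
```

In this model the fixed latency term is paid once per message, so a code that exchanges many small messages every timestep is hit hard when the per-message latency goes up.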

this is an issue that you would have to discuss with the OpenMPI
folks. the error message suggests that the infiniband libraries
are compiled with a newer version of the c-library than your OpenMPI
or that there is some other incompatibility (e.g. 32bit vs. 64bit).
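
One way to rule out the 32bit vs. 64bit possibility is to compare the ELF class of the two libraries. The sketch below is a minimal check, assuming the libraries are ELF files (as on Linux). The function name `elf_class` is made up for illustration, and the library paths to pass in depend on the local installation.

```python
def elf_class(path):
    """Return '32-bit' or '64-bit' for an ELF file, or None if it is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 of the ELF identification (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
    return {1: "32-bit", 2: "64-bit"}.get(header[4])
```

Calling `elf_class` on the OpenMPI library and on the InfiniBand library should give the same answer for both. A mismatch would point to the kind of incompatibility described above.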

i've compiled OpenMPI regularly on infiniband machines and never
come across this issue, so i cannot help you there.

cheers,
axel.

sjplimp · November 16, 2009, 3:55pm

> I see with OpenMPI my code is running an order of magnitude slower.

Use MPICH 1.1 or 2.0 - the latter is what I use on my Linux box.

Steve

Brian_Giera · November 17, 2009, 6:41pm

To clarify, MPICH-1.2.7 seems to be the latest version. When you say MPICH 2.0, which you use on your Linux machine, do you mean MPICH2 (described here: http://www.mcs.anl.gov/research/projects/mpich2/)?

sjplimp · November 18, 2009, 2:56pm

Yes, MPICH 1.2 or MPICH2 should be fine. 2 just adds some
functions that LAMMPS doesn't use.

Steve

Brian_Giera · December 8, 2009, 6:30pm

Whoops, I forgot to mention a while ago that this problem has been resolved!

I basically re-installed MPICH 1.2 and set up my system again so that it no longer uses the replicate command. I have found that my machine generates segmentation faults for this and other systems whenever I use the replicate command. I am not sure whether the issue is specific to my computer or lies in the replicate command itself, but I consider my problem fixed.

Thanks for sticking with me!