[lammps-users] LAMMPS on a shared-memory machine

Dear all,

I have a server with 2 CPUs and 4 cores per CPU, and I want to install LAMMPS for parallel computing.

I have installed mpich2-1.0.7 with the ch3:shm or ch3:ssm device for shared memory. There are errors when I build LAMMPS (see below).

How can I deal with this?

If I install mpich2-1.0.7 with the ch3:sock device and use sockets for all communication between processes, or install mpich-1.2.5, the parallel performance is worse than on a single CPU.

Please help. Thanks in advance.

errors:

-o ../lmp_g++
/usr/local/mpi/lib/libmpich.a(ch3_init.o): In function `MPIDI_CH3_Init':
ch3_init.c:(.text+0x766): undefined reference to `shm_unlink'
/usr/local/mpi/lib/libmpich.a(shm_memory.o): In function `MPIDI_CH3I_SHM_Get_mem':
shm_memory.c:(.text+0x68): undefined reference to `shm_open'
collect2: ld returned 1 exit status
make[1]: *** [../lmp_g++] Error 1
make[1]: Leaving directory `/home/tianwd/lammps/lammps/src/Obj_g++'
make: *** [g++] Error 2

This isn't a LAMMPS problem; it's an MPI problem. You'll
need to find a local expert to help you install/test MPI.

Steve

Dear all,

I have a server with 2 CPUs and 4 cores per CPU, and I want to install
LAMMPS for parallel computing.

I have installed mpich2-1.0.7 with the ch3:shm or ch3:ssm device for shared
memory. There are errors when I build LAMMPS (see below).

How can I deal with this?

the IMNSHO best solution is to dump mpich completely and
use OpenMPI instead. it avoids many of the idiosyncrasies
of mpich, and you can dynamically change the transport layer,
boot procedure, and many other details. it provides proper
stack traces on crashes (no more cryptic p4:SIGSEGV messages
and dropped error output). i have even been able to run
an OpenMPI binary on an infiniband machine that was
originally compiled on a myrinet-based cluster. of course
both clusters were using OpenMPI... ;-)
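as an illustration of the run-time transport selection (the component names shown and the `lmp_g++`/`in.lj` file names are examples, not commands from this thread), OpenMPI lets you pick the transport per run from the mpirun command line:

```shell
# list the transport (btl) components compiled into this OpenMPI build
ompi_info | grep btl

# explicitly request the shared-memory ("sm") plus tcp transports for one
# run; the binary and input names are placeholders for your own LAMMPS build
mpirun --mca btl self,sm,tcp -np 8 ./lmp_g++ < in.lj
```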

If I install mpich2-1.0.7 with the ch3:sock device and use sockets for all
communication between processes, or install mpich-1.2.5, the parallel
performance is worse than on a single CPU.

see my previous reply. use a decent benchmark input
and make sure that your machine is actually idle
when running the tests.
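a quick way to do such a test (paths assume the standard LAMMPS source tree with its bench/ directory; a sketch, not commands taken from this thread) is to time one of the bundled benchmark inputs serially and on all 8 cores:

```shell
# run the bundled Lennard-Jones benchmark on 1 core and on 8 cores and
# compare the "Loop time" reported at the end of each run
cd lammps/bench
mpirun -np 1 ../src/lmp_g++ < in.lj
mpirun -np 8 ../src/lmp_g++ < in.lj
```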

Please help. Thanks in advance.

errors:

-o ../lmp_g++
/usr/local/mpi/lib/libmpich.a(ch3_init.o): In function `MPIDI_CH3_Init':
ch3_init.c:(.text+0x766): undefined reference to `shm_unlink'
/usr/local/mpi/lib/libmpich.a(shm_memory.o): In function
`MPIDI_CH3I_SHM_Get_mem':
shm_memory.c:(.text+0x68): undefined reference to `shm_open'
collect2: ld returned 1 exit status
make[1]: *** [../lmp_g++] Error 1
make[1]: Leaving directory `/home/tianwd/lammps/lammps/src/Obj_g++'
make: *** [g++] Error 2

try:

man shm_open

and if you read the manpage you will see that those functions
(on linux or other glibc-based machines) require linking with -lrt.
(one more reason to use openmpi, as it takes care of that as well
if you use the mpiCC wrapper for the compiler).

cheers,
   axel.

Hi Axel,

I have been using SGE and MPICH for my LAMMPS simulations on our clusters,
but next month we will be getting new machines, and I have the opportunity
either to stick with MPICH or to change to OpenMPI. You mentioned that it
would be better to use OpenMPI and dump MPICH completely. If I were to
migrate to OpenMPI, should I stick with SGE, or do you know of a better
"free" grid engine?

Thanks in advance,

Jan-Michael

My vote is for Torque, but that's a big change. Many people use SGE (OpenMPI has launcher support for SGE just like Torque) and are quite happy.

But yes, switch to OpenMPI :-)

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[email protected]
(734)936-1985

Hi Axel,

jan-michael,

I have been using SGE and MPICH for my LAMMPS simulations on our clusters,
but next month we will be getting new machines, and I have the opportunity
either to stick with MPICH or to change to OpenMPI. You mentioned that it
would be better to use OpenMPI and dump MPICH completely. If I were to migrate

yes, that is my personal opinion. i am certain that at least the mpich
developers disagree. ;-)

to OpenMPI, should I stick with SGE, or do you know of a better "free" grid
engine?

that would be basically an independent decision. it has no impact
on the MPI performance. most machines i'm running on use Maui/Torque
or their commercial counterparts, but SGE has been working fine
for me as well. it is more a question of convenience of setup
and flexibility. for the typical workload in our group (a mix of
serial and different-size parallel jobs that are usually broken
into segments of no longer than 24 hours) the default configuration
of maui/torque works very well.

this gives me another opportunity to advertise openmpi. ;-)
in both cases you don't even need ssh/rsh to start your parallel
job. openmpi supports several schemes and that includes SGE and
Torque/PBS to launch parallel jobs. this has several advantages
from the administrative point of view:
- you actually keep track of the cpu time used by the parallel job
(and not the cpu time used by mpirun, which is next to nothing and
thus makes fair-share scheduling impossible unless the scheduler is
configured to use wall time instead of cpu time).
- the batch system has control over all processes, so if you delete
a job (or its wall time expires) they get killed, and you may not
need to write a clean-up script, which can be tricky unless you
give users exclusive access to individual nodes.
- you don't have to worry about making sure that the -np argument
matches the number of processors/cores assigned to the job.
you can run mpirun without -np and it will start one mpi task
per allocated core.
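as a concrete sketch of that last point (the resource values and file names here are invented for illustration, not from an actual script in this thread), a Torque/PBS job script for such a setup can be as short as:

```shell
#!/bin/sh
# request 2 nodes with 4 cores each and a 24-hour wall-time limit
#PBS -l nodes=2:ppn=4
#PBS -l walltime=24:00:00

# run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# no -np needed: an openmpi built with torque support reads the node/core
# allocation from the batch system and starts one mpi task per core
mpirun ./lmp_g++ < in.lj
```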

hope that helps,
    axel.