problem with no. of processors

Dear lammps-users,
I compiled LAMMPS using Makefile.openmpi without any errors. But
when I run LAMMPS with the command

mpirun -np 4 ./lmp_openmpi <in.chute

it uses only one processor for the simulation. I do not
understand why LAMMPS uses only one processor even though I
requested 4.
I also tried lmp_g++, but got the same result. I am
running LAMMPS on a machine with a 4-core Xeon processor.
Can anyone suggest how to fix this problem?

Thanking you

Shantanu Maheshwari

Dear Shantanu,

I think your command is incomplete: you should add another argument, “-machinefile filename”, to specify the hosts on which your program will run. This is the usual structure for running a program with openmpi. Create a file, e.g. “filename”, and put the name of your computer in it 4 times.

mpirun -machinefile filename -np 4 ./lmp_openmpi <in.chute

Regards,
Mehdi

Dear Mehdi,
Thanks for your help, but I still end up with the same problem. I
wrote my computer name four times in a file, but lammps still
runs on only one processor. Is there anything else to write in
the file apart from the computer name?

Thanking you

Shantanu Maheshwari

Dear Shantanu,

You may take a look at your /etc/hosts to check whether the name of your machine is written correctly. Also, the name of the machine should be written on 4 separate lines. You can find an example at the following link:

http://www.rocksclusters.org/roll-documentation/hpc/5.4/using_openmpi.html

Regards,
Mehdi

Dear Mehdi,
I looked at your link and tried the same thing, but it gives me the
error shown below.
My computer name is stokes, and when I write stokes on four
separate lines it runs on a single processor.
But when I write
stokes-0
stokes-1
stokes-2
stokes-3
It gives me the error:

ssh: Could not resolve hostname stokes-0-3: Name or service not known

shantanu,

if you run locally, you don't need to provide a host file.

you should first explain, how you determine that you are
running on only one processor?

that is _very_ unlikely. mpirun -np 4 will launch 4 copies.

_however_, if your mpirun is not the mpirun that comes with
the mpi library that you compiled against (e.g. OpenMPI vs.
MPICH), then those 4 copies will not be able to talk to each
other. after launching a parallel run, you should check with
pstree or top how many copies are running.

axel.
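
The check with top or pstree can also be scripted. The following is a minimal sketch, assuming a Linux system where pgrep (from procps) is installed; the helper name count_copies is made up for illustration and is not part of LAMMPS or Open MPI.

```shell
#!/bin/sh
# count_copies is a hypothetical helper, not part of LAMMPS or Open MPI.
# It prints how many running processes have exactly the given name.
count_copies() {
    # pgrep -x prints one PID per line for exact name matches;
    # counting the lines gives the number of copies (0 if none).
    pgrep -x "$1" | wc -l | tr -d ' '
}

# Intended use (commented out because it needs a LAMMPS binary):
#   mpirun -np 4 ./lmp_openmpi -in in.chute &
#   sleep 5
#   count_copies lmp_openmpi
```

For a run started with -np 4, a count lower than 4 would suggest that some copies never started or have already exited.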

Dear Axel,
After launching the parallel run I checked with top. Only
one copy is running.

If I run the same command on a different machine, i.e., an 8-core
Intel Xeon, I get 8 copies of lmp_openmpi running.
I also tried mpirun with both openmpi and MPICH; it gives me
the same problem.

Thanking you

Shantanu Maheshwari

this doesn't make any sense.

try pstree as well.

     ├─gnome-terminal─┬─4*[bash───ssh]
     │ ├─6*[bash]
     │ ├─bash─┬─pstree
     │ │ └─vim
     │ ├─bash───mpirun───4*[lmp_openmpi]

what does the lammps output say?
there should be a line like this:

  10 by 2 by 12 processor grid

axel.
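
The processor grid line can be turned into a process count by multiplying its three numbers. This is a small sketch, assuming the log was written to the default file log.lammps and that the grid line has the form shown above; grid_ranks is a hypothetical helper name.

```shell
#!/bin/sh
# grid_ranks is a hypothetical helper: it reads a LAMMPS log file and prints
# the product of the three numbers on the first "processor grid" line.
grid_ranks() {
    # Fields on the line are: Px "by" Py "by" Pz "processor" "grid"
    awk '/processor grid/ { print $1 * $3 * $5; exit }' "$1"
}

# Intended use, assuming the default log file name:
#   grid_ranks log.lammps
```

If the printed number differs from the value given to -np, LAMMPS did not see all the processes that mpirun started.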

Dear Axel,
After running lammps, pstree shows
     ├─gnome-terminal─┬─bash───bash─┬─mpirun───lmp_openmpi
     │ │ └─ssh-agent

And the lammps output contains
1 by 1 by 1 processor grid

Thanking you

Shantanu Maheshwari

and this is with?

mpirun -np 4 lmp_openmpi -in in.myinput -log log.mylog

a.

Dear Axel,
This is with
mpirun -np 4 ./lmp_openmpi <in.chute

Thanking you

Shantanu Maheshwari

please try:

mpirun -V

which should result in, e.g.:

mpirun (Open MPI) 1.4.1

and:

mpirun -display-map -np 4 ./lmp_openmpi

that should print something like this:

======================== JOB MAP ========================

Data for node: Name: fermi Num procs: 4
   Process OMPI jobid: [27522,1] Process rank: 0
   Process OMPI jobid: [27522,1] Process rank: 1
   Process OMPI jobid: [27522,1] Process rank: 2
   Process OMPI jobid: [27522,1] Process rank: 3

Dear Shantanu,

Do you have several MPI implementations installed on your machine at the same time? What is your Linux version?

Regards,
Mehdi

Dear Axel,
For mpirun -V it shows mpirun (Open MPI) 1.4.1
And for mpirun -display-map -np 4 ./lmp_openmpi <in.chute it shows
======================== JOB MAP ========================

Data for node: Name: stokes Num procs: 4
   Process OMPI jobid: [40559,1] Process rank: 0
   Process OMPI jobid: [40559,1] Process rank: 1
   Process OMPI jobid: [40559,1] Process rank: 2
   Process OMPI jobid: [40559,1] Process rank: 3

Dear Mehdi,
I have installed both openmpi and MPICH2 on my system, and my Linux
version is Ubuntu 10.10.
Thanking you

Shantanu Maheshwari

Dear Axel,
For mpirun -V it shows mpirun (Open MPI) 1.4.1
And for mpirun -display-map -np 4 ./lmp_openmpi <in.chute it shows
======================== JOB MAP ========================

Data for node: Name: stokes Num procs: 4
Process OMPI jobid: [40559,1] Process rank: 0
Process OMPI jobid: [40559,1] Process rank: 1
Process OMPI jobid: [40559,1] Process rank: 2
Process OMPI jobid: [40559,1] Process rank: 3

=============================================================
LAMMPS (15 Jan 2010)
LAMMPS (15 Jan 2010)
Reading restart file ...
WARNING: Restart file used different # of processors
orthogonal box = (0 0 0) to (40 20 72.2399)
1 by 1 by 1 processor grid

well, there you go. your lammps binary is not compiled against
OpenMPI and mpirun _does_ launch multiple instances, but most
seem to crash.

the fact that you get to see the LAMMPS version number more than once
is a very bad sign. PEBCAC.

you have to recompile properly.

axel.
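
The repeated version line can be counted automatically. This is a rough sketch, assuming the screen output of the run was redirected to a file (run.out is an arbitrary name); banner_count is a hypothetical helper.

```shell
#!/bin/sh
# banner_count is a hypothetical helper: it prints how many lines of a
# captured output file start with the LAMMPS version banner "LAMMPS (".
banner_count() {
    grep -c '^LAMMPS (' "$1"
}

# Intended use (commented out because it needs a LAMMPS binary):
#   mpirun -np 4 ./lmp_openmpi -in in.chute > run.out 2>&1
#   banner_count run.out
```

A count above 1 is the symptom described above: several independent serial copies each printed their own banner instead of one parallel job printing it once.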

Dear Axel,
Thanks for your valuable suggestions and time. I will recompile
lammps and let you know whether the problem is solved.
Thanking you

Shantanu Maheshwari

Dear Axel,

Would it be better for him to remove MPICH from his machine and compile lammps with openmpi, or does it not matter?

Regards,
Mehdi

it depends. if there is only one MPI library installed, then it is
difficult to mix them up. if i was extremely paranoid, i would
uninstall both and then install only one.

i personally favor OpenMPI for a long list of reasons
(too long to repeat here).

axel.
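
When both libraries are installed, one way to see which one a binary was linked against is to inspect its shared-library list with ldd and compare that with the output of mpirun -V. The sketch below assumes a dynamically linked binary and the library names commonly used by Open MPI (libopen-rte, libopen-pal) and MPICH2 (libmpich); these names vary between versions, so treat the result as a hint only. mpi_flavor is a hypothetical helper.

```shell
#!/bin/sh
# mpi_flavor is a hypothetical helper: it reads `ldd` output on standard
# input and guesses the MPI implementation from the library names it sees.
# The library names are assumptions and differ between MPI versions.
mpi_flavor() {
    awk '
        /libopen-rte|libopen-pal/ { ompi = 1 }
        /libmpich/                { mpich = 1 }
        END {
            if (ompi && mpich) print "mixed"
            else if (ompi)     print "Open MPI"
            else if (mpich)    print "MPICH"
            else               print "unknown"
        }
    '
}

# Intended use (commented out because it needs a LAMMPS binary):
#   ldd ./lmp_openmpi | mpi_flavor
#   mpirun -V
```

If the guessed flavor and the mpirun version disagree, the binary and the launcher come from different MPI installations.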

Dear Axel and Mehdi,
I am thankful to both of you for solving my problem. I removed
MPICH2 from my system and recompiled lammps.
Now it works perfectly with
mpirun -np 4 ./lmp_openmpi <in.chute
Once again thanks for your help.
Thanking you

Shantanu Maheshwari