Redhat problem

Hi all,

I installed LAMMPS on Red Hat using the rpm, and it was working well in parallel (I use the meam potential). Lately I started increasing the number of atoms to 100,000, but now it uses just one core and no longer runs in parallel. It also doesn't show any error. I use mpirun -np … lmp_g++<file.in. Is there any suggestion to solve this problem? Thanks

Mohamed

Hi all,
I installed the Lammps on the redhat using the rpm

which rpm exactly?

and it was doing great while running in parallel ( I use the meam
potential). lately I started to increase the number of atoms to 100,000,
but the problem is that it just uses one core and cannot run in parallel
anymore, also it doesn't show any error.

how do you determine this?

I use mpirun -np .. lmp_g++<file.in. Is there any suggestion to solve
this problem? Thanks

first you have to convince us that there actually is a problem and show
how exactly it can be reproduced, e.g. by using or modifying the examples
bundled with LAMMPS.

axel.

I installed lammps-centos-rhel-repo-1-2.noarch.rpm in Nov 2014. What I meant is that it was running in parallel, as I could see from the CPU usage: it was using all 16 cores. Now it uses only one core for each simulation with a larger number of atoms. I am running this input:

# ------------------------ INITIALIZATION ----------------------------

units metal
dimension 3
boundary p p p
atom_style atomic

# ----------------------- ATOM DEFINITION ----------------------------

variable latparam equal 4.05
lattice fcc ${latparam}
region whole block 0 20 0 20 0 20
# OR
# region whole block 0 100 0 100 0 100
create_box 1 whole
lattice fcc ${latparam} orient x 1 0 0 orient y 0 1 0 orient z 0 0 1
create_atoms 1 region whole
replicate 1 1 1

# ------------------------ FORCE FIELDS ------------------------------

pair_style eam/alloy
pair_coeff * * Al99.eam.alloy Al

neighbor 2.0 bin
neigh_modify delay 0 every 10 check yes

# ------------------------- SETTINGS ---------------------------------

compute csym all centro/atom fcc
compute eng all pe/atom

I installed lammps-centos-rhel-repo-1-2.noarch.rpm in Nov 2014. I mean it
was great that it was running in parallel as I see from the CPU performance,
it was using the 16 cores, but now it uses one core for each simulation
with larger number of atoms. I am running this code:

that is nonsense. LAMMPS will be launched with as many parallel processes
as you ask it to run. nothing has changed in that regard.

what matters is less your exact input than the command line you use to
start LAMMPS. so what is your command line?

[...]

So when I run it for a block 0 20 0 20 0 20 it uses the number of cores
that I provide. However, when I run for a block 0 100 0 100 0 100 it uses
only one core. I monitor it by opening a new terminal and typing: top then
press 1 so I can see how many cores are used. Thanks

what you are doing is not checking whether mpirun launched the right
number of processes, but rather you are checking the CPU utilization. that
is something very different. you have increased your system size by a
factor of 5 in each direction, which means the total volume and the total
number of atoms have increased by a factor of 125. does your machine
have a sufficient amount of RAM available to support such a calculation?
when i run your input on my laptop (with 4 processes) it consumes a total
of 5GB RAM resident and 6GB address space for the LAMMPS processes combined.

axel.
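
The factor of 125 can be checked with a few lines of arithmetic. This is a minimal Python sketch, not part of the original exchange. It assumes an ideal fcc lattice with 4 atoms per conventional unit cell filling the whole periodic box, as the posted input suggests; the function name fcc_atom_count is made up for illustration.

```python
# Atom counts implied by the two "region whole block" variants in the input,
# assuming an ideal fcc lattice (4 atoms per conventional unit cell) that
# fills the whole periodic box.
ATOMS_PER_FCC_CELL = 4


def fcc_atom_count(nx: int, ny: int, nz: int) -> int:
    """Number of atoms in an nx x ny x nz block of fcc unit cells."""
    return ATOMS_PER_FCC_CELL * nx * ny * nz


small = fcc_atom_count(20, 20, 20)
large = fcc_atom_count(100, 100, 100)

print(f"20 x 20 x 20 cells   : {small:,} atoms")
print(f"100 x 100 x 100 cells: {large:,} atoms")
print(f"ratio                : {large // small}x")
```

Under these assumptions the small box holds 32,000 atoms and the large one 4,000,000, which is considerably more than the 100,000 mentioned at the start of the thread.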

I use: mpirun -np 10 lmp_g++<file.in

at the beginning of the simulation it shows:

LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task
LAMMPS (5 Nov 2014-ICMS)
using 10 OpenMP thread(s) per MPI task

But the CPUs are not used.

I have 128 GB of RAM on this workstation. Do you think that is not enough?

Thanks

Mohamed

But the CPUs are not used.

you have not produced any convincing argument to prove that. on top of
this, your output indicates something even worse. either you have installed
only the serial version of LAMMPS, or you have loaded the wrong mpi
environment module. since the output repeats 10 times, you are actually not
running in parallel, but running 10 copies of a 1 processor run side by side.

this indicates that you need some help from a local HPC expert who can
come and instruct you in person, and review and correct how you are
running parallel calculations.

axel.
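
The "output repeats 10 times" observation can be checked mechanically on a saved log. In a working MPI run, LAMMPS normally prints its version banner once, so several banners in one captured output suggest independent serial copies. This is a minimal Python sketch under that assumption. The helper names count_banners and diagnose are hypothetical, and the sample text simply repeats the two lines quoted in the thread.

```python
import re

# Matches the version banner as quoted in the thread,
# e.g. "LAMMPS (5 Nov 2014-ICMS)".
BANNER = re.compile(r"^LAMMPS \(.+\)$")


def count_banners(log_text: str) -> int:
    """Count version-banner lines in captured LAMMPS screen output."""
    return sum(1 for line in log_text.splitlines() if BANNER.match(line.strip()))


def diagnose(log_text: str) -> str:
    """Give a rough, heuristic reading of the banner count."""
    n = count_banners(log_text)
    if n == 0:
        return "no LAMMPS banner found"
    if n == 1:
        return "one banner: consistent with a single (possibly parallel) run"
    return f"{n} banners: looks like {n} independent copies, not one MPI job"


# Sample rebuilt from the two lines quoted in the thread, repeated 10 times.
sample = (
    "LAMMPS (5 Nov 2014-ICMS)\n"
    "  using 10 OpenMP thread(s) per MPI task\n"
) * 10

print(diagnose(sample))
```

This is only a heuristic: it says nothing about why the copies do not communicate, which is the part that needs a look at the installed LAMMPS binary and the MPI environment.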

If it is not running in parallel, why was it using multiple cores when the number of atoms was smaller and I was using the same input?!

I don't know, since I don't have a crystal ball and cannot know what is going on in your machine. Can you prove that you don't get 10x the output now, when you reduce the system size? There are lots of inconsistencies in the info you provide. The LAMMPS output a few lines further down should confirm my claim when it prints the processor grid. Also, how come you have 10 OpenMP threads active? I simply cannot see how there would be only one processor active in total, and you don't provide any convincing information to the contrary. So far, I have to assume it is your fault and your information is not reliable, hence my suggestion to have someone knowledgeable look over your shoulder.

Axel
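
The processor grid line and the OpenMP thread count mentioned above can be read out of a saved log in a few lines. This is a minimal Python sketch, assuming the grid is reported in the form "P by Q by R MPI processor grid", which is the wording LAMMPS commonly uses but should be verified against your own log. The function names and both sample logs are hypothetical and are not output from the poster's machine.

```python
import re
from typing import Optional, Tuple

# Assumed wording of the grid report; verify against your own log file.
GRID = re.compile(r"^\s*(\d+) by (\d+) by (\d+) MPI processor grid")
# Wording of the thread report as quoted in the thread.
THREADS = re.compile(r"using (\d+) OpenMP thread\(s\) per MPI task")


def processor_grid(log_text: str) -> Optional[Tuple[int, int, int]]:
    """Return the first reported processor grid, or None if absent."""
    for line in log_text.splitlines():
        match = GRID.match(line)
        if match:
            px, py, pz = (int(g) for g in match.groups())
            return px, py, pz
    return None


def mpi_tasks(log_text: str) -> Optional[int]:
    """Number of MPI tasks implied by the processor grid, if reported."""
    grid = processor_grid(log_text)
    if grid is None:
        return None
    px, py, pz = grid
    return px * py * pz


def threads_per_task(log_text: str) -> Optional[int]:
    """OpenMP threads per MPI task, if reported."""
    match = THREADS.search(log_text)
    return int(match.group(1)) if match else None


# Hypothetical logs: one serial copy, and one true 10-task MPI run.
serial_log = (
    "LAMMPS (5 Nov 2014-ICMS)\n"
    "  using 10 OpenMP thread(s) per MPI task\n"
    "  1 by 1 by 1 MPI processor grid\n"
)
parallel_log = (
    "LAMMPS (5 Nov 2014-ICMS)\n"
    "  using 1 OpenMP thread(s) per MPI task\n"
    "  1 by 2 by 5 MPI processor grid\n"
)

print(mpi_tasks(serial_log), threads_per_task(serial_log))
print(mpi_tasks(parallel_log), threads_per_task(parallel_log))
```

A grid of 1 by 1 by 1 in each of ten repeated outputs would confirm ten serial copies. Ten processes that each request 10 OpenMP threads could also put up to 100 threads on 16 cores if threaded styles are in use, so setting the standard OMP_NUM_THREADS environment variable to 1 is worth trying when using MPI alone.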