Multiple LAMMPS threads on Slurm

Dear LAMMPS Users,

I am having trouble running LAMMPS with multiple MPI tasks on our Slurm cluster using the srun command.

In order to work around another issue we were seeing (https://ask.cyberinfrastructure.org/t/running-multiple-lammps-jobs-on-a-node/504/2), we have switched from mpirun to srun to launch LAMMPS.

Now the issue is that when I try to run LAMMPS on 2 cores, the same job runs twice on 1 processor each, instead of running once with 2 MPI tasks.

That is,

With srun -n 2, the output is:

100.0% CPU use with 1 MPI tasks x no OpenMP threads

100.0% CPU use with 1 MPI tasks x no OpenMP threads

With mpirun -np 2, the output is:

100.0% CPU use with 2 MPI tasks x no OpenMP threads


I have checked that:

  1. Other MPI jobs run as expected with both the srun and mpirun commands on Slurm.
  2. LAMMPS was built with openmpi-1.8/gcc, and that is the same MPI version being used to run this LAMMPS simulation.

I still see the same behavior.

I have also tried srun with the -c 2 and --mpi=openmpi options, and varied the --ntasks, --ntasks-per-node, and --cpus-per-task options in Slurm.

Is there something that I am missing?

Can you please suggest how we can get past this?

System information:

CentOS 7

Slurm Scheduler

LAMMPS version - lammps-16Feb16

MPI Version - openmpi-1.8/gcc

Each of our compute nodes has either 20 or 24 cores, and each core can run 1 process.

Complete job script:
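In essence it looks like this (the job name, module line, executable name, and input file below are placeholders standing in for our actual ones):

  #!/bin/bash
  #SBATCH --job-name=lmp-test        # placeholder job name
  #SBATCH --nodes=1
  #SBATCH --ntasks=2                 # request 2 MPI tasks

  # load the same MPI stack that LAMMPS was built with
  module load openmpi-1.8/gcc

  # lmp_mpi and in.test stand in for our executable and input file
  srun -n 2 lmp_mpi -in in.test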

The srun vs. mpirun discrepancy clearly indicates that your srun is not compatible with your OpenMPI installation. mpirun sets environment variables (OMPI_*) that the individual MPI ranks use to determine their rank, communicator, and physical location (i.e., rank on the local node). If you do something like "mpirun -np 2 env", you should see which variables are set, and can then compare with "srun -n 2 env". This has to be an issue of your local cluster or software configuration and not of LAMMPS. Perhaps a typo slipped in somewhere, or some software got out of sync. That is not always easy to figure out, but the symptoms are clear.
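For example, a quick check along these lines (OMPI_* variables are what OpenMPI's mpirun exports; srun --mpi=list is a standard Slurm query):

  # under mpirun, OpenMPI exports per-rank OMPI_* variables
  mpirun -np 2 env | grep ^OMPI_

  # compare with what srun provides (typically SLURM_* variables instead;
  # without working PMI support in OpenMPI, the ranks cannot determine
  # their rank and each one initializes as rank 0)
  srun -n 2 env | grep -E '^(OMPI_|SLURM_PROCID)'

  # list the MPI launch plugins this Slurm installation supports
  srun --mpi=list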

As for the job placement, it is not quite clear from your description whether the intention is to run a bundle of multiple LAMMPS MPI jobs on the same node concurrently, or to run them in sequence. It would seem like a large waste to reserve space for 24 MPI ranks and then use only two of them at a time, running a sequence of 2-CPU jobs.

If the desire is to run them concurrently, it is probably better to use the multi-partition option in LAMMPS. This way you can launch one LAMMPS execution with mpirun -np 24, use the -partition flag to split it into 12 partitions of 2 CPUs each, and then use variables, as described in the LAMMPS manual, to load a different input into each partition.
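A sketch of such a setup (lmp_mpi and the input/data file names are placeholders; the -partition switch and world-style variables are documented in the LAMMPS manual):

  # one launch with 24 ranks, split into 12 partitions of 2 ranks each
  mpirun -np 24 lmp_mpi -partition 12x2 -in in.bundle

with an input file in.bundle along the lines of:

  # world-style variable: one value per partition (12 values, 12 partitions)
  variable idx world 1 2 3 4 5 6 7 8 9 10 11 12
  # each partition reads its own data file (data.1 ... data.12)
  read_data data.${idx}

Each partition then writes its own log file (log.lammps.0, log.lammps.1, and so on).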

If the desire is to run them sequentially, then I would experiment to see whether execution is sped up more by using additional MPI ranks or by a combination of MPI ranks and OpenMP threads (via the USER-OMP, KOKKOS, or USER-INTEL packages).
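For example, with the USER-OMP package compiled in, one could compare runs like these (lmp_mpi and in.test are placeholders; -sf omp and -pk omp are the standard suffix and package switches):

  # pure MPI: 24 ranks on a 24-core node
  mpirun -np 24 lmp_mpi -in in.test

  # hybrid: 12 MPI ranks x 2 OpenMP threads each
  export OMP_NUM_THREADS=2
  mpirun -np 12 lmp_mpi -sf omp -pk omp 2 -in in.test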

axel.