Dear LAMMPS Users,
I am having trouble running LAMMPS with multiple MPI tasks on our Slurm cluster using the srun command.
In order to solve another issue we were seeing (https://ask.cyberinfrastructure.org/t/running-multiple-lammps-jobs-on-a-node/504/2),
we have switched from mpirun to the srun command to run LAMMPS.
The issue now is that when I run LAMMPS on 2 cores, the job is run twice, each copy as an independent 1-task run, instead of once with 2 MPI tasks.
That is, with the srun -n 2 option, the output is:
…
100.0% CPU use with 1 MPI tasks x no OpenMP threads
…
100.0% CPU use with 1 MPI tasks x no OpenMP threads
With mpirun -np 2, the output is:
…
100.0% CPU use with 2 MPI tasks x no OpenMP threads
…
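The difference between the two outputs above can also be checked mechanically. The following is only a minimal sketch in Python: the function names mpi_task_counts and diagnose are made up for illustration, and the pattern assumes the summary line looks exactly like the ones quoted above (other LAMMPS versions may word it differently).

```python
import re

# Assumption: the summary line has the form quoted in this post, e.g.
# "100.0% CPU use with 2 MPI tasks x no OpenMP threads".
TASK_LINE = re.compile(r"CPU use with (\d+) MPI tasks")


def mpi_task_counts(output):
    """Return the MPI task count from every summary line found in the output."""
    return [int(m.group(1)) for m in TASK_LINE.finditer(output)]


def diagnose(output, requested):
    """Classify captured LAMMPS output for a run that requested `requested` tasks."""
    counts = mpi_task_counts(output)
    if not counts:
        return "no summary line found"
    if counts == [requested]:
        # One summary line reporting the requested task count.
        return "single parallel run"
    if requested > 1 and len(counts) == requested and all(c == 1 for c in counts):
        # One summary line per launched process, each reporting 1 task:
        # the processes did not join a common MPI job.
        return "independent serial runs"
    return "unexpected"


if __name__ == "__main__":
    srun_output = (
        "100.0% CPU use with 1 MPI tasks x no OpenMP threads\n"
        "100.0% CPU use with 1 MPI tasks x no OpenMP threads\n"
    )
    print(diagnose(srun_output, 2))
```

Applied to the captured output of a job, this only distinguishes the two cases shown above; it does not say why the srun-launched processes fail to form a single MPI job.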
https://sourceforge.net/p/lammps/mailman/message/35348204/
I have checked that:
- Other MPI jobs run as expected with srun and mpirun commands on Slurm
- LAMMPS was built with openmpi-1.8/gcc and that is the same MPI version that is being used to run this LAMMPS simulation.
I still see the same behavior.
I have also tried srun with the -c 2 and --mpi=openmpi options, and varied the --ntasks, --ntasks-per-node, and --cpus-per-task options in Slurm.
Is there something that I am missing?
Can you please suggest how we can get past this?
System information:
CentOS 7
Slurm Scheduler
LAMMPS version - lammps-16Feb16
MPI Version - openmpi-1.8/gcc
Each of our compute nodes has either 20 or 24 cores, and each core can run 1 process.
Complete job script: