Greetings,
I am trying to run LAMMPS with the OMP package enabled, but I get the following error on the supercomputer:
[mpiexec@qbc020] fn_kvs_get (pm/pmiserv/pmiserv_pmi_v2.c:299): assert (idx != -1) failed
[mpiexec@qbc020] handle_pmi_cmd (pm/pmiserv/pmiserv_cb.c:49): PMI handler returned error
[mpiexec@qbc020] control_cb (pm/pmiserv/pmiserv_cb.c:286): unable to process PMI command
[mpiexec@qbc020] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@qbc020] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:196): error waiting for event
[mpiexec@qbc020] main (ui/mpich/mpiexec.c:336): process manager error waiting for completion
I compiled LAMMPS after commenting out the MPI_Barrier call in timer.cpp to get asynchronous parallelization. This is my batch file content:
#!/bin/bash
#SBATCH -N 20 # request 20 nodes
#SBATCH -n 120 # 120 MPI processes (6 per node)
#SBATCH -t 10:00:00
#SBATCH -c 8 # 8 CPUs (OpenMP threads) per MPI process
#SBATCH -p workq
#SBATCH -A myAllocation
#SBATCH -o slurm-CH4_wat_MPIOnly.out # optional, name of the stdout file
#SBATCH -e slurm-CH4_wat_MPIOnly.err # optional, name of the stderr file
#module load cuda/10.2.89/intel-19.0.5
#module load lammps/20200303
module load mpich/3.3.2/intel-19.0.5
#module load openmpi/4.0.3/intel-19.0.5
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
#mpirun -np 48 lmp -sf gpu -pk gpu 2 -in runHydr.in
echo $SLURM_NPROCS
echo $OMP_NUM_THREADS
mpirun -np $SLURM_NPROCS /myFolder/build_mpiOpenMP_SyncTiming/lmp -sf omp -pk omp $OMP_NUM_THREADS -in runHydr.in
date
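For reference, here is a quick bash sanity check (not part of the job script) of what the Slurm values above imply per node; the node/task/thread counts are taken from the `#SBATCH` lines, and any mismatch with the hardware's actual cores per node would oversubscribe the nodes:

```shell
#!/bin/bash
# Sanity-check the per-node layout implied by the #SBATCH settings above:
# -N 20 nodes, -n 120 tasks, -c 8 CPUs per task.
nodes=20
ntasks=120
cpus_per_task=8

tasks_per_node=$(( ntasks / nodes ))                 # MPI ranks per node
cpus_per_node=$(( tasks_per_node * cpus_per_task ))  # CPUs needed per node

echo "$tasks_per_node tasks/node, $cpus_per_node CPUs/node"
# prints: 6 tasks/node, 48 CPUs/node
```

So each node must provide at least 48 CPUs for this layout; if the workq nodes have fewer, Slurm (or the MPI launcher) can fail in opaque ways.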