Greetings! I am currently using the 9Aug13 version on my HPC cluster with the -partition option. My PBS script is attached. The whole job exits as soon as the fastest replica reaches the end (i.e., only one replica makes it to completion). This does not happen on another HPC system, so I suspect it is platform related. Could anyone point me to the relevant piece of code so that I can try to debug it? Thanks for the help and information.
LC Liu
#!/bin/bash
#PBS -N paraffin
#PBS -o out
#PBS -e err
#PBS -q hp
#PBS -l nodes=9:ppn=8
cd $PBS_O_WORKDIR
echo '======================================================='
echo "Working directory is $PBS_O_WORKDIR"
echo "Starting on $(hostname) at $(date)"
if [ -n "$PBS_NODEFILE" ]; then
    if [ -f "$PBS_NODEFILE" ]; then
        echo "Nodes used for this job:"
        cat "${PBS_NODEFILE}"
        NPROCS=$(wc -l < "$PBS_NODEFILE")
    fi
fi
This way you _force_ all tasks to wait on the barrier before you call
finalize. If MPI_Barrier() doesn't work, you have to sue the provider
for not delivering a working MPI.
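
To make the ordering argument concrete, here is a toy sketch of "barrier before finalize". It is not MPI and not taken from the simulation code: Python threads stand in for the replicas, `threading.Barrier` stands in for `MPI_Barrier(MPI_COMM_WORLD)`, and the "finalize" log entry stands in for `MPI_Finalize()`. The function name `run_replicas` and the sleep durations are made up for illustration.

```python
import threading
import time


def run_replicas(durations):
    """Run one thread per replica and return the ordered event log."""
    n = len(durations)
    barrier = threading.Barrier(n)  # stand-in for MPI_Barrier(MPI_COMM_WORLD)
    log = []
    log_lock = threading.Lock()

    def replica(rank, seconds):
        time.sleep(seconds)  # replicas finish their runs at different times
        with log_lock:
            log.append(("done", rank))
        barrier.wait()  # nobody proceeds until every replica has arrived
        with log_lock:
            log.append(("finalize", rank))  # stand-in for MPI_Finalize()

    threads = [
        threading.Thread(target=replica, args=(rank, seconds))
        for rank, seconds in enumerate(durations)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    # Hypothetical run times: replica 0 is the fastest.
    for event, rank in run_replicas([0.01, 0.05, 0.03]):
        print(event, rank)
```

With the barrier in place, every "done" entry is logged before any "finalize" entry, no matter which replica is fastest. Without it, the fastest replica would reach "finalize" while the others are still running, which is the situation described in the original post.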