I just wanted some clarification regarding the correct way to run mcsqs in parallel.
I would expect to run, for example: mpiexec -n 16 mcsqs
However, I can see that the SLURM script in the SQS library instead launches several independent processes in a loop:
for (( id=0 ; id<8 ; id++ ))
do
mcsqs -n=40 -ip=$id &
done
wait
What is the correct way of running mcsqs in parallel? Are there any advantages to the latter approach? (I’ll also be running it via SLURM on an HPC cluster, in case that makes any difference.)
To clarify, I’m confused as to how much communication is going on between different mcsqs processes.
My current understanding: with the -ip method, many of the outputs end up with the same cell and objective function, even though the calculations didn’t finish. Since the random seed comes from the system clock, and I’m starting all of these processes at the same time, doesn’t that mean I’m essentially running several identical mcsqs instances, so the extra processes just duplicate the same search instead of exploring different configurations?
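One way I could test the clock-seed hypothesis would be to give each process an explicit, distinct seed. The following is only a minimal sketch: it assumes mcsqs accepts a -sd=<int> seed option, which I have not confirmed for my build (mcsqs -h should list it if it exists). NPROC, BASE_SEED, DRY_RUN and mcsqs_args are names made up for this sketch, not part of mcsqs.

```shell
#!/bin/bash
# Sketch: launch independent mcsqs processes, each with its own index and seed.
# ASSUMPTION: -sd=<int> sets the random seed in this mcsqs build (check `mcsqs -h`).
# NPROC, BASE_SEED and DRY_RUN are hypothetical names used only in this sketch.
NPROC=${NPROC:-8}
BASE_SEED=${BASE_SEED:-1000}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = actually launch them

# Build the arguments for one process: same cell size, distinct index and seed.
mcsqs_args() {
    local id=$1
    echo "-n=40 -ip=$id -sd=$(( BASE_SEED + id ))"
}

for (( id=0 ; id<NPROC ; id++ ))
do
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: show what would be launched without needing mcsqs installed.
        echo "mcsqs $(mcsqs_args "$id")"
    else
        # Unquoted on purpose so the three arguments are split into separate words.
        mcsqs $(mcsqs_args "$id") &
    fi
done
wait
```

With the default DRY_RUN=1 this only prints the eight command lines; setting DRY_RUN=0 launches them in the background as in the original loop. If the outputs still coincide when the seeds differ, then the clock seed is presumably not the cause.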
I found this previous thread: here.
(forwarded by avgjoe)