[lammps-users] AIREBO not working in parallel

ajay,

please always keep the list in cc, so that others can
see how your problem is resolved.

Hello Axel,
Thank you very much for your prompt reply.
I was using the LAMMPS April 2010 version, which had stalled.
Yesterday I tried the October version and could run in parallel with 8
processors. This is running fine now and has reached more than 100000
timesteps.
Since this is still slow, considering the huge size of my simulation cell, I

well, what people consider a large simulation cell varies.

tried to increase to 16 and 32 processors. The simulation stalled
completely.

that has to be a problem with your local machine and MPI infrastructure.
i just took your input and scaled it easily to over 20 nodes on a myrinet
interconnect.

Regarding the cluster I am using - it is a red hat linux based
cluster. http://www.rcac.purdue.edu/userinfo/resources/coates/

hmm... there is nothing special about it except for the 10GigE (which seems
like an odd choice to me, unless your local MPI implementation does
some TCP/IP bypassing). your best chance is probably to contact
the local support staff and have them help you identify where the job is stalling.

i ran your input over myrinet, infiniband and tcp/ip (plain GigE)
and it all worked for me.

this makes the most likely explanation an issue with your MPI
library or a mis-compiled binary. in either case, there is very little
that can be done remotely.

I actually don't understand your question about addressing the nodes. Maybe
the script file answers your question.

yes. that explains most of it. you are running many tasks
(8) over the same 'wire' and that can put some extra load
on the interconnect. however, can you run other input examples
from the lammps distribution well, e.g. the peptide input or
the rhodo benchmark? those will put the interconnect to the
test.
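
for instance, something along these lines (the executable name, paths, and
mpirun flags below are assumptions depending on your local build and MPI
library, so adjust as needed):

  # run the bundled rhodo benchmark on 16 and 32 cores to stress the interconnect
  cd lammps/bench
  mpirun -np 16 ../src/lmp_linux -in in.rhodo
  mpirun -np 32 ../src/lmp_linux -in in.rhodo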

please also note that AIREBO has been ported to OpenMP,
available from here: http://goo.gl/oKYI
since AIREBO is fairly compute intensive, the OpenMP parallelization
is quite effective.

so for maximum speed with many nodes on the machine you are
running on, you should try out a multi-level openmp+mpi parallel
binary (by replacing airebo with airebo/omp and lj/class2 with
lj/class2/omp in your input) and then run only 2 MPI tasks per
node with 4 OpenMP threads each, as sketched below.
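
to illustrate, a minimal sketch of what that could look like; the pair style
arguments, cutoffs, file names, and mpirun flags here are assumptions, so
adapt them to your actual input and MPI library:

  # in the LAMMPS input: switch to the /omp pair style variants
  # (cutoff values are placeholders - keep the ones from your own input)
  pair_style  hybrid airebo/omp 3.0 lj/class2/omp 10.0

  # at launch time: 2 MPI tasks per node, 4 OpenMP threads per task
  # (-npernode is Open MPI syntax; other MPI libraries use different flags)
  export OMP_NUM_THREADS=4
  mpirun -npernode 2 -np 16 ./lmp_openmpi -in in.airebo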

cheers,
   axel.

Hello Axel,

Thank you very much again. I am going to talk to the support group as you suggested.

Thank you
With Regards
Ajay