# large cutoff distance in LAMMPS

Hi Steve,

I have a question about simulations with a large cutoff distance. In my system, 90% of the atoms (A) have a cutoff distance of around 1.2 [LJ units]. However, the other 10% (B) have a very large cutoff distance, e.g. 5 [LJ units] between A and B and 10 [LJ units] between B and B.

I use

```
neighbor 2.5 multi
communicate multi
```

in my simulations.
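(For context, a minimal sketch of how such a two-size LJ system might be set up, with type 1 = A and type 2 = B; the pair style, epsilon/sigma values, and cutoffs below are illustrative assumptions, not the actual input:)

```
# hypothetical two-size LJ setup (illustrative values)
pair_style  lj/cut 1.2        # global cutoff, used for A-A
pair_coeff  1 1 1.0 1.0 1.2   # A-A: short cutoff
pair_coeff  1 2 1.0 1.0 5.0   # A-B: large cutoff
pair_coeff  2 2 1.0 1.0 10.0  # B-B: very large cutoff

neighbor    2.5 multi         # per-type neighbor list binning
communicate multi             # per-type ghost-atom cutoffs
```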

However, the simulation still runs very slowly, due to the large cutoff distance between B and B atoms and the communication between processors. Can you give me some suggestions on how to accelerate this kind of simulation?

Thank you very much!

All my best,

Hi

what percentage of the full runtime is spent on communication?

Cheers
Christian

-------- Original Message --------

Hi Chris,

Here is the typical loop-time breakdown from a short run of my simulation. You can see that 'Comm time' is about 63%, and it can be even higher.

```
Pair  time (%) = 3665.35 (12.6495)    Bond  time (%) = 404.367 (1.39552)
Neigh time (%) = 1583.01 (5.46316)    Comm  time (%) = 18300.6 (63.1575)
Outpt time (%) = 125.238 (0.432211)   Other time (%) = 4897.56 (16.9021)
```

hmmm...

how many processors are you using and
what kind of network, if any, is connecting
them?

have you tried using OpenMP + MPI
parallelization?

can you produce an example input?

a lot of the runtime optimizations need
some experimentation.

axel.

Hi Axel,

I am using 40 processors for this kind of simulation. These are very simple systems of polymer chains interacting with nanoparticles via LJ interactions. My LAMMPS executable is compiled with mpi/openmpi-intel-1.3.2.

Per your request, I have attached the input file. The data file for further testing is too large, however, so I will send it to you separately.

Thank you very much!

All my best,

thanks for the input. i experimented a bit with it,
but it looks to me like you are already doing all
the things that can be done.

i would have expected that threading could help
reduce communication and thus speed up your
runs. in fact, it does reduce the communication
cost a little bit, but not as much as i thought,
and then more time is spent in the non-thread-parallel
parts, which eliminates the small benefit.

the only thing that seems to make a bit of a
difference is to adjust the "skin" distance.

with 2.0 instead of 2.5, it runs a little bit faster.
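(i.e., something like the following; the exact skin value is what one would tune by trial runs:)

```
# smaller skin: fewer ghost atoms and less communication,
# at the price of more frequent neighbor-list rebuilds
neighbor 2.0 multi
```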

thanks for providing an interesting input. i'll
have to look at it and profile it some more
to see if there is something else that can
be done.

cheers,
axel.

Hi Axel,

Thanks for your suggestions and your effort on this issue.
I may just have to wait a long time for the simulations to finish.

All my best,

The only thing I have to add is that I would
verify that those two commands are working
correctly on your system by doing short
benchmark runs: one with neither command,
one with just neighbor multi, and one with both.
Just neighbor multi should reduce the neighbor
cost. Both together should also reduce the comm
cost and the number of ghost atoms.
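(The three benchmark variants would look something like the following input fragments, with everything else in the script held identical:)

```
# run 1: baseline (default single-style neighboring and communication)
neighbor    2.5 bin
communicate single

# run 2: multi-style neighbor lists only
neighbor    2.5 multi
communicate single

# run 3: multi-style neighbor lists and communication
neighbor    2.5 multi
communicate multi
```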

If they do that, then they are working correctly,
and as Axel said, there probably isn't much more
you can do. See the 2008 CPC paper on the
citations page for an example of how this worked
for a two-size problem.

Steve