GPU with Pair style hybrid

Hello,

Are there plans to allow building neighbor lists on GPUs when using pair_style hybrid?

I want to carry out a multi-million-atom simulation. When I run with a single force field, it works great. When I run with hybrid, I get the error that neighbor lists cannot be built on the GPU, so I build the neighbor lists on the CPU instead.
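For reference, this is roughly how I force the neighbor build onto the CPU (the GPU count per node here is just a placeholder for my actual setting):

  package gpu 1 neigh no
  suffix gpu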

However, when I build the neighbor lists on the CPU, I cannot get past ~2 million atoms before the simulation just hangs, most likely because of RAM (64 GB per node).

Is there any way around this, or is there any chance that support for the hybrid pair style will be added to the GPU package?

Ben

Hello,

Are there plans to allow building neighbor lists on GPUs when using pair_style hybrid?

I want to carry out a multi-million-atom simulation. When I run with a single force field, it works great. When I run with hybrid, I get the error that neighbor lists cannot be built on the GPU, so I build the neighbor lists on the CPU instead.

However, when I build the neighbor lists on the CPU, I cannot get past ~2 million atoms before the simulation just hangs, most likely because of RAM (64 GB per node).

That is quite unlikely to be the reason, as neighbor lists on the CPU generally require less RAM than on the GPU.

Is there any way around this, or is there any chance that support for the hybrid pair style will be added to the GPU package?

Since you provide no tangible information to support your claim, and no specifics about your setup, there is no way to answer the first part of your question.

I am not aware of any plans to have pair style hybrid work with neighbor list building on the GPU, as it a) applies only to the limited case where *all* potentials can run on the GPU (otherwise building the neighbor lists on the CPU is *required*), and b) requires doing something rather complex that is difficult to do efficiently on the GPU.
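The usual approach is therefore to keep the pair computation on the GPU but the neighbor build on the CPU, e.g. with something along these lines (binary name, MPI rank count, GPU count, and input file name here are placeholders):

  mpirun -np 4 lmp -sf gpu -pk gpu 1 neigh no -in in.hybrid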

axel.

Axel,

LAMMPS version July 30th, 2016

Using this pair_style:

pair_style hybrid tersoff/zbl zbl 0.5 2
pair_coeff * * tersoff/zbl …/…/…/…/potentials/SiC.tersoff.zbl Si NULL
pair_coeff 1 2 zbl 14 31
pair_coeff 2 2 zbl 31 31

The only differences between this simulation that hangs and the one that did not hang are:

Simulation that hangs has:

  1. Neighbor lists built on CPU
  2. Newton pair off

Other than that, the scripts are identical.

(And the potential is different.) Sorry, the other simulation does not use the hybrid pair_style; this one does. That is why I was able to run the other one with those settings.

OK, so I found the problem. With my old executable, I was able to run tersoff/zbl/gpu with newton on. For some reason, with my new executable, I have to run with newton off. Has something been updated that requires this, or could this somehow be an artifact of a bad executable?

The 2015 version of LAMMPS that I had ran the code fine with Newton on. I run the 2016 version with the exact same input script, just a new executable, and I have to run with Newton off.

Why could this be?
Sorry for all the emails, but I think this is the problem; I am just not sure how to fix it.
Ben

OK, so I found the problem. With my old executable, I was able to run tersoff/zbl/gpu with newton on. For some reason, with my new executable, I have to run with newton off. Has something been updated that requires this, or could this somehow be an artifact of a bad executable?

The 2015 version of LAMMPS that I had ran the code fine with Newton on. I run the 2016 version with the exact same input script, just a new executable, and I have to run with Newton off.

Why could this be?

It is *impossible* to use tersoff/zbl/gpu with an executable from 2015, since it was only added to LAMMPS in April 2016.

Interesting point. In my script, I just have tersoff/zbl, but I guess because I was running on a GPU, it assumed it should be tersoff/zbl/gpu?

Is there a way to enforce that it is just tersoff/zbl?

When I run with tersoff/zbl and with newton on, I get this error:

ERROR: Pair style tersoff/zbl/gpu requires newton pair off (…/pair_tersoff_zbl_gpu.cpp:153)

How can I specify tersoff/zbl without assuming it is tersoff/zbl/gpu?

And when I try turning newton off and running with just 1 MPI rank, I get the following message:

ERROR: Insufficient memory on accelerator (…/gpu_extra.h:38)

Any suggestions?

Note that the simulation does not hang for 1-2 million atoms, but above that it does.

Additionally, how do I reconcile this:

This pair style requires the newton setting to be “on” for pair interactions.

But then when I run with newton on, the GPU version says I have to run with newton off?

Interesting point. In my script, I just have tersoff/zbl, but I guess because I was running on a GPU, it assumed it should be tersoff/zbl/gpu?

Is there a way to enforce that it is just tersoff/zbl?

See the docs for the suffix command.
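For example, something along these lines (a sketch of what the suffix docs describe) forces the plain CPU variant even when running with the -sf gpu command-line flag:

  suffix off
  pair_style hybrid tersoff/zbl zbl 0.5 2
  suffix on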

If you *want* to run on the GPU, then you need to run with a version of LAMMPS released after July 1st.

When I run with tersoff/zbl and with newton on, I get this error:

ERROR: Pair style tersoff/zbl/gpu requires newton pair off (../pair_tersoff_zbl_gpu.cpp:153)

How can I specify tersoff/zbl without assuming it is tersoff/zbl/gpu?

Read the documentation AND THINK LOGICALLY. (This has been your biggest problem since you first posted questions on lammps-users: you almost never think things through, you guess - often wrongly - and you do not seem to make any attempt to verify or validate your claims, but rather "assume".) If you specify the suffix flag, LAMMPS will try to append the suffix to any style that supports it. This is all very well explained in the documentation. If you read *all* of the relevant parts, think carefully about what is said, and run some simple tests to confirm, you would not keep facing such issues over and over again.

And when I try turning newton off and running with just 1 MPI rank, I get the following message:

ERROR: Insufficient memory on accelerator (../gpu_extra.h:38)

That error is self-explanatory.
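If a single GPU does not have enough memory for a system of this size, the usual remedy is to distribute the atoms over more GPUs/nodes; a hypothetical sketch, with the rank and GPU counts as placeholders:

  mpirun -np 8 lmp -sf gpu -pk gpu 2 neigh no -in in.hybrid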

Any suggestions?

Note that the simulation does not hang for 1-2 million atoms, but above that it does.

At this point we have a total mess of options, and it is not clear what specific setup and settings this corresponds to. Please provide a simple summary explaining which combinations of GPU vs. CPU, MPI settings, and newton on or off work and which do not, and ideally provide (simple) examples to reproduce this. We also need to know the LAMMPS version and compilation settings. And keep in mind that nobody will make an effort to debug an issue that cannot be reproduced with the latest development version of LAMMPS.
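A stripped-down reproducer could look something like the following; the lattice, box size, and masses are placeholder guesses, since we do not know your actual system:

  units metal
  lattice diamond 5.431
  region box block 0 50 0 50 0 50
  create_box 2 box
  create_atoms 1 box                                     # type 2 atoms omitted in this sketch
  mass 1 28.0855                                         # Si (Z = 14)
  mass 2 69.723                                          # Ga (Z = 31)
  pair_style hybrid tersoff/zbl zbl 0.5 2
  pair_coeff * * tersoff/zbl SiC.tersoff.zbl Si NULL     # path to potential file
  pair_coeff 1 2 zbl 14 31
  pair_coeff 2 2 zbl 31 31
  run 0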

axel.

Additionally, how do I reconcile this:

This pair style requires the newton setting to be “on” for pair interactions.

But then when I run with newton on, the GPU version says I have to run with newton off?

You cannot, at least not without doing some C++ programming. You are using two pair styles with conflicting requirements.
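To spell the conflict out, based on the docs quote and the error message above:

  newton on     # required by pair_style tersoff/zbl on the CPU
  newton off    # required by pair_style tersoff/zbl/gpu

Both cannot be satisfied in the same run.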

axel.