Request for Tersoff input to test GPU Implementation

Dear LAMMPS users,

I have a working implementation of the Tersoff potential for the USER-CUDA package. While I still need to change some things to make use of some of the performance-improving capabilities of the GPUs, I'd like to test the current implementation with some real-world input scenarios in order to find and eliminate potential bugs.
I would greatly appreciate your help.

What I need:

- input scripts + the necessary data files using the Tersoff potential (a minimal sketch of what I mean follows after this list)
- hybrid potentials can NOT be used (so far)
- small examples that can be scaled to larger system sizes are best
- but large input systems are fine as well
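
Something along these lines would already be useful; this is only a minimal sketch of a bulk-silicon system (Si.tersoff is the parameter file shipped in the LAMMPS potentials directory, and the box size is a placeholder that can be scaled up):

# minimal bulk-Si Tersoff test (sketch; box size is a placeholder)
units           metal
boundary        p p p
atom_style      atomic

lattice         diamond 5.431
region          box block 0 4 0 4 0 4
create_box      1 box
create_atoms    1 box
mass            1 28.0855

pair_style      tersoff
pair_coeff      * * Si.tersoff Si

velocity        all create 1000.0 87287 loop geom
fix             1 all nve
timestep        0.001
thermo          100
run             1000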

Best regards
Christian

have you tried the one here?
http://lammps.sandia.gov/bench.html#potentials

i am also attaching a set of similar inputs that i'm
using to validate the OpenMP variants of
the two tersoff pair styles. those have some
atom vacancies to enforce asymmetries and
avoid some error cancellation through symmetry.
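
for illustration, a fragment along these lines (region name and placement here are just examples) is all it takes to carve a small vacancy cluster out of an otherwise perfect lattice so that forces no longer cancel by symmetry:

# remove the atoms inside a small sphere to break the lattice symmetry
region          vac sphere 4.0 4.0 4.0 1.0 units lattice
delete_atoms    region vac compress yes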

hope that helps,
     axel.

lammps-tersoff-test.tar.gz (5.55 KB)

Question to Axel …

Are GPUs really so efficient? I still have my doubts about their superiority over the CPU ...
A friend just bought a bunch of GPUs and wants to test their efficiency. He wants to run my simulations of nanowires using the EAM potential, and he claims that thanks to the GPU it is now feasible to run MD simulations on time scales of microseconds, i.e. strain rates on the order of 10^-3 (or at least that is what I understood). My humble opinion is that game companies such as NVIDIA are paying the programmers a lot of money to make an effort and move the actual source code onto the GPU.

Oscar G.

Question to Axel ...

Are GPUs really so efficient? I still have my doubts about their superiority
over the CPU ...

based on what?

whether a GPU based code or a CPU kernel is faster depends a _lot_ on the boundary conditions. the problem with GPUs is that one has to write code differently to take maximum advantage of the available resources on a GPU. perfect code for GPUs has to be highly concurrent and overlap memory accesses with computation in order to hide memory access latencies. if there is high arithmetic complexity, like in many-body potentials, the potential for achieving good acceleration is high. for other scenarios, especially with a small number of atoms per GPU, you may even see a slowdown.

if you want the absolute best performance and money is no issue, then you will currently still be able to run LAMMPS faster on an absolute scale on large CPU-based machines. but even on those machines, there will be more and more problems with the increasing number of CPU cores per socket, which will require a different way of programming to make efficient use of them, and _tadaa_, most of the time that is very similar to how you have to program a GPU to use it well. so even if not everybody benefits from it, it is very important to start the development and testing early, so that everybody has access to a mature implementation by the time there is no alternative left.

A friend just bought a bunch of GPUs and wants to test their efficiency. He wants to run my simulations of nanowires using the EAM potential, and he claims that thanks to the GPU it is now feasible to run MD simulations on time scales of microseconds, i.e. strain rates on the order of

that is ridiculous, and it is the biggest problem with GPUs: people who don't understand what they can and cannot do. never ever trust any marketing material from a vendor. there is always a way to show gigantic acceleration; all you have to do is use an example where the CPU version sucks. that being said, with a smart setup of the hardware and the right application, a speedup of one to two orders of magnitude is doable.

but that requires high-end GPU equipment. a low-end laptop GPU with two multiprocessors and slow memory is unlikely to provide much acceleration, if any, even though it is technically capable of running the GPU code. but this is like comparing a current x86_64 processor with an intel 386 CPU.

10^-3 (or at least that is what I understood). My humble opinion is that game companies such as NVIDIA are paying the programmers a lot of money to make an effort and move the actual source code onto the GPU.

on what do you base such a statement?

neither is nvidia "a game company", nor do i or people like christian or mike get paid by nvidia to program and maintain GPU-accelerated code in lammps.

nvidia _does_ sponsor some developers with access to GPU hardware and on occasion helps foster GPU code development by having staff developers contribute know-how and the occasional subroutine or accelerated kernel, and by facilitating communication between developers through a variety of means.

however, that in no way constitutes what you are implying in your statement, so i ask you to please provide proof or retract it.

thanks,
    axel.

Wow ... amazing explanation ... speechless ... and I retract the "big money interests" commentary. As Axel wrote: "nvidia _does_ sponsor some developers" (who? I don't care), and please don't take it personally. I have no idea who Christian or Mike are, and I never said that NVIDIA sponsors LAMMPS to program and maintain GPU-accelerated code in LAMMPS (did I?). I was just wondering whether GPUs are really so efficient; there are many rumors you can find on the internet, most of them marketing material from vendors whose ultimate goal is to make a profit.

Cheers
Oscar G.

Hi

As the author of the USER-CUDA package, and as someone who has been using GPUs for the last 4 years to solve scientific problems, I can assure you that while there is a lot of marketing hype, GPUs can be much more efficient than CPUs under the right circumstances (which are not that rare in many physics simulations).

For my main MD problem, a GTX 470 is about 15 times faster than an i7 950 @ 3 GHz (using all cores of the CPU). Granted, this compares single precision [which is often enough in MD if your system is not too large] against double precision, but even accounting for that, a factor of 7 or so remains.
So the 10 dual-GPU workstations I used for my simulations effectively gave me the compute power of 70/150 [double/single precision] conventional dual-socket nodes (20 GPUs at roughly 7x/15x a quad-core CPU each, counted against two sockets per node).

And no, I didn't get any money from NVIDIA (though they have now provided me with two of their Tesla-line GPUs to support further optimization for their professional compute GPUs).

So if you hear someone claiming effective speedups of somewhere between 1 GPU = 10 CPU cores and 1 GPU = 100 CPU cores, that is more often than not a valid comparison. Numbers above that are usually the result of a poor CPU implementation or of outright errors.

Cheers
Christian

-------- Original Message --------

Wow ... amazing explanation ... speechless ... and I retract the "big
money interests" commentary. As Axel wrote: "nvidia _does_ sponsor some
developers" (who? I don't care), and please don't take it personally. I

it is not personal, but unsubstantiated statements like yours can easily go viral and be as bad as the (sadly) typical over-hyping by many vendors. there also is a _big_ difference between getting access to sponsored hardware and being on a (big) paycheck.

many people at nvidia and at several hardware vendors will be able to confirm that i am highly critical of their sometimes misleading descriptions of what GPUs can do, and that i keep insisting this is the wrong strategy for pitching their hardware to scientists.

particularly after they got the process started so well by providing easy-to-use, very affordable (free of charge), and very well documented and supported tools (like CUDA).

common sense tells us that there is no magic device that can solve all problems instantly, and that any type of disruptive technology will take a while to be adopted, particularly in science, where people first and foremost worry about getting an accurate (enough) answer rather than about getting it quickly.

have no idea who Christian or Mike are, and I never said that NVIDIA

you should read http://lammps.sandia.gov/authors.html
and http://lammps.sandia.gov/doc/Section_accelerate.html

sponsors LAMMPS to program and maintain GPU-accelerated code in LAMMPS (did
I?). I was just wondering whether GPUs are really so efficient; there are many
rumors you can find on the internet, most of them marketing material from
vendors whose ultimate goal is to make a profit.

there also have been detailed discussions on this very mailing list, and there are published benchmarks in scientific journals. granted, even there people usually show the strong parts of their implementations, but if that happens to be exactly what you need, then you _will_ benefit.

you can test this for yourself at fairly low financial risk by purchasing a decent, CUDA-capable GPU, e.g. a GeForce GTX 580 or a GeForce GTX 560 Ti, and running some tests. in the worst case, you have gained a powerful GPU for visualization.

there are also options to do a "GPU test drive"
through certain vendors with GPU cluster expertise.

there is nothing better to dispel myths than to put them to the test.
http://www.nvidia.com/object/nvision08_gpu_v_cpu.html

:wink:

cheers,
    axel.

Adam Savage and Jamie Hyneman

TV celebrities and scientists Jamie Hyneman and Adam Savage, world renowned for their work as hosts of the MythBusters television show, have built a one-of-a-kind, never-before-seen, awe-inspiring machine that demonstrates the difference between a GPU and a CPU. What transpired involved two robots, thousands of paintballs, and the Mona Lisa herself. You've got to see it to believe it!

NO comments about that …
Oscar G.