GTX 670 cards apparently slower than GTX 580s?

Dear LAMMPS users,

I am currently testing GTX 670 cards with LAMMPS.
Quite surprisingly, although these graphics cards have more CUDA cores than the GTX 580 cards I was using before, my simulations are almost twice as slow.

Here is the message I get in my LAMMPS output file:

Christian has experience with this.



In my experience, the GTX 670 should be slower than the GTX 580, although only slightly so in single precision. In double precision it is more than a factor of two slower. The reason it is slower in single precision, even though it is supposed to have more computational power, is its much lower memory bandwidth. You are apparently using mixed precision, so either effect could be the cause. You might also need to run larger systems than with Fermi-generation GPUs to reach maximum performance, because you have many more (although slower) cores.
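The bandwidth argument can be illustrated with a simple roofline-style bound: a kernel cannot run faster than the smaller of the compute peak and the memory bandwidth times its arithmetic intensity. This is only a rough sketch, not a model of LAMMPS itself; the function name and the two sets of card numbers are hypothetical placeholders, not measured GTX 580 or GTX 670 specifications.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline bound: the lower of the compute limit and the memory limit."""
    memory_limit = bandwidth_gb_s * flops_per_byte
    return min(peak_gflops, memory_limit)


# Hypothetical placeholder cards (illustrative numbers only, not real specs).
older_card = {"peak_gflops": 1500.0, "bandwidth_gb_s": 190.0}
newer_card = {"peak_gflops": 2500.0, "bandwidth_gb_s": 150.0}

# Sweep the arithmetic intensity (flops per byte moved from memory).
for intensity in (1.0, 4.0, 16.0, 32.0):
    old = attainable_gflops(flops_per_byte=intensity, **older_card)
    new = attainable_gflops(flops_per_byte=intensity, **newer_card)
    print(f"intensity {intensity:5.1f}: older {old:7.1f}  newer {new:7.1f} GFLOP/s")
```

With these placeholder values the card with the higher compute peak only wins once the kernel does enough arithmetic per byte loaded; at low intensity the card with more bandwidth is faster despite its lower peak.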

Generally, the currently available Kepler-generation GPUs are not well suited for LAMMPS. They are based on the GK104, supposedly a mid-level chip, which has reduced computational capabilities. Towards the end of this year the "big" Kepler will be released, which should be much better suited for LAMMPS. Until then there is no reason to replace existing Tesla C20XX, GTX 580, or GTX 570 cards.

Best regards

-------- Original Message --------

For the single-precision case you are using, I got better performance from the GTX 680 than from an M2090 in almost all cases (see attached image). This is a single 4-core Intel machine with a single M2090 or a single GTX 680 on PCIe 2.0. I also saw an improvement in relative performance at smaller atom counts on Kepler, which is good for strong scaling and for sharing a GPU between MPI processes.

You need to compile the lib/gpu library using:

CUDA_ARCH = -arch=sm_30

If this doesn’t work, let me know.
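For reference, the sm_30 value corresponds to compute capability 3.0 of the GK104 chip, while the Fermi cards mentioned in this thread are compute capability 2.0. The small sketch below just formats the Makefile line from that table; the helper name cuda_arch_line and the dictionary are hypothetical and are not part of LAMMPS or its build system.

```python
# Compute capabilities of the chips named in this thread.
COMPUTE_CAPABILITY = {
    "GF100": (2, 0),  # Tesla C2050 / C2070 (Fermi)
    "GF110": (2, 0),  # GTX 580 / GTX 570, Tesla M2090 (Fermi)
    "GK104": (3, 0),  # GTX 670 / GTX 680 (Kepler)
}


def cuda_arch_line(chip):
    """Return the CUDA_ARCH Makefile line for a chip (hypothetical helper)."""
    major, minor = COMPUTE_CAPABILITY[chip]
    return f"CUDA_ARCH = -arch=sm_{major}{minor}"


print(cuda_arch_line("GK104"))
print(cuda_arch_line("GF110"))
```

A library built for a different architecture than the installed card may not run as intended, so it is worth checking this line after swapping a Fermi card for a Kepler one.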

  • Mike