Dear Axel,
Thanks for your insight and explanations. It's very helpful.
One point I still don't understand: the number of cores for the Tesla
GPUs is relatively low compared to GeForce, and the memory bandwidth and
FLOPS are similar (compare [1] and [2], for example). Based on these
metrics alone, one would imagine the GeForce card to be faster. What is it
about Tesla-based GPUs, therefore, that makes them faster than GeForce
who says that Tesla GPUs are faster than (high-end) GeForce?
for both the "G200" and the "Fermi" generation of nvidia GPUs,
the fastest single GPU GeForce card (GeForce 285 and GeForce 480)
significantly outperformed the corresponding Tesla model (C1060, C2050)
when running MD in single or mixed precision. even for double precision,
the GeForce cards tend to do fairly well, since a lot of the performance
in MD is due not only to the pure floating point performance, but also
to memory bandwidth, so the faster memory can compensate
for the lack of double precision capable floating point units to some degree,
specifically for potentials that don't require a lot of arithmetic.
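to illustrate the bandwidth argument, here is a minimal roofline-style
sketch. it is not a benchmark: the two "cards" and all their numbers are
invented for illustration (they are not specs of any real GeForce or Tesla),
and the function name attainable_gflops is made up as well. the only
assumption is the usual roofline estimate, i.e. a kernel cannot run faster
than either the peak arithmetic rate or the memory bandwidth times the
number of floating point operations it performs per byte moved.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline estimate: capped by arithmetic or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# hypothetical cards; numbers are made up for illustration only
cards = {
    "card A (more DP units, slower memory)": {"peak": 500.0, "bw": 140.0},
    "card B (fewer DP units, faster memory)": {"peak": 170.0, "bw": 180.0},
}

# low intensity ~ cheap potential, high intensity ~ arithmetic-heavy potential
for flops_per_byte in (0.5, 4.0):
    print(f"arithmetic intensity: {flops_per_byte} flop/byte")
    for name, spec in cards.items():
        rate = attainable_gflops(spec["peak"], spec["bw"], flops_per_byte)
        print(f"  {name}: {rate:.0f} GFLOP/s attainable")
```

with these made-up numbers, the card with the faster memory comes out ahead
at low arithmetic intensity, and the card with more double precision units
only wins once the potential needs a lot of arithmetic per byte loaded.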
(especially for single precision, since I understand double precision is not
possible with the latter)?
this is not correct. GeForce GPUs can do double precision, too.
the K10 card is described as having the same architecture as what sits in
the GeForce GTX 680, i.e. GK104 with the same number of cores, and
otherwise looks like a GeForce GTX 690 with more memory and some
gimmicks like ECC enabled.
according to data available on wikipedia, its single vs. double precision
capability appears to be about the same, too, with the GTX 680 being
clocked higher. all the special features that nvidia has announced and
that help with HPC (e.g. to get the extra performance boost with lammps
reported on ORNL's new Titan machine) are apparently reserved for the
K20 tesla GPU (with GK110 architecture).
when comparing prices between GeForce and Tesla, you have to
take a number of issues into consideration:
- tesla cards are built for reliability and for being fully loaded 24/7.
this means: more testing, more expensive components,
more "manual" work (human workforce is what makes
things expensive).
- tesla cards usually have much more RAM
- tesla cards have an extended warranty through nvidia
- tesla cards have better management and monitoring capability
(helps a lot when you are deploying a lot of them)
- tesla cards have (some) gimmicks enabled that GeForce cards don't.
with geforce cards, vendors have much more limited warranties and
often operate on the principle that they produce cheaper and risk
more failures, and then replace what is broken when it is broken, if
the user notices at all. many (memory) errors that are detected in
GPU computing will never show with regular use as a graphics card
(or you may not notice if some pixels are colored wrong).
thus geforce cards can be produced significantly cheaper and you
as a user take on a higher risk and have more manual work. ultimately,
the situation is similar to having nvidia quadro GPUs, which are also
practically the same GPU chips, but with other feature sets enabled,
that are not available in GeForce (and tweaks and optimizations for
Stereo, CAD and related operations that don't matter in games).
in general, you always have to distinguish between an
effective sales pitch and real performance data. in the
IT business (and that applies to the entire supply chain
from hardware to software vendors, from companies that
produce components to system integrators) it is common
practice to deceive customers by carefully omitting
data that would reveal the shortcomings of a specific solution.
axel.