Our group is going to purchase several machines. I have heard a lot
about GPU speed-up and would like to give it a try.
I have read the doc page of the GPU package and searched the mailing
list, but since I do not have sufficient knowledge of this field, I
still do not have a clue how to choose the proper model for us,
considering both price and performance. I found several threads on
related topics on the mailing list, but those mails were posted
several years ago.
My questions are:
- When using GPUs for LAMMPS, it is recommended to use
double-precision calculations. The results would not be very accurate if
single precision is used. Is that right?
it depends. different parts of the computation are affected by this differently.
in MD there is a lot of error cancellation going on. whether you can get away with single precision or not often depends on how much cancellation there is, and that in turn depends on the system size, topology and geometry, and the simulation settings. please note that the GPU package also has the option of mixed precision, where the most precision-sensitive parts of the computation are done in double precision. this can achieve nearly single-precision speed with a significant decrease of errors.
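To make the mixed-precision idea concrete, here is a minimal Python sketch (not LAMMPS code). It assumes that "mixed" means the individual terms are kept in single precision while the running sum is kept in double precision. The helper `to_f32`, the term value 0.1, and the number of terms are made up for illustration only.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(terms, single_accumulator: bool) -> float:
    """Sum the terms, optionally rounding the running total to single precision."""
    total = 0.0
    for t in terms:
        total += t
        if single_accumulator:
            total = to_f32(total)  # emulate a single-precision accumulator
    return total


# Hypothetical workload: many identical small contributions (illustration only).
n_terms = 200_000
term = 0.1
exact = n_terms * term

# In both cases the individual terms are stored in single precision.
single_terms = [to_f32(term)] * n_terms

err_single = abs(accumulate(single_terms, single_accumulator=True) - exact)
err_mixed = abs(accumulate(single_terms, single_accumulator=False) - exact)

print(f"single-precision accumulator error: {err_single:.3e}")
print(f"double-precision accumulator error: {err_mixed:.3e}")
```

In this toy case the single-precision accumulator should end up with an error several orders of magnitude larger than the double-precision accumulator, although the terms themselves are identical. A real MD force sum has terms of both signs, so the actual gap depends on the system, as described above.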
How do I know whether double- or single-precision calculation
should be used in a particular simulation?
this is difficult to tell in general. one thing that is easy to verify is that any variable cell calculation (via fix npt or fix press/berendsen), or any calculation where you need to compute the pressure for your analysis, is better done in double precision, since the stress tensor calculation is particularly sensitive to precision settings, except if you run simulations at very high pressure.
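One plausible way to read the remark about pressure, which the reply itself does not spell out, is in terms of cancellation: if the pressure is the small net result of large contributions with opposite signs, a tiny relative error in each contribution becomes a large relative error in the net value. The sketch below assumes exactly that picture. The function name and all numbers are hypothetical and unitless.

```python
def cancellation_amplification(positive_part: float, negative_part: float) -> float:
    """Factor by which a relative error in the parts can grow in their sum."""
    net = positive_part + negative_part
    return (abs(positive_part) + abs(negative_part)) / abs(net)


SINGLE_EPS = 2.0 ** -24  # unit roundoff of IEEE single precision

# Hypothetical contributions (illustration only, not taken from a simulation).
low_pressure = cancellation_amplification(1000.0, -999.0)     # net value 1
high_pressure = cancellation_amplification(100000.0, -999.0)  # net value 99001

for label, factor in (("low pressure", low_pressure), ("high pressure", high_pressure)):
    # Rough bound on the relative error of the net value in single precision.
    print(f"{label}: amplification {factor:.2f}, "
          f"relative error up to about {factor * SINGLE_EPS:.1e}")
```

Under this assumption the low-pressure case loses about three more decimal digits than the high-pressure case, which would be consistent with the exception for very high pressure mentioned above.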
- Is it right to select the GPU model mainly based on the rank of
no. other factors matter as well: the memory bandwidth (clock and bus width), the amount of RAM, the GPU generation, and how the GPU is connected to the CPU.
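The "clock and bus width" remark can be turned into a back-of-the-envelope estimate: the theoretical peak memory bandwidth is the per-pin data rate multiplied by the bus width. A minimal sketch follows. The function name and the example figures are illustrative assumptions, not the specification of any particular card, and sustained bandwidth in a real application will be lower than this peak.

```python
def peak_memory_bandwidth_gb_s(data_rate_gbit_per_pin: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s: per-pin data rate times bus width, in bytes."""
    return data_rate_gbit_per_pin * bus_width_bits / 8.0


# Hypothetical memory configurations (illustration only).
narrow_bus = peak_memory_bandwidth_gb_s(14.0, 256)
wide_bus = peak_memory_bandwidth_gb_s(14.0, 384)

print(f"256-bit bus at 14 Gbit/s per pin: {narrow_bus:.0f} GB/s")
print(f"384-bit bus at 14 Gbit/s per pin: {wide_bus:.0f} GB/s")
```

Two cards with the same memory clock can therefore differ substantially in bandwidth, which is one reason a single ranking number is not enough.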
- I found several articles saying that GPUs for the consumer market, like
the RTX 2080 Super, have a much better price-to-performance ratio:
although the Tesla series is several times more expensive, the increase
in speed-up is much smaller. Does that claim make sense for LAMMPS?
this primarily applies to calculations in single or mixed precision. most consumer GPUs are (deliberately) crippled in double precision performance, in either hardware or driver support, to “encourage” folks to buy the significantly more expensive tesla models (or the even more expensive quadro models). there is significant resentment building in the community of GPU accelerated MD users (and developers of some GPU MD codes) because of that, and some developers are actively advocating to stay away from nvidia hardware. technically, the GPU package is capable of supporting non-nvidia GPUs when compiled in OpenCL mode, but there have been recent reports of incompatibilities between the OpenCL implementations of some vendors and the code in LAMMPS. also, there is some uncertainty about how far the support for non-CUDA GPU acceleration in KOKKOS has progressed. neither OpenCL on non-nvidia GPUs nor KOKKOS on non-CUDA GPUs is currently part of the LAMMPS testing.
Finally, it would be much appreciated if you could recommend a GPU model
with a good price-to-performance ratio for us. We are going to purchase
machines that ship with the SuperMicro X11DAi-N motherboard.
sorry, there is no easy choice. i would recommend that you either use somebody else’s machine for testing or get a machine on loan to run some tests for your specific problems. how well a specific GPU works for a particular application is very difficult to predict without knowing the exact use case and the required settings (see my discussion above). another issue is that limitations of the mainboard, case, and power supply often restrict what kind of GPU (and how many) can be installed. another factor is the availability of technical skills: using GPUs well requires more knowledge about compiling and running applications and about installing and maintaining tools and drivers than using CPU-only machines does. with the availability of CPUs with increasing numbers of cores per socket, the price/performance gap between GPUs and CPUs has been narrowing, even for cases that are well suited for GPU acceleration.