[lammps-users] Cannot use fix tune/kspace when running GPU job

Hi,
I am using fix tune/kspace to tune the Coulombic cutoff of the pppm style. When running on the GPU, some of the trial kspace_style choices are not compatible with my current GPU settings. For example, when the MSM style is tried, an error is raised: ERROR: Must use 'kspace_modify pressure/scalar no' with GPU MSM Pair styles. So I want to ask: how can I skip those kspace_style choices that cannot be run?
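A rough sketch of the kind of input involved (the styles, cutoff, and accuracy values are only illustrative, not my exact script):

    package gpu 1
    suffix gpu
    pair_style lj/cut/coul/long 10.0     # becomes lj/cut/coul/long/gpu via the suffix
    kspace_style pppm 1.0e-4
    fix 2 all tune/kspace 100            # periodically tries alternative kspace styles such as msm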

Thanks!

Jichen Li
[email protected]

Hi,
I am using fix tune/kspace to tune the Coulombic cutoff of the pppm style. When running on the GPU, some of the trial kspace_style choices are not compatible with my current GPU settings. For example, when the MSM style is tried, an error is raised: ERROR: Must use 'kspace_modify pressure/scalar no' with GPU MSM Pair styles. So I want to ask: how can I skip those kspace_style choices that cannot be run?

You cannot. Also, I don't think this will yield good results, since running on the GPU incurs additional one-time overhead, not present for non-accelerated styles, that will taint all time measurements taken.
fix tune/kspace is rather old code that predates the GPU package and other acceleration methods. It is also mostly unmaintained, so I would not put much trust in its results on modern hardware with modern toolchains.

Thanks for your kind reply! If fix tune/kspace is deprecated in the future, is there any strategy to replace this command?

Jichen Li
[email protected]


On 12/21/2021 20:35, Axel Kohlmeyer [email protected] wrote:

Thanks for your kind reply! If fix tune/kspace is deprecated in the future, is there any strategy to replace this command?

No. One issue is that there are now many more possible ways to improve performance, and they cannot be optimized as easily as what fix tune/kspace does; its approach is rather crude. Just to name some examples (each is sketched after this list):

  • when running in parallel, there is a benefit to using MPI plus OpenMP instead of MPI alone for parallelization, since KSpace styles cannot be parallelized as well via MPI, and at some point it is more effective to use fewer MPI ranks to reduce the parallel overhead of the 3d-FFTs
  • for the same reason there can be a (significant) benefit, for large systems and many processors, to using the verlet/split run style and having KSpace run concurrently on a separate partition of optimal size (which is usually much smaller than the partition for the rest of the calculation).
  • when running in parallel and on the GPU, it can be beneficial to run KSpace on the CPU concurrently with the pair styles on the GPU (and to increase the Coulomb cutoff).
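For the first point, a minimal sketch of a hybrid MPI + OpenMP run using the OMP package (the rank/thread counts and input file name are placeholders):

    # launch with fewer MPI ranks and several OpenMP threads per rank, e.g.:
    #   mpirun -np 4 lmp -sf omp -pk omp 4 -in in.system
    # equivalent in-script form of the -pk/-sf switches:
    package omp 4
    suffix omp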
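For the second point, a sketch of running KSpace on its own, smaller partition via the verlet/split run style (partition sizes are placeholders; see the run_style verlet/split documentation for the constraints on how the two partitions must be sized):

    # e.g.:  mpirun -np 16 lmp -partition 12 4 -in in.system
    run_style verlet/split   # pair/bonded terms on partition 1, kspace on partition 2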
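For the third point, a sketch of keeping KSpace on the CPU while the pair style runs on the GPU, with a larger Coulomb cutoff to shift work from KSpace to the pair style (cutoff and accuracy values are placeholders):

    package gpu 1
    pair_style lj/cut/coul/long/gpu 12.0   # explicit /gpu variant for the pair style
    kspace_style pppm 1.0e-4               # no suffix, so pppm stays on the CPU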

Also, on modern hardware with TurboBoost or similar features that vary the CPU clock depending on thermal load, timings are not at all reliable unless that kind of feature is turned off (which gives up some of the available performance boost) or the CPUs are properly "pre-heated" (which is difficult to do reliably from within a feature like fix tune/kspace).

Thus the only approach moving forward is to tune this empirically. In my tests I was always able to get much better performance from manually tuned settings than fix tune/kspace could achieve.
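A minimal sketch of such manual, empirical tuning: time short runs over a set of trial Coulomb cutoffs and compare the loop times reported in the log file (the cutoffs, accuracy, and pair_coeff values are placeholders for a real system):

    variable cut index 8.0 10.0 12.0
    label loop
    pair_style lj/cut/coul/long ${cut}
    pair_coeff * * 0.1 3.0        # must be re-specified after each pair_style change
    kspace_style pppm 1.0e-4
    run 2000
    next cut
    jump SELF loop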