Problem with GPU acceleration on Windows

Hi all,

I’ve installed LAMMPS on Windows (version: LAMMPS 64-bit 2Aug2023-MSMPI with Python), and installed MPI-CH2 so I can use GPU acceleration (GPU: NVIDIA A2000, CPU: Intel i9-13900K). The problem is that both my own input file and the input file from the LAMMPS benchmarks run much slower with GPU acceleration.

I used “mpiexec -np 16 lmp -in lj.in” for the CPU-only run with multiple MPI processes, and “mpiexec -np x lmp -sf gpu -pk gpu 1 platform 1 -in lj.in” for GPU acceleration, where the speed goes down as the number of MPI tasks sharing the GPU increases. The performance of the Lennard-Jones liquid benchmark is shown below:

1. mpiexec -np 16 lmp -in lj.in

2. mpiexec -np 16 lmp -sf gpu -pk gpu 1 platform 1 -in lj.in

3. mpiexec -np 1 lmp -sf gpu -pk gpu 1 platform 1 -in lj.in

The timings show that the pair calculation takes most of the time with GPU acceleration, so I added “package gpu 0 pair/only on” to the input file. But the run became even slower.

4. mpiexec -np 1 lmp -sf gpu -pk gpu 1 platform 1 -in lj.in (with “package gpu 0 pair/only on” in the input file)

What could be causing this behavior?
I hope someone can help.

Best regards

yun

What kind of CPU and GPU do you have?

GPU: NVIDIA A2000
CPU: Intel i9-13900K
LAMMPS: LAMMPS 64-bit 2Aug2023-MSMPI with Python

There are two issues here:

  • your GPU is at best mid-level
  • you are running a rather small problem

There is a limit to how many CPU processes you can attach to a single GPU and still see improvements. Use too many MPI processes and it will slow down. Since you don’t have a high-end GPU, using 16 MPI processes may be too many.
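One way to find that limit is to repeat the GPU run with a few different process counts and compare the "Loop time" line that LAMMPS prints at the end of each run. The sketch below only builds the command lines, reusing the flags from the original post; the list of process counts and the helper name `gpu_scan_commands` are illustrative assumptions, not a recommendation from the LAMMPS documentation.

```python
# Minimal sketch: build the command lines for a scan over MPI process counts.
# The flags are copied from the original post; the counts (1, 2, 4, 8, 16)
# and the function name are hypothetical choices for illustration.

def gpu_scan_commands(counts=(1, 2, 4, 8, 16), infile="lj.in"):
    """Return one GPU-accelerated command line per MPI process count."""
    return [
        f"mpiexec -np {n} lmp -sf gpu -pk gpu 1 platform 1 -in {infile}"
        for n in counts
    ]

# Print the commands so they can be pasted into a terminal one by one.
for cmd in gpu_scan_commands():
    print(cmd)
```

Running each printed command in turn and noting the loop time should show where adding more MPI processes stops helping on this particular GPU.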

GPUs require a large number of work units to be efficient. The LJ bench example uses only 32000 atoms in its default setup. Try with more by adding -v x 4 -v y 4 -v z 4 to your command line. That will increase the number of atoms to about 2 million, which should give better GPU utilization.
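The arithmetic behind those numbers can be checked with a short sketch. It assumes the stock LJ benchmark input, which fills a box of 20*x by 20*y by 20*z fcc unit cells with 4 atoms per cell; the function name `lj_atom_count` is made up for this example.

```python
# Minimal sketch: atom count of the LJ benchmark for given -v x/y/z factors.
# Assumption: the stock input builds an fcc lattice (4 atoms per unit cell)
# in a box of (20*x) by (20*y) by (20*z) unit cells.

def lj_atom_count(x=1, y=1, z=1, base_cells=20, atoms_per_cell=4):
    """Number of atoms created for the given box scale factors."""
    return atoms_per_cell * (base_cells * x) * (base_cells * y) * (base_cells * z)

print(lj_atom_count())         # default setup: 32000
print(lj_atom_count(4, 4, 4))  # with -v x 4 -v y 4 -v z 4: 2048000
```

Each factor of 4 multiplies the box length in one direction, so the atom count grows by 4 x 4 x 4 = 64, from 32000 to 2048000.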

Since this seems to be a desktop machine, you also need to factor in that the GPU may be used by other applications or by the desktop itself, which will consume part of its resources.

Thanks for your kind reply.
I’ll try a larger problem as you recommended.
As for the GPU usage, my desktop’s CPU has an integrated GPU that is used by other applications, and the NVIDIA GPU has been assigned to LAMMPS, so this is probably not the cause.
Thanks again for your help. I’m running the larger case now.