GPU implementation structure

Hello,
I would like to write a GPU implementation for LAMMPS. When I read the existing examples of GPU implementations, I cannot understand where the GPU code actually is.

If we take for instance the following file:

I see:
void PairDPDGPU::cpu_compute()
That seems to be the ordinary CPU implementation, but it is not what gets called when running on the GPU.

The other methods are:
int **dpd_gpu_compute_n
void dpd_gpu_compute

But I cannot find any implementation of those. Could you help me understand how this works and how to write a GPU fix? Where can I find more documentation about GPU implementations?

Thanks

GPU implementation of what?

The crucial hint is in the build instructions, for example here:
https://docs.lammps.org/Build_extras.html#traditional-make

The GPU package has no example of implementing a “fix”. It accelerates only the parts of the computation that benefit most from a GPU, which are the neighbor list build and the computation of the pair style. Other force computations happen concurrently on the CPU. Fixes and computes rarely contribute significantly to the total time, especially for smaller systems. The LAMMPS GPU package allows oversubscribing GPUs, i.e. attaching multiple MPI processes to the same GPU (usually 2-6), to take advantage of the fact that typical GPU compute nodes these days have several times more CPU cores than GPUs.

If you are looking for a way to run the entire computation on the GPU, you should look at the KOKKOS package instead, since it follows a different approach.

There is no detailed documentation of the implementation beyond the publications that LAMMPS lists when running a calculation accelerated by the GPU package, and the source code and README files in lib/gpu.
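
To make the hint from the build instructions more concrete: the GPU package is built against a separate library from lib/gpu, so a pair style source file can declare functions and call them without defining them, and the linker resolves those calls against that library. The following is only a minimal sketch of this "declare here, define in a separately built library" pattern, not the actual LAMMPS code. All names in it (demo_gpu_init, demo_gpu_compute, demo_gpu_clear, PairDemoGPU, run_demo) are hypothetical and merely imitate the naming of the dpd_gpu_* functions quoted above. The "library" part is placed in the same file and runs a toy force loop on the host, purely so that the example compiles on its own.

```cpp
// Sketch only. All names are hypothetical; they imitate the naming pattern
// of the dpd_gpu_* functions quoted in the question and are not LAMMPS APIs.
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Part 1: what a pair style source file would contain ----------------
// Declarations without bodies. The compiler accepts calls to these, and the
// linker later resolves them against a separately built library.
int demo_gpu_init(double cutoff);
void demo_gpu_compute(const std::vector<double> &x, std::vector<double> &f);
void demo_gpu_clear();

// Thin wrapper class: it only forwards work to the library functions.
class PairDemoGPU {
 public:
  explicit PairDemoGPU(double cutoff) { ok_ = (demo_gpu_init(cutoff) == 0); }
  ~PairDemoGPU() { if (ok_) demo_gpu_clear(); }
  void compute(const std::vector<double> &x, std::vector<double> &f) {
    assert(ok_ && x.size() == f.size());
    demo_gpu_compute(x, f);
  }
 private:
  bool ok_ = false;
};

// ---- Part 2: what the separately built library would contain ------------
// Placed in the same file here only so the sketch compiles on its own.
// A real library would launch device kernels; this stand-in runs on the host.
namespace {
double g_cutoff = 0.0;
bool g_active = false;
}

int demo_gpu_init(double cutoff) {
  if (cutoff <= 0.0) return -1;  // non-zero return signals failure
  g_cutoff = cutoff;
  g_active = true;
  return 0;
}

void demo_gpu_compute(const std::vector<double> &x, std::vector<double> &f) {
  assert(g_active);
  // Toy 1D pairwise force: linear repulsion inside the cutoff.
  for (std::size_t i = 0; i < x.size(); ++i) {
    f[i] = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
      if (i == j) continue;
      const double d = x[i] - x[j];
      const double r = d < 0.0 ? -d : d;
      if (r < g_cutoff) f[i] += (g_cutoff - r) * (d < 0.0 ? -1.0 : 1.0);
    }
  }
}

void demo_gpu_clear() { g_active = false; }

// Helper for trying the sketch: returns the forces on three 1D particles.
std::vector<double> run_demo() {
  PairDemoGPU pair(1.0);
  std::vector<double> x = {0.0, 0.5, 2.0};
  std::vector<double> f(x.size());
  pair.compute(x, f);
  return f;
}
```

If Part 2 were moved into its own source file and compiled into a library, Part 1 would still compile unchanged and would only need that library at link time. That is the reason a search through the pair style sources alone turns up declarations and calls but no function bodies.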