GPU ID with KOKKOS

You can use the environment variable "CUDA_VISIBLE_DEVICES" to specify
which GPU to use:
http://www.acceleware.com/blog/cudavisibledevices-masking-gpus.

You can check which GPU is being used via "nvidia-smi", but note that
the GPU ID in nvidia-smi may not match what you need to specify with
CUDA_VISIBLE_DEVICES.
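
As a minimal sketch of the masking described above, the snippet below builds an environment for a child process that exposes only selected GPUs. It also sets CUDA_DEVICE_ORDER=PCI_BUS_ID, a CUDA runtime setting that orders devices by PCI bus ID, which is generally the ordering nvidia-smi reports. The helper name gpu_env, the executable name lmp_kokkos_cuda and the input file in.lj are assumptions for illustration only, not names from this thread.

```python
import os


def gpu_env(device_ids, base_env=None):
    """Return a copy of the environment that exposes only the given GPUs.

    gpu_env is a hypothetical helper written for this example.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate devices by PCI bus ID so the numbering follows nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Comma-separated list of the physical GPUs the process may see.
    # Inside the process they are renumbered starting from 0.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


if __name__ == "__main__":
    env = gpu_env([1], base_env={})
    print(env["CUDA_VISIBLE_DEVICES"])
    # To launch LAMMPS with this environment (executable and input names
    # are assumptions), one could call:
    #   subprocess.run(["lmp_kokkos_cuda", "-k", "on", "g", "1",
    #                   "-sf", "kk", "-in", "in.lj"], env=gpu_env([1]))
```

Because the visible devices are renumbered from 0 inside the process, LAMMPS would still see the selected card as device 0.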

Also, for LJ on GPU, you should set the neighbor binsize equal to the ghost
atom cutoff, which is 2.8 for the standard LJ benchmark.
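
The value 2.8 can be reproduced with a short sketch, assuming the standard LJ benchmark uses a force cutoff of 2.5 and a neighbor skin of 0.3, so that the ghost atom cutoff is their sum. The function below is a hypothetical helper that turns this into LAMMPS command-line switches using the "binsize" keyword of "package kokkos". Treat the exact switch combination as an assumption to check against your LAMMPS version.

```python
def kokkos_gpu_args(force_cutoff, skin, gpus_per_node=1):
    """Build LAMMPS command-line switches for a KOKKOS GPU run.

    Hypothetical helper: the binsize is set to the ghost atom cutoff,
    taken here as force cutoff + neighbor skin.
    """
    binsize = force_cutoff + skin
    return [
        "-k", "on", "g", str(gpus_per_node),  # enable KOKKOS with N GPUs per node
        "-sf", "kk",                          # append the /kk suffix to styles
        "-pk", "kokkos", "binsize", f"{binsize:g}",
    ]


if __name__ == "__main__":
    # Assumed standard LJ benchmark settings: cutoff 2.5, skin 0.3.
    print(" ".join(kokkos_gpu_args(2.5, 0.3)))
```

The same value could instead be set in the input script with "neigh_modify binsize 2.8".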

Stan

Thanks Stan and Anders,

Seems my GPU is Device 0 so KOKKOS was picking it up correctly, even though the order was reversed with nvidia-smi as you suggested.

For my initial tests, compared to MPI with 4 cores, I’m getting a 1000 speedup with KOKKOS CUDA (4 OMP threads, 1 MPI task) and a 1600 speedup with the GPU package. I was getting about a 50 speedup with KOKKOS with 4 OMP threads, and I am getting about a 15 speedup with OPT. I’m still testing, though, and still optimizing the options and binsizes as you suggested.

Thanks again for your help.

To answer your other questions:

Also, I'm getting segmentation faults if I try to use more than 1 MPI task
when using a GPU. I've compiled MPI and KOKKOS OMP versions of LAMMPS with
Intel compilers and I've compiled KOKKOS CUDA OMP with GNU compilers (only
because I was getting errors when trying to compile with Intel with KOKKOS
CUDA).

This is typically because your MPI doesn't support GPU Direct. Which
MPI are you using?

I have about 95,000 atoms with a lot of harmonic bonds, angles, and
OPLS torsions, and I am using lj/cut as the base pair potential, which I
think gets turned into a KOKKOS potential. No electrostatics.

Should I expect to get speedup with KOKKOS without a GPU?

If you are comparing OpenMP threads to MPI, then typically no. This is
true of the USER-OMP package as well. If you are comparing OpenMP
threads to serial, then yes. You will need to use a half neighbor list
for CPU since a full neighbor list is default.
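
A minimal sketch of requesting a half neighbor list for a CPU-only KOKKOS run follows. It uses the "neigh half" keyword of "package kokkos" and pairs it with "newton on", a combination that appears in LAMMPS documentation examples for OpenMP runs. The helper name and the thread count are assumptions for illustration.

```python
def kokkos_omp_args(threads):
    """Build LAMMPS switches for a KOKKOS OpenMP run with a half neighbor list.

    Hypothetical helper written for this example.
    """
    return [
        "-k", "on", "t", str(threads),  # enable KOKKOS with N OpenMP threads
        "-sf", "kk",                    # append the /kk suffix to styles
        "-pk", "kokkos",
        "neigh", "half",                # override the default full list
        "newton", "on",                 # assumption: usual pairing with half lists
    ]


if __name__ == "__main__":
    print(" ".join(kokkos_omp_args(4)))
```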

1) I was getting errors when trying to start KOKKOS from a restart file.
I'm not sure if this is because the restart file was written with an October
20 version of LAMMPS vs. a Nov 22 version I compiled with KOKKOS.

Very possible, what is the error?

2) I was originally running lj/cut/opt, and apparently KOKKOS didn't
do anything with it when I used the kk flag. I needed to change to lj/cut so
that KOKKOS recognized it, I guess.

Yes, that is expected.

Stan