You can use the environment variable "CUDA_VISIBLE_DEVICES" to specify
which GPU to use:
http://www.acceleware.com/blog/cudavisibledevices-masking-gpus.
You can check which GPU is being used via "nvidia-smi", but note that
the GPU ID in nvidia-smi may not match what you need to specify with
CUDA_VISIBLE_DEVICES.
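
As a minimal sketch of the above: the device index 0 is only an example, and CUDA_DEVICE_ORDER is an addition that was not mentioned in this thread. Setting CUDA_DEVICE_ORDER=PCI_BUS_ID asks the CUDA runtime to enumerate devices in PCI bus order, which should line up with the IDs that nvidia-smi prints.

```shell
# Enumerate GPUs in PCI bus order so CUDA indices match nvidia-smi.
# (Not part of the original advice; the CUDA default is "fastest first".)
export CUDA_DEVICE_ORDER=PCI_BUS_ID

# Expose only one GPU to the application (index 0 is an example).
export CUDA_VISIBLE_DEVICES=0

# List the GPUs the driver sees, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || true
fi

echo "visible devices: $CUDA_VISIBLE_DEVICES"
```

With only one device exposed this way, the application sees it as device 0 regardless of its ID in nvidia-smi.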
Also, for LJ on the GPU, you should set the neighbor binsize equal to the
ghost atom cutoff, which is 2.8 for the standard LJ benchmark.
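
One way to pass that bin size is through the KOKKOS package options on the command line. This is a sketch only: the executable name "lmp" and the input file "in.lj" are assumptions, and the command is printed rather than run. The value 2.8 corresponds to the benchmark's 2.5 force cutoff plus its 0.3 neighbor skin.

```shell
# Hypothetical names: "lmp" is the LAMMPS executable, "in.lj" the input script.
# -k on g 1 t 4   : enable KOKKOS with 1 GPU and 4 threads
# -sf kk          : apply the kk suffix to styles that support it
# -pk kokkos ...  : package kokkos options; binsize 2.8 = 2.5 cutoff + 0.3 skin
LMP_GPU_CMD="mpirun -np 1 lmp -k on g 1 t 4 -sf kk -pk kokkos binsize 2.8 -in in.lj"

# Print the command so it can be inspected before running it.
echo "$LMP_GPU_CMD"
```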
Stan
Thanks Stan and Anders,
It seems my GPU is device 0, so KOKKOS was picking it up correctly, even though the order is reversed in nvidia-smi, as you suggested.
In my initial tests, compared to MPI with 4 cores, I'm getting a 1000 speedup with KOKKOS CUDA (4 OMP threads, 1 MPI task) and a 1600 speedup with the GPU package. I was getting about a 50 speedup with KOKKOS with 4 OMP threads, and I am getting about a 15 speedup with OPT. I'm still testing, though, and still optimizing the options and binsizes as you suggested.
Thanks again for your help.
To answer your other questions:
Also, I'm getting segmentation faults if I try to use more than 1 MPI task
when using a GPU. I've compiled MPI and KOKKOS OMP versions of LAMMPS with
Intel compilers and I've compiled KOKKOS CUDA OMP with GNU compilers (only
because I was getting errors when trying to compile with Intel with KOKKOS
CUDA).
This is typically because your MPI doesn't support GPU Direct. Which
MPI are you using?
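
If the MPI library happens to be Open MPI (an assumption, since the thread does not say which MPI is in use), a rough way to check whether it was built with CUDA support is to query ompi_info. Other MPI implementations report this differently, so treat this as a sketch rather than a general test.

```shell
# Assumes Open MPI; the check does not apply to other MPI libraries.
if command -v ompi_info >/dev/null 2>&1; then
    # Prints a line ending in ":value:true" or ":value:false" when available.
    cuda_aware=$(ompi_info --parsable --all | grep "mpi_built_with_cuda_support:value" || true)
else
    cuda_aware=""
fi

# Fall back to "unknown" when ompi_info is missing or reports nothing.
cuda_aware=${cuda_aware:-unknown}
echo "CUDA-aware MPI: $cuda_aware"
```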
I have about 95,000 atoms with a lot of harmonic bonds, angles, and
OPLS torsions, and I am using lj/cut as the base potential, which I
think gets turned into a KOKKOS potential. No electrostatics.
Should I expect to get speedup with KOKKOS without a GPU?
If you are comparing OpenMP threads to MPI, then typically no. This is
true of the USER-OMP package as well. If you are comparing OpenMP
threads to serial, then yes. You will need to use a half neighbor list
on the CPU, since a full neighbor list is the default.
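
For a CPU-only KOKKOS run, the half neighbor list can be requested through the package options. This is a sketch under the same assumptions as any command line here: "lmp" and "in.lj" are hypothetical names, the thread count of 4 matches the tests described in this thread, and defaults may differ between LAMMPS versions. The command is printed rather than run.

```shell
# Hypothetical names: "lmp" is the LAMMPS executable, "in.lj" the input script.
# -k on t 4             : enable KOKKOS with 4 OpenMP threads and no GPU
# -sf kk                : apply the kk suffix to styles that support it
# -pk kokkos neigh half : request a half neighbor list instead of a full one
LMP_CPU_CMD="mpirun -np 1 lmp -k on t 4 -sf kk -pk kokkos neigh half -in in.lj"

# Print the command so it can be inspected before running it.
echo "$LMP_CPU_CMD"
```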
1) I was getting errors when trying to start KOKKOS from a restart file.
I'm not sure if this is because the restart file was written with an October
20 version of LAMMPS vs a Nov 22 version I compiled with KOKKOS.
Very possible. What is the error?
2) I was originally running lj/cut/opt, and apparently KOKKOS didn't
do anything with it when using the kk flag. I needed to change to lj/cut so
that KOKKOS would recognize it, I guess.
Yes, that is expected.
Stan