Hi,
I have observed some strange behavior. It is probably something obvious
that I am just not seeing. I am using the standard input script
examples/USER/cuda/in.melt_2.5.cuda. One of the first instructions
configures the number of GPUs to use. Since I have three of them, I
instruct LAMMPS to use 3 GPUs:
package cuda gpu/node 3
Now, when I run LAMMPS with this command,
~/opt/lammps/lmp_kid_my_mv18c32f321_nocufft -sf cuda <
../lammps-input/in.melt_2.5.3.cuda
LAMMPS uses only 1 GPU, as confirmed by its output:
$ ~/opt/lammps/lmp_kid_my_mv18c32f321_nocufft -sf cuda <
../lammps-input/in.melt_2.5.3.cuda
LAMMPS (1 Jul 2012)
# Using LAMMPS_CUDA
USER-CUDA mode is enabled (lammps.cpp:396)
# CUDA: Activate GPU
# Using device 0: Tesla M2090
Lattice spacing in x,y,z = 1.16961 1.16961 1.16961
Created orthogonal box = (0 0 0) to (46.7843 46.7843 46.7
....
However, if I run LAMMPS with mpiexec on the same input file, it uses
3 GPUs, as expected:
$ mpiexec -np 3 -npernode 3
~/opt/lammps/lmp_kid_my_mv18c32f321_nocufft -sf cuda <
../lammps-input/in.melt_2.5.3.cuda
LAMMPS (1 Jul 2012)
# Using LAMMPS_CUDA
USER-CUDA mode is enabled (lammps.cpp:396)
# CUDA: Activate GPU
# Using device 0: Tesla M2090
Lattice spacing in x,y,z = 1.16961 1.16961 1.16961
# Using device 1: Tesla M2090
Created orthogonal box = (0 0 0) to (46.7843 46.7843 46.7843)
# Using device 2: Tesla M2090
1 by 1 by 3 MPI processor grid
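(Incidentally, the way I counted the active devices was to grep for the
"Using device" lines in the log. A minimal sketch, using a hypothetical
sample.log in place of the real log file, LAMMPS' default being log.lammps:)

```shell
# Sample of the lines LAMMPS prints when it activates each GPU
# (copied from the mpiexec run above into a hypothetical sample.log)
cat > sample.log <<'EOF'
# Using device 0: Tesla M2090
# Using device 1: Tesla M2090
# Using device 2: Tesla M2090
EOF

# Count the activated devices; the single-process run yields 1 here instead of 3
grep -c "Using device" sample.log
```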
Do I have to specify additional options for a run that is not
controlled by mpiexec? I thought that requesting 3 GPUs in the input
script would be sufficient. Do I need to state this requirement
anywhere else for a standalone execution?
Best,
Magda