GPU library not compiled

akohlmey · June 5, 2018, 12:48am

Normally, I do

cd src

make clean-all

cd ../lib/gpu

make -f Makefile.linux clean

vim Makefile.linux

make -f Makefile.linux

cd ../../src

make yes-gpu

make mpi

Is that OK?

that looks ok from here.

CUDA 8 is the last version that supports sm_20.

i would next try to compile specifically for your other card.

we have a pending pull request that gives an example of how to compile
the GPU library for two different architectures at the same time.
so it may be worth waiting for the next patch release. but don't hold your
breath. multiple LAMMPS developers are currently busy with different
non-LAMMPS projects (the kind of project that pays our salaries) and
administrative work. so processing of pull requests and release of patches
will be slow for the next few weeks.

axel.
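
As a rough illustration of what building for two architectures at once can look like, here is a minimal sketch that assembles nvcc `-gencode` flags, one clause per compute capability. The helper name `gencode_flags` is hypothetical, the values 20 and 50 are only illustrative, and whether the `CUDA_ARCH` line in lib/gpu/Makefile.linux accepts such a flag list in place of a single `-arch=` value for this LAMMPS version is an assumption. The example in the pending pull request is the authoritative reference.

```shell
#!/bin/sh
# Hypothetical helper: print one nvcc -gencode clause per compute capability.
# Usage: gencode_flags 20 50
gencode_flags() {
    flags=""
    for cc in "$@"; do
        flags="$flags -gencode arch=compute_$cc,code=sm_$cc"
    done
    # Strip the leading space before printing.
    echo "${flags# }"
}

# Illustrative values only: 20 (sm_20, last supported by CUDA 8) and 50 (sm_50).
# The printed line is a candidate for the CUDA_ARCH setting (assumption).
echo "CUDA_ARCH = $(gencode_flags 20 50)"
```

The printed line would be pasted into the makefile by hand; nothing here edits or builds the library.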

Mahmood_Naderan1 · June 5, 2018, 7:52am

I decided to create two folders:
1- lammps-11Aug17-m2000, where I set CUDA_ARCH=-arch=sm_50
2- lammps-11Aug-17-c2075, where I set CUDA_ARCH=-arch=sm_21

Does that get around the multiple-GPU problem, so that I don't have to wait for the patch? As you can see below, for the first one the sm_50 symbols are present in lmp_mpi and that device is 0; however, the lammps command fails.

mahmood@…7694…:~/lammps-11Aug17-m2000$ strings ~/lammps-11Aug17-m2000/src/lmp_mpi | grep sm_50
.target sm_50
.target sm_50
mahmood@…7694…:~/lammps-11Aug17-m2000$ cd ../eam/
mahmood@…12…7694…:~/eam$ mpirun -np 4 ~/lammps-11Aug17-m2000/src/lmp_mpi -sf gpu -pk gpu 0 -in in.eam
LAMMPS (11 Aug 2017)
ERROR: Illegal package gpu command (../fix_gpu.cpp:86)
Last command: package gpu 0

akohlmey · June 5, 2018, 12:16pm

I decided to create two folders:
1- lammps-11Aug17-m2000, where I set CUDA_ARCH=-arch=sm_50
2- lammps-11Aug-17-c2075, where I set CUDA_ARCH=-arch=sm_21

Does that get around the multiple-GPU problem, so that I don't have to wait
for the patch? As you can see below, for the first one the sm_50 symbols are
present in lmp_mpi and that device is 0; however, the lammps command fails.

mahmood@...:~/lammps-11Aug17-m2000$ strings ~/lammps-11Aug17-m2000/src/lmp_mpi | grep sm_50
.target sm_50
.target sm_50
mahmood@...:~/lammps-11Aug17-m2000$ cd ../eam/
mahmood@...:~/eam$ mpirun -np 4 ~/lammps-11Aug17-m2000/src/lmp_mpi -sf gpu -pk gpu 0 -in in.eam
LAMMPS (11 Aug 2017)
ERROR: Illegal package gpu command (../fix_gpu.cpp:86)
Last command: package gpu 0
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[21154,1],1]
Exit code: 1
--------------------------------------------------------------------------
mahmood@...:~/eam$ ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery | grep M2000
Device 0: "Quadro M2000"
> Peer access from Quadro M2000 (GPU0) -> Tesla C2075 (GPU1) : No
> Peer access from Tesla C2075 (GPU1) -> Quadro M2000 (GPU0) : No
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 2, Device0 = Quadro M2000, Device1 = Tesla C2075

How can that be explained?

simple. you didn't pay sufficient attention to the documentation.
specifically, when using an older version of LAMMPS you *must* look at the
documentation matching your specific version, as the syntax or semantics of
commands may be different from the current version. the LAMMPS homepage
always shows the documentation for the latest patch/development version
(not stable).

for the current version of LAMMPS "-pk gpu 0" means use *all* available
GPUs. for your version, this "wildcard" does not exist, so you have to use
a number > 0.

to select individual GPUs, you cannot use the LAMMPS command line or script
commands, but have to set the CUDA_VISIBLE_DEVICES environment variable to
instruct the nvidia driver which GPUs are visible. the fact that the CUDA
utility reports the GPUs as #0 and #1 is irrelevant for LAMMPS itself. it
*does* matter for the CUDA_VISIBLE_DEVICES environment variable, though.

Something is going crazy here...

PEBCAC!

axel.
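
The device selection described in the reply above can be sketched as follows. The binary path and input file are taken from the failing command earlier in the thread. The wrapper function `run_on_gpu` and the `DRY_RUN` switch are hypothetical, added only so the command can be inspected without launching a job. Setting the variable in the environment of mpirun assumes a single-node run where the launched processes inherit it.

```shell
#!/bin/sh
# Hypothetical wrapper: expose one physical GPU to LAMMPS and request 1 GPU.
# $1 = device index as reported by deviceQuery (0 = Quadro M2000, 1 = Tesla C2075)
# $2 = path to the lmp_mpi binary built for that card
run_on_gpu() {
    dev="$1"
    lmp="$2"
    # With only one device visible, "-pk gpu 1" (a number > 0) uses that GPU.
    cmd="mpirun -np 4 $lmp -sf gpu -pk gpu 1 -in in.eam"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print what would be run.
        echo "CUDA_VISIBLE_DEVICES=$dev $cmd"
    else
        # Assumes the binary path contains no spaces (plain word splitting).
        CUDA_VISIBLE_DEVICES="$dev" $cmd
    fi
}

run_on_gpu 0 "$HOME/lammps-11Aug17-m2000/src/lmp_mpi"
```

Calling it with `DRY_RUN=0` set in the environment would actually start the run; by default it only prints the command line.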

Anders_Hafreager1 · June 5, 2018, 12:44pm

You can specify (with the GPU package) which GPU to use with the gpuID keyword in the package command. For example: package gpu 1 gpuID 1 1 (the two 1’s after gpuID mean from GPU ID 1 to GPU ID 1).

Mahmood: you can also try to compile with OpenCL (there is another makefile for that).

Anders
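
For reference, the gpuID suggestion above can be written as command-line arguments with a small sketch like the one below. The helper name `pk_gpu_range` is hypothetical, the bare `lmp_mpi` name is a placeholder for the actual binary path, and deriving the GPU count as last minus first plus one is an assumption. Whether the gpuID keyword is available should be checked against the package command documentation matching the LAMMPS version in use.

```shell
#!/bin/sh
# Hypothetical helper: build the -pk arguments for a contiguous range of GPU IDs.
# $1 = first GPU ID, $2 = last GPU ID (both inclusive, as in "gpuID first last")
pk_gpu_range() {
    first="$1"
    last="$2"
    # Assumption: the GPU count equals the size of the ID range.
    ngpu=$((last - first + 1))
    echo "-pk gpu $ngpu gpuID $first $last"
}

# The example from the post: one GPU, from GPU ID 1 to GPU ID 1.
# Only prints the command line; "lmp_mpi" stands in for the real binary path.
echo "mpirun -np 4 lmp_mpi -sf gpu $(pk_gpu_range 1 1) -in in.eam"
```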

Mahmood_Naderan1 · June 5, 2018, 5:13pm

Thank you very much. In fact, CUDA_VISIBLE_DEVICES was the solution, even for the 11Aug17 version.

Regards,
Mahmood
