USER-CUDA questions

When I compile using Make.py and specify mode=single [or double] arch=35, why do I still see CUDA_ARCH=20 flying by in the compile lines?

mpicxx -g -O3 -DLAMMPS_GZIP -I…/…/lib/cuda -DLMP_USER_CUDA -DMPICH_SKIP_MPICXX -DOMPI_SKIP_MPICXX=1 -I/cm/shared/apps/cuda50/toolkit/5.0.35/include -DUNIX -DFFT_CUFFT -DCUDA_PRECISION=1 -DCUDA_ARCH=20 -M …/pair.cpp > pair.d

(btw, what are the keywords for precision 3 & 4?)

When I edit lib/cuda/Makefile.defaults and compile using make auto; make yes-user-cuda; make mpi, I observe the same. Is there a way to verify which arch and precision modes were set?

-Henk

See doc/accelerate_cuda.html for details on this.

If you build lib/cuda manually, you do something like this:

make precision=1,2,4 arch=20,35,etc

You should see those choices reflected in the compile lines (for the lib files)
and written into 2 files in lib/cuda - Makefile.defaults and Makefile.lammps

If you then build LAMMPS itself with the USER-CUDA package, it takes
those precision/arch settings from the lib/cuda/Makefile.lammps file, so
you should also see them reflected in the LAMMPS compile lines.

Make.py is simply doing both steps for you. Underneath it is doing
the same thing.

Steve
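The check described above (looking for the precision/arch choices in the compile lines) can be scripted. This is only a minimal sketch: the helper name show_cuda_defines is made up for illustration, and it assumes the settings appear as -DCUDA_ARCH=N and -DCUDA_PRECISION=N defines, as they do in the compile line quoted earlier in this thread.

```shell
# Hypothetical helper (not part of LAMMPS): read build output on stdin and
# print each distinct -DCUDA_ARCH / -DCUDA_PRECISION define it contains.
show_cuda_defines() {
    # -o prints only the matching part; -e is needed because the
    # pattern itself starts with a dash.
    grep -oE -e '-DCUDA_(ARCH|PRECISION)=[0-9]+' | sort -u
}

# Possible usage, run from the LAMMPS src directory:
#   make mpi 2>&1 | show_cuda_defines
# The two files in lib/cuda can be searched in a similar spirit; the exact
# variable names inside them are not shown in this thread, so the pattern
# here is deliberately loose:
#   grep -inE 'arch|precision' ../lib/cuda/Makefile.defaults ../lib/cuda/Makefile.lammps
```

If the build picked up arch=35, the output should contain -DCUDA_ARCH=35 rather than -DCUDA_ARCH=20.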

Sir, I am also facing the same problem. I think there is a bug in Makefile.common:

Given below are lines 83-118 of the file Makefile.common in the lib/cuda dir.
As you can see, whichever of arch=20, 21, 30, or 35 is given, the code below assigns the -DCUDA_ARCH=20 flag.

# make architecture settings

ifeq ($(strip $(arch)), 13)
CUDA_FLAGS += -DCUDA_ARCH=13
SMVERSIONFLAGS := -arch sm_13
else
ifeq ($(strip $(arch)), 20)
CUDA_FLAGS += -DCUDA_ARCH=20
#NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
SMVERSIONFLAGS := -arch sm_20
else
ifeq ($(strip $(arch)), 21)
CUDA_FLAGS += -DCUDA_ARCH=20
#NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
SMVERSIONFLAGS := -arch sm_21
else
ifeq ($(strip $(arch)), 30)
CUDA_FLAGS += -DCUDA_ARCH=20
#NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
SMVERSIONFLAGS := -arch sm_30
else
ifeq ($(strip $(arch)), 35)
CUDA_FLAGS += -DCUDA_ARCH=20
#NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
SMVERSIONFLAGS := -arch sm_35
else
CUDA_FLAGS += -DCUDA_ARCH=99
SMVERSIONFLAGS := -arch sm_13
endif

Once I edited it, it works fine for me (line 107: change 20 to 35).
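A quick way to look for other branches of this kind is to compare the value each ifeq tests against the -DCUDA_ARCH= value set inside that branch. The following is a rough sketch, not an official tool: the function name find_arch_mismatches is invented here, and it assumes the file keeps the layout quoted above (one ifeq line per arch, followed by a CUDA_FLAGS line). A reported mismatch is only a candidate for review; whether, say, arch=21 is meant to map to CUDA_ARCH=20 is not settled in this thread.

```shell
# Hypothetical helper (not part of LAMMPS): list branches of a makefile where
# the arch value tested by ifeq differs from the -DCUDA_ARCH= value it sets.
find_arch_mismatches() {
    awk '
        /^[[:space:]]*#/ { next }                 # ignore commented-out lines
        /ifeq/ && /arch/ {
            # keep only the last number on the line, e.g. 35
            want = $0
            sub(/[^0-9]*$/, "", want)
            sub(/.*[^0-9]/, "", want)
            next
        }
        /^[[:space:]]*else[[:space:]]*$/ { want = "" }
        /-DCUDA_ARCH=/ && want != "" {
            # keep only the digits that follow -DCUDA_ARCH=
            got = $0
            sub(/.*-DCUDA_ARCH=/, "", got)
            sub(/[^0-9].*$/, "", got)
            if (got != want)
                printf "line %d: arch=%s sets CUDA_ARCH=%s\n", NR, want, got
            want = ""
        }
    ' "$1"
}

# Possible usage:
#   find_arch_mismatches lib/cuda/Makefile.common
```

The fallback branch (CUDA_ARCH=99) follows a bare else rather than an ifeq, so it is not reported.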

Ah, will try that… Now I get confirmation during compilation that the right arch is picked, thanks. Btw, there are other errors in that code block.

To respond to Steve: the page is a bit incomplete. For example, the doc states ‘mode=single’, but ‘mode’ does not appear as an option to the -cuda switch, so is ‘single’ precision=1? When I use mode=4 or precision=4 with Make.py I receive an invalid args error… Ah, the mode options are ‘single, double, mixed’ (buried in the code).

Also, in the manual section the first command, ‘make’, will just display the help page; I believe it should be ‘make auto’.

Thanks all, I’m able to compile now.

-Henk