Hello,
I noticed when making the lammps cuda library that if I run:
$ make arch=21
the arch reverts to 20.
This is in line with what is in “lammps/lib/cuda/makefile.common” (see below).
# make architecture settings
ifeq ($(strip $(arch)), 13)
  CUDA_FLAGS += -DCUDA_ARCH=13
  SMVERSIONFLAGS := -arch sm_13
else
  ifeq ($(strip $(arch)), 20)
    CUDA_FLAGS += -DCUDA_ARCH=20
    #NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
    NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
    SMVERSIONFLAGS := -arch sm_20
  else
    ifeq ($(strip $(arch)), 21)
      CUDA_FLAGS += -DCUDA_ARCH=20
      #NVCC_FLAGS += -ftz=false -prec-div=true -prec-sqrt=true
      NVCC_FLAGS += -ftz=true -prec-div=false -prec-sqrt=false
      SMVERSIONFLAGS := -arch sm_21
    else
      CUDA_FLAGS += -DCUDA_ARCH=99
      SMVERSIONFLAGS := -arch sm_13
    endif
  endif
endif
I noticed this for both the 13Dec11 and 17Jan12 versions of lammps.
Is there any reason for this?
Thanks,
– WM
Compute capabilities 2.0 and 2.1 obviously don't need to be differentiated at the source code level.
axel.
Hi
Axel is correct: there is no source-code-level difference between CC 2.1 and CC 2.0. The only difference is the compiler flag that tells nvcc to produce code for CC 2.1 devices (-arch sm_21). The source-code path selected by -DCUDA_ARCH=20 is the one used for all Fermi-class GPUs.
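
To see this split between the preprocessor define and the nvcc target without building the library, here is a minimal stand-alone sketch. It is a hypothetical test Makefile, not part of LAMMPS; it copies only the arch=20 and arch=21 branches from makefile.common and assumes GNU make. The `all` target and the `$(info ...)` lines are added purely for illustration.

```make
# Hypothetical stand-alone Makefile (not part of LAMMPS).
# It reproduces only the arch=20/21 branches of makefile.common
# and prints the flags that result from the chosen arch.
arch ?= 20

ifeq ($(strip $(arch)), 20)
  CUDA_FLAGS += -DCUDA_ARCH=20
  SMVERSIONFLAGS := -arch sm_20
else
  ifeq ($(strip $(arch)), 21)
    # Same source-code path as arch=20; only the nvcc target differs.
    CUDA_FLAGS += -DCUDA_ARCH=20
    SMVERSIONFLAGS := -arch sm_21
  endif
endif

# $(info ...) is evaluated while the Makefile is parsed.
$(info CUDA_FLAGS = $(CUDA_FLAGS))
$(info SMVERSIONFLAGS = $(SMVERSIONFLAGS))

# Empty default target so make has something to "build".
all: ;
```

Running `make arch=21` against this file should report `-DCUDA_ARCH=20` together with `-arch sm_21`, which is the behaviour described in the original question: the define stays at 20 while the code is still generated for sm_21.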
Cheers
Christian
-------- Original Message --------
Christian can comment if this is a typo in the makefile,
or as Axel says, that there is no need to distinguish
between the two.
Steve