struggling with the compilation of lammps-gpu on a K10

Dear all,

I managed to get lammps working with USER-CUDA on an Nvidia K-10
system fairly easily.
On very large alkane systems I get a 2x speedup; on smaller
systems it becomes less efficient, as expected.

Therefore I want to test lammps-gpu, since it is said to be faster for
smaller systems, but I seem to be making some mistakes.

When calling the compiled lammps version I get messages of the following kind:

LAMMPS (27 Jan 2013)
ERROR: GPU library not compiled for this accelerator (gpu_extra.h:40)
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 116.
*** The MPI_Abort() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.

Below is a short explanation of what I tried to do.
I changed the architecture up to sm_35 but was never able to
successfully start the simulation.

I would be very glad for any hint that would point me in the right direction.

- Pim

The versions of the nvidia driver and compiler are given here:

nvidia-smi
NVIDIA-SMI 4.304.54 Driver Version: 304.54

0 Tesla K10.G2.8GB
...etcetera ..

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Fri_Sep_21_17:28:58_PDT_2012
Cuda compilation tools, release 5.0, V0.2.1221

Now I tried compiling the gpu library with several of the suggested makefiles.
I just took the Makefile.lincoln template, and it compiled:

CUDA_HOME = /usr/local/cuda
NVCC = nvcc

# Tesla CUDA
CUDA_ARCH = -arch=sm_20
# newer CUDA
#CUDA_ARCH = -arch=sm_13
# older CUDA
#CUDA_ARCH = -arch=sm_10 -DCUDA_PRE_THREE

CUDA_PRECISION = -D_SINGLE_SINGLE
CUDA_INCLUDE = -I$(CUDA_HOME)/include
CUDA_LIB = -L$(CUDA_HOME)/lib64
CUDA_OPTS = -DUNIX -O3 -Xptxas -v --use_fast_math

CUDR_CPP = mpic++ -DMPI_GERYON -DUCL_NO_EXIT -DMPICH_IGNORE_CXX_SEEK
CUDR_OPTS = -O2 # -xHost -no-prec-div -ansi-alias

BIN_DIR = ./
OBJ_DIR = ./
LIB_DIR = ./
AR = ar
BSH = /bin/sh

CUDPP_OPT = -DUSE_CUDPP -Icudpp_mini

include Nvidia.makefile
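When in doubt about the right -arch value, the card itself can be asked. A minimal sketch (not from the original thread; the filename devcap.c is arbitrary) that queries the compute capability through the CUDA runtime, compiled on the machine in question with `nvcc devcap.c -o devcap`:

```c
/* Print each device's compute capability and the matching -arch flag.
 * Uses only the CUDA runtime API, which ships with the toolkit;
 * needs a CUDA-capable machine to run. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int n = 0;
    if (cudaGetDeviceCount(&n) != cudaSuccess || n == 0) {
        fprintf(stderr, "no CUDA device found\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        struct cudaDeviceProp p;
        cudaGetDeviceProperties(&p, i);
        printf("device %d: %s, compute capability %d.%d -> -arch=sm_%d%d\n",
               i, p.name, p.major, p.minor, p.major, p.minor);
    }
    return 0;
}
```

On a K10 this should report compute capability 3.0 for both of the board's GPUs.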

Greetings, Pim Schravendijk

Hi Pim,

have you tried sm_30?

Regards,
-Trung

I changed the architecture up to sm_35 but was never able to
successfully start the simulation.

yes, because the K10 is in essence a slightly modified
GeForce GTX 690 and thus has compute capability 3.0
and not 3.5. please have a look at:

http://en.wikipedia.org/wiki/Nvidia_Tesla

axel.
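Given compute capability 3.0, the fix would presumably be just the architecture line in the GPU-library makefile quoted above (sm_30 needs a sufficiently recent toolkit; the CUDA 5.0 installation mentioned earlier supports it):

```make
# Kepler-class Tesla (K10, compute capability 3.0)
CUDA_ARCH = -arch=sm_30
```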

yes you are correct, thank you!

The issue then becomes: I try to use two GPUs and two CPU cores,
but I get this:

mpirun -np 2 ../lammps-27Jan13-GPU/src/lmp_schraven -suffix gpu < in.butane
LAMMPS (27 Jan 2013)
ERROR: Accelerator sharing is not currently supported on system
(gpu_extra.h:47)

Should I maybe try to do this only with openmp threads?

I have the feeling I am making some basic mistake but I can't find
many mentions of this error on the mailing lists.

Should I maybe try to do this only with openmp threads?

no. you should check with nvidia-smi which "compute mode" the gpus are set to.
it should be "DEFAULT" (0) and must not be any other.

axel.
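For completeness, a sketch of the nvidia-smi calls this amounts to (flags as in the 304.xx-era driver used here; changing the mode needs root):

```shell
# show the current compute mode of every GPU
nvidia-smi -q -d COMPUTE

# reset to DEFAULT (0), which allows several processes -- e.g. two
# MPI ranks -- to share one GPU; the other modes are
# 1 = EXCLUSIVE_THREAD, 2 = PROHIBITED, 3 = EXCLUSIVE_PROCESS
sudo nvidia-smi -c 0
```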