GPU package on Tesla K20c, and the message "GPU library not compiled for this accelerator"

Hi LAMMPS Users,

I’m running into a problem with the GPU package on our K20c (Kepler) GPUs, and I thought I’d ask in case someone has already found a solution. The examples/gpu/ run command and error message are:

130814lmp/examples/gpu(147): mpirun -machinefile mpihosts ./../../src/lmp_gpu -in in.gpu.melt.2.5 -sf gpu

ERROR: GPU library not compiled for this accelerator (../gpu_extra.h:40)

It looks similar to the problem described here http://lammps.sandia.gov/threads/msg35893.html.

I use CUDA_ARCH=sm_35 (or sm_30 or sm_21), followed by make yes-GPU and a recompile of lib/gpu and src/.

The code was checked out today, version 16 Aug 2013. For the debug output below, I turned on the flags described in the linked email thread. (CUDR_CPP = mpic++ -g -DUCL_SYNC_DEBUG and CUDR_OPTS = )

Additional information: I can run the same code on Fermi cards (C2050) when compiled with sm_21. I can also run USER-CUDA on the Kepler card using a June 2013 version of LAMMPS. The error is raised at line 578 of lib/gpu/lal_device.cpp, at the call dev_program->load_string(device,flags.c_str()). But looking at the source code from http://users.nccs.gov/~wb8/geryon/download.htm, the NVIDIA version of load_string() seems to just return success, so I am confused.

Raw output follows. Many thanks for any insights!
Tristan Sharp

which mpirun:

/usr/mpi/intel/openmpi-1.4.3/bin/mpirun

cat mpihosts:
gpuk004

Input file in.gpu.melt.2.5 also tries restricting to a single gpu:
package gpu force/neigh 0 0 1

Run command and output:
140814lmp/examples/gpu(147): mpirun -machinefile mpihosts ./../../src/lmp_gpu -in in.gpu.melt.2.5 -sf gpu

LAMMPS (16 Aug 2013)

it works for me with:
CUDA_ARCH = -arch=sm_35

axel.

I am also unable to reproduce this. I just tried with CUDA 5.5 on a Tesla K20c. Did you install the 5.5 CUDA driver when you installed CUDA?
I notice your driver version is 304.54. Mine, from an install today, is 319.37. If you run nvc_get_devices in lib/gpu, it will tell you whether you are using the 5.5 driver. (I don't know which CUDA version 304.54 corresponds to.)

- Mike

Thank you for the replies. The solution was mundane, and maybe similar to http://lammps.sandia.gov/threads/msg35893.html. After each checkout I was first verifying the build on our established cards. Then I would do a standard rm *.o *.a, but I neglected the *.cubin files, which left the kernels compiled for the older architecture in place. Those files are maybe less expected for CUDA newcomers. Obvious in hindsight, though. Maybe it’s worth a note in lammps/lib/gpu/README about the need to do “make -f Makefile.linux clean”. Thanks again for the great software.

Tristan Sharp
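
For anyone who finds this thread later: a quick way to check for leftovers before rebuilding for a different CUDA_ARCH is to list the build artifacts still sitting in lib/gpu. This is only a rough sketch. The function name list_stale_artifacts is made up for illustration, and the list of patterns (*.o, *.a, *.cubin) is taken from the message above, so it may not cover everything the library's own clean target removes.

```shell
# Hypothetical helper (not part of LAMMPS): list leftover build artifacts
# under a directory, including the *.cubin files that a plain
# `rm *.o *.a` would miss.
list_stale_artifacts() {
    dir="${1:-.}"
    # Search recursively, since the object directory may vary by makefile.
    find "$dir" -type f \( -name '*.o' -o -name '*.a' -o -name '*.cubin' \) | sort
}
```

Running list_stale_artifacts lib/gpu after a clean should print nothing. If it still lists *.cubin files, the “make -f Makefile.linux clean” step mentioned above is probably the safer route than removing files by hand.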