Hi,
I am getting the following error at runtime, after compiling lammps-icms with GPU support:
loverde@…4051…:/media/DATA/LIPIDS/PEGLIPIDMIX22/M22$ mpirun -np 8 /home/loverde/bin/lammps-icms/src/lmp_openmpi -log log.start -in in.start -echo none
LAMMPS (14 Mar 2013-ICMS)
ERROR: GPU library not compiled for this accelerator (gpu_extra.h:40)
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 116.
*** The MPI_Abort() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
I have tried changing the CUDA_ARCH variable in Makefile.linux in the gpu library.
At present I have:
# /* ----------------------------------------------------------------------
#  Generic Linux Makefile for CUDA
#     - Change CUDA_ARCH for your GPU
# ------------------------------------------------------------------------- */
CUDA_HOME = /usr/local/cuda-5.0
NVCC = nvcc
#GTX 690
CUDA_ARCH = -arch=sm_30
# Tesla CUDA
#CUDA_ARCH = -arch=sm_21
# newer CUDA
#CUDA_ARCH = -arch=sm_13
# older CUDA
#CUDA_ARCH = -arch=sm_10 -DCUDA_PRE_THREE
CUDA_PRECISION = -D_DOUBLE_DOUBLE
CUDA_INCLUDE = -I$(CUDA_HOME)/include
CUDA_LIB = -L$(CUDA_HOME)/lib64
CUDA_OPTS = -DUNIX -O3 -Xptxas -v --use_fast_math
CUDR_CPP = mpic++ -DMPI_GERYON -DUCL_NO_EXIT -DMPICH_IGNORE_CXX_SEEK
CUDR_OPTS = -O2 # -xHost -no-prec-div -ansi-alias
BIN_DIR = ./
OBJ_DIR = ./obj
LIB_DIR = ./
AR = ar
BSH = /bin/sh
CUDPP_OPT = -DUSE_CUDPP -Icudpp_mini
include Nvidia.makefile
The following is the output from nvc_get_devices…
Found 1 platform(s).
Using platform: NVIDIA Corporation NVIDIA CUDA Driver
CUDA Driver Version: 5.0
Device 0: "GeForce GTX 690"
Type of device: GPU
Compute capability: 3
Double precision support: Yes
Total amount of global memory: 1.99969 GB
Number of compute units/multiprocessors: 8
Number of cores: 1536
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum group size (# of threads per block) 1024 x 1024 x 64
Maximum item sizes (# threads for each dim) 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.0195 GHz
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default
Concurrent kernel execution: Yes
Device has ECC support enabled: No
Device 1: "GeForce GTX 690"
Type of device: GPU
Compute capability: 3
Double precision support: Yes
Total amount of global memory: 1.99982 GB
Number of compute units/multiprocessors: 8
Number of cores: 1536
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum group size (# of threads per block) 1024 x 1024 x 64
Maximum item sizes (# threads for each dim) 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.0195 GHz
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default
Concurrent kernel execution: Yes
Device has ECC support enabled: No
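To double-check the flag, here is a small Python sketch of how I understand the reported compute capability should map to a CUDA_ARCH value. This is only an illustration: the helper name arch_flag is hypothetical (it is not part of LAMMPS or CUDA), and it assumes the usual -arch=sm_XY naming, where X and Y are the major and minor capability digits.

```python
def arch_flag(capability: str) -> str:
    """Turn a compute capability such as "3" or "2.1" into an nvcc -arch flag.

    Hypothetical helper for illustration only; assumes the sm_XY naming scheme.
    """
    major, _, minor = capability.strip().partition(".")
    # The device listing shows "3" with no decimal part, so treat a missing
    # minor version as 0.
    minor = minor or "0"
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"unrecognized compute capability: {capability!r}")
    return f"-arch=sm_{int(major)}{int(minor)}"


# Capability as reported for both GTX 690 devices in the listing above.
print(arch_flag("3"))
```

For both devices listed above this gives -arch=sm_30, which is the value I currently have set in Makefile.linux.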
Is this possibly an issue with CUDA 5.0?
Thanks very much,
Sharon Loverde