Dear LAMMPS GPU users/developers,
I ran into a CUDA driver runtime error when trying to run LAMMPS on a cluster with four Tesla S2050 GPUs attached to a CPU node. Specifically, the error message is:
Cuda driver error 101 in call at file 'geryon/nvd_device.h' in line 266.
Looking at nvd_device.h, the error occurs in a method that sets the CUDA device to the specified device number. In my fix gpu command I am currently just asking it to run on device 0, but changing the device number has no effect.
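For reference, the numeric driver error maps to a symbolic name in cuda.h; error 101 is CUDA_ERROR_INVALID_DEVICE. A minimal sketch to translate the code (the table below copies a handful of entries from the CUresult enum in cuda.h and is not a complete list):

```python
# A few CUDA driver API error codes, copied from the CUresult enum in cuda.h.
# Not exhaustive -- only the codes most relevant to device-selection failures.
CUDA_DRIVER_ERRORS = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    101: "CUDA_ERROR_INVALID_DEVICE",
}

def describe_cuda_error(code):
    """Translate a numeric CUDA driver error code into its symbolic name."""
    return CUDA_DRIVER_ERRORS.get(code, "unrecognized error code %d" % code)

print(describe_cuda_error(101))  # CUDA_ERROR_INVALID_DEVICE
```

So the failure in nvd_device.h means the driver rejected the requested device number at device-selection time, even though the device listing below shows four valid cards.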
I ran nvc_get_devices on the node, and the specifications match those I used when building the gpu library and compiling lmp_glory (I am showing only Device 0, but it found all four identical cards):
Found 1 platform(s).
Using platform: NVIDIA Corporation NVIDIA CUDA
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
Device 0: "Tesla S2050"
Type of device: GPU
Compute capability: 2
Double precision support: Yes
Total amount of global memory: 2.99969 GB
Number of compute units/multiprocessors: 14
Number of cores: 448
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum group size (# of threads per block) 1024 x 1024 x 64
Maximum item sizes (# threads for each dim) 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.147 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Exclusive
Concurrent kernel execution: Yes
Device has ECC support enabled: No
Any help would be appreciated.
Thanks,
Kevin