Building LAMMPS with CUDA

Hi all,

I’m trying to configure LAMMPS (30 Jul 2016) with CUDA 7.5.

My ./nvc_get_devices output is the following.

Found 1 platform(s).
Using platform: NVIDIA Corporation NVIDIA CUDA Driver
CUDA Driver Version: 7.50

Device 0: “GeForce GTX 960”
Type of device: GPU
Compute capability: 5.2
Double precision support: Yes
Total amount of global memory: 3.99855 GB
Number of compute units/multiprocessors: 8
Number of cores: 1536
Total amount of constant memory: 65536 bytes
Total amount of local/shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum group size (# of threads per block) 1024 x 1024 x 64
Maximum item sizes (# threads for each dim) 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.2405 GHz
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default
Concurrent kernel execution: Yes
Device has ECC support enabled: No

You need to compile for the Maxwell architecture instead. Try changing CUDA_ARCH to

CUDA_ARCH = -arch=sm_52

then rebuild the GPU library (in lib/gpu) with

make -f Makefile.linux

Then install the GPU package in the src directory (make yes-gpu) and recompile LAMMPS. After that it should work.
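As a rough illustration of the steps above, here is a minimal shell sketch. The helper name cuda_arch_flag is hypothetical (it is not part of LAMMPS), and the directory layout and the "ubuntu" make target in the comments are assumptions based on the standard source tree and the makefile names mentioned in this thread. It only shows how the compute capability reported by nvc_get_devices (5.2 here) maps to the CUDA_ARCH value.

```shell
#!/bin/sh
# Turn a compute capability such as "5.2" (as printed by nvc_get_devices)
# into the matching nvcc architecture flag, e.g. "-arch=sm_52".
# Expects input of the form major.minor; cuda_arch_flag is a hypothetical helper.
cuda_arch_flag() {
    cc=$1
    major=${cc%%.*}   # text before the first dot
    minor=${cc#*.}    # text after the first dot
    printf '%s%s%s\n' '-arch=sm_' "$major" "$minor"
}

# Print the line to put into the GPU library makefile.
echo "CUDA_ARCH = $(cuda_arch_flag 5.2)"

# Assumed rebuild sequence after editing CUDA_ARCH (adjust paths as needed):
#   cd lib/gpu
#   make -f Makefile.linux clean    # drop objects built for the old arch
#   make -f Makefile.linux
#   cd ../../src
#   make yes-gpu
#   make ubuntu
```

The clean step matters because make does not notice that a variable inside the makefile changed, so objects compiled for the old architecture would otherwise be reused.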

Dear Stefan,

I did as you said, but I’m getting the same old error.

make: *** [linux] Error 1

Just the above line.

Thanking you,

Umashankar

Error 1 means that a command invoked by GNU make returned a non-zero exit status. Maybe run make with the debug flag (make -d) to see where exactly the problem occurs?
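Since the "Error 1" line is only make reporting that some earlier command failed, the useful message is usually further up in the output. A small sketch of one way to find it, assuming the build output has been saved to a file (the helper name first_error and the file name build.log are made up for illustration):

```shell
#!/bin/sh
# Print the first line of a build log that mentions an error, prefixed
# with its line number. first_error is a hypothetical helper name.
first_error() {
    grep -n -i 'error' "$1" | head -n 1
}

# Typical use (run in the directory where make was invoked):
#   make -f Makefile.linux 2>&1 | tee build.log
#   first_error build.log
```

The final "make: *** [linux] Error 1" line also matches the pattern, but because only the first match is printed, the compiler or linker message that caused it is what normally shows up.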

Hi all,

I'm trying to configure LAMMPS (30 Jul 2016) with CUDA 7.5.
My ./nvc_get_devices output is the following.

[...]
However, when I try to build LAMMPS with Makefile.ubuntu,
there are no lines such as

CUDA_HOME =
CUDA_ARCH =
CUDA_PREC =

in Makefile.ubuntu.

that is correct. they *should* not be there. all GPU code is located
in the GPU library (lib/gpu), and those settings belong in its makefile.

Although the build with Makefile.ubuntu is successful, when I ran a
simulation I got the following error message.

LAMMPS (30 Jul 2016)
ERROR: GPU library not compiled for this accelerator (../gpu_extra.h:40)
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 124.
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[32973,1],8]
  Exit code: 1
--------------------------------------------------------------------------------------------------------------------------------

how did you run LAMMPS? could it be that you are using multiple MPI
ranks per GPU, but your GPU is configured for exclusive access?

After that I used Makefile.linux to build LAMMPS

that is *complete* nonsense. you cannot just copy makefiles around randomly.

in which CUDA_ARCH is set to

CUDA_ARCH = -arch=sm_35

Then I got the error message

make: *** [linux] Error 1

deservedly so, because that makefile is not a standalone makefile.

Just the above line.

Can anyone please help me sort out this issue, either with
Makefile.ubuntu or Makefile.linux?

your original compilation seems ok. it calls the MPI library. rather
than running with mpirun, you should first launch your executable
serially, using one of the bundled benchmark examples.

axel.

Dear axel,

Thanks for the insight.

The initial compilation with Makefile.ubuntu is successful, and I’m able to run simulations both serially and with MPI, using the lmp_ubuntu executable on the bundled benchmark examples.

But I want to run LAMMPS with multiple MPI ranks per GPU. My GPU is also used by VMD. Is that a problem? Do I need to use my GPU exclusively for LAMMPS in order to benefit from GPU computation?

Thanking you,

Umashankar

Dear axel,

Thanks for the insight.

The initial compilation with Makefile.ubuntu is successful, and I'm able to
run simulations both serially and with MPI, using the lmp_ubuntu executable
on the bundled benchmark examples.

with or without GPU support enabled?

But I want to run LAMMPS with multiple MPI ranks per GPU. My GPU is also
used by VMD. Is that a problem? Do I need to use my GPU

depends on whether and how much you are using VMD at the same time.
you essentially only have a bit less than half a regular GPU in your
system, so the amount of oversubscription possible will be limited.

exclusively for LAMMPS in order to benefit from GPU computation?

before you can do that, you'll need to learn how to properly
configure, compile and run LAMMPS for your GPU.
as stefan remarked, you have a mismatch between the hardware
capability your GPU has and what you compiled for.
you should try to address that. but also, you should first make sure
that you can get LAMMPS to run with a GPU pair style active with just
one MPI task.

axel.
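
For anyone following the advice above, a hedged sketch of what a first GPU test run might look like. The executable name lmp_ubuntu is taken from this thread; the input file in.lj (from the LAMMPS bench directory) and the helper name gpu_run_cmd are assumptions for illustration. The -sf gpu switch asks LAMMPS to use GPU variants of styles where available, and -pk gpu 1 requests one GPU per node.

```shell
#!/bin/sh
# Build the command line for a GPU test run with a given number of MPI
# tasks. With one task the executable is launched directly, without mpirun.
# gpu_run_cmd is a hypothetical helper; in.lj is an assumed benchmark input.
gpu_run_cmd() {
    ntasks=$1
    input=$2
    cmd="./lmp_ubuntu -sf gpu -pk gpu 1 -in $input"
    if [ "$ntasks" -gt 1 ]; then
        cmd="mpirun -np $ntasks $cmd"
    fi
    echo "$cmd"
}

# Start with a single task; only move to several ranks once this works.
gpu_run_cmd 1 in.lj
gpu_run_cmd 4 in.lj
# To actually execute a command:  eval "$(gpu_run_cmd 1 in.lj)"
```

Printing the command instead of running it makes it easy to check what will be launched before sharing the GPU between several ranks.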