using LAMMPS with ATI cards, GPU library not compiled for this accelerator

Good afternoon. I have an ATI card (Radeon HD 6870), with OpenCL and OpenMPI installed on my machine. I have built the lib/gpu binaries using Makefile.linux_opencl, but when I run LAMMPS with the GPU package I get the error "GPU library not compiled for this accelerator".

This is a Q for Mike (CCd).

Steve

I don't have a lot of experience with ATI, but there are three things I would like you to try.

First, run with only 1 MPI process and see if you still see the error.
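For example, something like the following (the input file name here is just a placeholder for your own script):

mpirun -np 1 ./lmp_openmpi -in in.test    # in.test = your input script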

If so, from the LAMMPS root directory:

cd lib/gpu
# switch the double-precision pragma from the Khronos extension to AMD's
sed -i 's/cl_khr_fp64/cl_amd_fp64/g' lal_preprocessor.h
make -f Makefile.linux_opencl clean
make -f Makefile.linux_opencl
cd ../../src
# force a rebuild of the executable against the new library
rm lmp_openmpi
make openmpi

and see if you still see the error.
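For context, cl_khr_fp64 is the portable Khronos double-precision extension and cl_amd_fp64 is AMD's vendor-specific one. Assuming the usual OpenCL pragma form in the kernel headers, the sed above effectively changes

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

to

#pragma OPENCL EXTENSION cl_amd_fp64 : enable

since some AMD drivers of this era only expose the vendor extension.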

If so:

cd lib/gpu
# remove -DUCL_NO_EXIT so OpenCL errors are reported verbatim
sed -i 's/-DUCL_NO_EXIT//g' Makefile.linux_opencl
make -f Makefile.linux_opencl clean
make -f Makefile.linux_opencl
cd ../../src
rm lmp_openmpi
make openmpi

and send the line that says "OpenCL error in..."
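For what it's worth, a flag like UCL_NO_EXIT typically gates an error-check macro along these lines; this is a made-up sketch of the idea, not the actual lib/gpu source:

/* hypothetical sketch of an error guard controlled by UCL_NO_EXIT;
   would need <stdio.h>, <stdlib.h>, <CL/cl.h> */
#ifdef UCL_NO_EXIT
#define CL_CHECK(call) (call)     /* hand the error code back to the caller */
#else
#define CL_CHECK(call) do {                                   \
    cl_int err_ = (call);                                     \
    if (err_ != CL_SUCCESS) {                                 \
      fprintf(stderr, "OpenCL error in %s at line %d: %d\n",  \
              __FILE__, __LINE__, err_);                      \
      exit(1);                                                \
    }                                                         \
  } while (0)
#endif

With the define removed, the second branch is compiled in, so the raw error line is printed before the code aborts.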

Please let me know how it goes. Thanks. - Mike

Hi

I think the error appears because you run with multiple processes and the GPU package tries to use both devices that OpenCL reports: the GPU and the CPU. The CPU device is rejected because it's not a GPU. (At least, that is the error I got when I tried running the GPU package on a 64-core AMD node using OpenCL for parallelisation.)

So you might be fine just using "0 0" as the device range in the "package gpu" command.
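Assuming the "package gpu" syntax of this LAMMPS version (mode, first device ID, last device ID, split fraction), that would be something like:

package gpu force/neigh 0 0 1.0    # first = last = 0: use device 0 only

so every MPI process binds to the GPU and the rejected CPU device is never selected.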

One does not have this problem when compiling against the NVIDIA OpenCL library, or when using an ATI GPU in a system with Intel CPUs. In those cases OpenCL only reports the GPUs as devices.
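You can verify what your platform reports with a few lines of generic OpenCL host code; this probe is my own sketch, not part of LAMMPS:

/* probe.c: list the devices OpenCL reports; build with: gcc probe.c -lOpenCL */
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
  cl_platform_id plat;
  cl_device_id dev[8];
  cl_uint ndev, i;
  clGetPlatformIDs(1, &plat, NULL);
  clGetDeviceIDs(plat, CL_DEVICE_TYPE_ALL, 8, dev, &ndev);
  for (i = 0; i < ndev; i++) {
    char name[256];
    cl_device_type type;
    clGetDeviceInfo(dev[i], CL_DEVICE_NAME, sizeof(name), name, NULL);
    clGetDeviceInfo(dev[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
    printf("device %u: %s (%s)\n", (unsigned)i, name,
           type == CL_DEVICE_TYPE_GPU ? "GPU" : "not a GPU");
  }
  return 0;
}

On an AMD platform with an AMD CPU this typically lists two devices, which is exactly the situation where the device range has to exclude the CPU.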

Best regards
Christian

-------- Original Message --------

The code is written so that it can run on a CPU device, a GPU device, or anything else with an OpenCL implementation. In fact, running on the CPU should have worked in this case, since the AMD OpenCL CPU device supports double precision. When the device is a CPU, the host-device memory copies are ignored.
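Conceptually that is something like the sketch below; the function and buffer names are invented for illustration and this is not the library code:

/* illustration only: copy host data to the device unless the device is the
   host CPU, assuming the buffer was created with CL_MEM_USE_HOST_PTR so it
   can alias host memory directly */
#include <CL/cl.h>

static void write_if_needed(cl_command_queue q, cl_device_id dev, cl_mem d_buf,
                            const void *h_buf, size_t nbytes) {
  cl_device_type type;
  clGetDeviceInfo(dev, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
  if (type != CL_DEVICE_TYPE_CPU)
    clEnqueueWriteBuffer(q, d_buf, CL_TRUE, 0, nbytes, h_buf, 0, NULL, NULL);
}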

I have done this successfully in the past; however, I think it needs to be retested, as some problems have been reported. Of course, it will not be more efficient to run this way on current hardware/code, so it is not a priority...

- Mike

mike,

one thing came to my mind while reading the discussion:

do we need to require single precision FFTs when
compiling the GPU library with -D_SINGLE_SINGLE?

you originally had two different pppm implementations,
but we simplified it, so that the precision of the FFT
now selects whether the double or single precision pppm
is used.
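for reference, the pairing in question would be the following two
settings; the variable names here are assumed from the usual
makefiles of this era:

# lib/gpu/Makefile.linux_opencl: all-single precision in the gpu lib
CUDA_PRECISION = -D_SINGLE_SINGLE

# src/MAKE/Makefile.openmpi: single-precision FFTs for pppm
FFT_INC = -DFFT_SINGLE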

cheers,
     axel.

There should be a runtime complaint when using double-precision p3m on an unsupported accelerator. Since the library is built separately from LAMMPS and the FFT compilation, I think that enforcing this requirement would just end up producing a slightly different runtime error.

The library is now at the point where all of the C++ code could be moved to the package directory, and cubins supporting all architectures and precision modes could be supplied in the lib/gpu directory. The user would then never have to do anything in lib/gpu: just make yes-gpu, add the include and lib directories to the makefile, and build everything with the C++ compiler.
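In other words, assuming prebuilt device binaries were shipped in lib/gpu, the user-visible build would reduce to something like:

cd src
make yes-gpu    # install the GPU package sources
make openmpi    # plus include/lib paths added to the machine makefile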

I think that this would also make the code a lot cleaner since there would be direct access to all of the LAMMPS pointers, no global external functions, etc.

On the other hand, I am not sure how large the cubins would be, and I have concerns that templates, etc. might make the styles less consistent with the rest of LAMMPS and more difficult for Steve to maintain.

Also, not sure how much time would be required to do this...

- Mike