Running pair style dsf on GPU blowing up

Dear LAMMPS users and developers,

  I was able to run a script using the dsf pair style successfully with
the parallel CPU version. However, when I turn on GPU acceleration (on 6
cores) for the same model, the temperature blows up after a few hundred
MD steps. Is there any essential difference between CPU and
GPU-accelerated runs of this pair style? By the way, the model also has
bonds, tabulated potentials, and hybrid/overlay interactions.
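
  Roughly, the relevant part of the setup looks something like this (the
parameters, file names, and the GPU package line below are placeholders,
not my actual input):

    pair_style   hybrid/overlay coul/dsf 0.2 10.0 table linear 1000
    pair_coeff   * * coul/dsf
    pair_coeff   1 1 table my_table.table ENTRY1
    bond_style   harmonic
    bond_coeff   1 300.0 1.0

    # GPU acceleration, per the GPU package docs
    package      gpu force/neigh 0 0 1
    suffix       gpu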

  Any help on this issue would be appreciated. If files are needed, just
let me know.

  Best,
  Luis

with all of that (the bonds, tabulated potentials, and hybrid/overlay
interactions), it makes little sense to run on the GPU in the first place.

axel.

are you running with double precision on the GPU?

Steve

Hi Luis,

I would suggest you build the GPU package with mixed precision (or double precision), rebuild LAMMPS, and try with one MPI process first.
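
Roughly, with the traditional make-based build that looks something like
this (the Makefile name, build target, and executable name depend on your
machine, so treat these as placeholders):

    # in lib/gpu/Makefile.linux, set the precision:
    #   CUDA_PRECISION = -D_SINGLE_DOUBLE    # mixed precision
    #   CUDA_PRECISION = -D_DOUBLE_DOUBLE    # full double precision
    cd lib/gpu
    make -f Makefile.linux clean
    make -f Makefile.linux

    cd ../../src
    make yes-gpu
    make linux

    # then test with a single MPI process, keeping the same gpu
    # package/suffix settings in the input
    mpirun -np 1 ./lmp_linux -in in.script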

-Trung

Hi, everyone

  Yes, I am running in double precision, and I tried 1, 2, 4, and 6 MPI
processes. None of them worked: the MD run blew up, although the
minimization ran fine.
  Axel, can you briefly explain why using the GPU is pointless in this
situation?

  I have another question. Originally I was trying to run a slab
geometry with 10k particles for this same system, but using coul/long
and pppm I got an insufficient-memory error on the accelerator. That was
the main reason I switched from coul/long to coul/dsf. I believed that
with a short-range potential on the GPU I would be able to run thicker
and thicker slabs, up to the memory limit of the card (GTX 580, 1.5 GB).
It turns out that I can't even run a bulk system with 3430 atoms using
dsf. So my question is: why are slabs so demanding on GPU memory with
long-range solvers? Is there a way to avoid this high memory usage other
than using single/mixed precision?
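
  Concretely, the switch I made was roughly this (the cutoffs, damping
parameter, and accuracy value are placeholders, not my actual input):

    # original long-range setup for the slab geometry
    pair_style     hybrid/overlay coul/long 10.0 table linear 1000
    kspace_style   pppm 1.0e-4
    kspace_modify  slab 3.0

    # what I switched to (no kspace solver needed)
    pair_style     hybrid/overlay coul/dsf 0.2 10.0 table linear 1000
    kspace_style   none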

  Thank you all for the responses.

  Luis

If the system is that small (3430 atoms), would you mind sending the data file and a minimal input script that reproduce the blow-up?

-Trung