I've followed your suggestion and tried the GPU package, both with CUDA and OpenCL. Both work correctly for the runs that gave erroneous results with the CUDA package.
There is just one thing that works on CPUs but not with the GPU package. I have a thin, pseudo-2D system. It's 12 Å thick, which is more than twice the cut-off radius of the potential. That system works fine on CPUs but produces erroneous results with the GPU package. Doubling the thickness makes everything work the same as on CPUs.
erroneous in which way?
Does the GPU package require a larger minimum thickness than CPU runs do?
if there were such a (known) restriction, it would be listed
under restrictions in the documentation. but such a restriction
doesn't make any sense at all in the context of MD simulations. thus,
as per usual, the way to go about this is to reduce your input to the
absolute minimum case that reproduces the issue, describe how you
determine the problem and what your compilation options are (for CPU
and GPU), and provide this information to us. i'm copying the GPU package
maintainer, Trung Nguyen, as the LAMMPS mailing list still seems to be
down after last week's file system failure at sourceforge.
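
One way to "describe how you determine the problem" is to report the first output step at which the CPU and GPU runs disagree. The sketch below is only an illustration: the function name `first_divergence`, the tolerances, and the sample numbers are all hypothetical, and it assumes you have already extracted the same thermo quantity (for example, potential energy) from both log files into two lists.

```python
import math


def first_divergence(cpu_values, gpu_values, rel_tol=1e-6, abs_tol=1e-9):
    """Return the index of the first sample where the two runs disagree.

    Returns None if every pair of samples agrees within the tolerances.
    The tolerances are assumptions; a GPU build using reduced precision
    will likely need looser values than these.
    """
    if len(cpu_values) != len(gpu_values):
        raise ValueError("runs have different numbers of samples")
    for i, (c, g) in enumerate(zip(cpu_values, gpu_values)):
        if not math.isclose(c, g, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    return None


# Made-up numbers, purely for illustration (not from any real run).
cpu = [-1000.0, -1000.5, -1001.0]
gpu = [-1000.0, -1000.5, -950.0]
print(first_divergence(cpu, gpu))
```

Since MD trajectories diverge over time from tiny floating-point differences, a mismatch in the first few steps is usually more telling than one late in a long run.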