wrong potential energy for atoms with eam/alloy/cuda

Hello people,

I'm trying to use eam/alloy/cuda to simulate He bombardment of W surfaces. Things seem to run OK, except that the per-atom potential energies in the dump files are wildly wrong. W atoms are reported with negative energies of several thousand eV, while the same system run on CPUs correctly reports them as no more negative than -9 eV. In the thermo output, the overall potential energy of the system is reported correctly, and after a 200-step test the trajectories of all the atoms are similar. So it does appear to be purely a reporting error; the calculation overall seems to work OK. Has anyone else seen this sort of behaviour with their CUDA runs before?
The lines for calculating and reporting the potential energy in the dump file are simply
compute pe all pe/atom
dump systemdump all custom 100 dumpsystem.* id type x y z vx vy vz fx fy fz c_ke c_pe

It seems like it might be a badly compiled executable. I compiled with the CUDA 6.0 toolkit environment and OpenMPI 1.5.3 compiled against the Intel compilers for mpic++. Does that raise known red flags with anyone?

I don't have access to another platform to see if the files would run OK as a CUDA job there. If anyone has a CUDA executable and could run a short test, that might clarify whether the executable I'm using is indeed broken. To any would-be volunteer, the files are in an archive at

http://dutsm1219.tudelft.net/files.tar.gz

The files include a rather large starting data file (8 million atoms); the gzipped archive is just over 200 MB.

On CPUs, the first 15 lines of the dump file after 200 steps look like

ITEM: TIMESTEP
200
ITEM: NUMBER OF ATOMS
7960841
ITEM: BOX BOUNDS pp pp ff
0 1292.66
0 1292.66
3300 4540
ITEM: ATOMS id type x y z vx vy vz fx fy fz c_ke c_pe
152929 1 53.3147 99.558 3500.23 -0.679591 5.88666 4.01369 -0.971145 2.30421 0.59066 0.488006 -1.55187
155140 1 46.1257 100.502 3500.1 0.963822 -0.0206257 -0.0437125 0.746121 2.19565 0.867417 0.00887224 -3.03569
155141 1 49.566 101.75 3500.5 -0.871121 2.19269 2.67436 -0.520946 1.31131 0.47719 0.121172 -4.09533
155142 1 52.0869 102.815 3501.08 1.07598 -3.40654 -1.70882 0.168428 -0.969382 -0.661466 0.149403 -4.91839
155143 1 54.7192 104.147 3501.14 0.251209 -1.51864 -0.750099 -1.22822 -2.25753 -0.173889 0.0279328 -4.82264
155171 1 62.5409 105.375 3500.18 -2.71839 -1.74644 6.19034 -0.912122 1.95279 0.533997 0.464529 -2.36643

With cuda, they look like

ITEM: TIMESTEP
200
ITEM: NUMBER OF ATOMS
7960841
ITEM: BOX BOUNDS pp pp ff
0 1292.66
0 1292.66
3300 4540
ITEM: ATOMS id type x y z vx vy vz fx fy fz c_ke c_pe
152929 1 53.3147 99.558 3500.23 -0.679618 5.88672 4.01371 -0.971148 2.30422 0.590659 0.488014 -4004.76
155140 1 46.1257 100.502 3500.1 0.963843 -0.0205649 -0.0436885 0.74612 2.19565 0.86742 0.00887258 -4754.93
155141 1 49.566 101.75 3500.5 -0.871135 2.19273 2.67438 -0.520946 1.31131 0.477188 0.121174 -4586.22
155142 1 52.0869 102.815 3501.08 1.07599 -3.40657 -1.70883 0.168432 -0.969385 -0.661464 0.149405 -4784.09
155143 1 54.7192 104.147 3501.14 0.251175 -1.5187 -0.750104 -1.22824 -2.25747 -0.173879 0.0279345 -4946.52
155171 1 62.5409 105.375 3500.18 -2.71842 -1.74639 6.19036 -0.91212 1.95279 0.533994 0.464531 -5010.86
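
The two excerpts can also be compared column by column to confirm that only c_pe differs beyond rounding. Below is a rough Python sketch under the same assumptions as the dump format shown here (text custom dump with an id column). The helper names read_atoms and max_column_diff are hypothetical, and the version below holds both snapshots in memory, which may be too much for the full 8 million atom files.

```python
def read_atoms(lines):
    """Return (column_names, {atom_id: [values]}) for the first snapshot."""
    names, atoms = None, {}
    for line in lines:
        if line.startswith("ITEM:"):
            if names is not None:
                break  # end of the first snapshot's atom rows
            if line.startswith("ITEM: ATOMS"):
                names = line.split()[2:]
        elif names is not None and line.strip():
            values = [float(f) for f in line.split()]
            atoms[int(values[names.index("id")])] = values
    if names is None:
        raise ValueError("no 'ITEM: ATOMS' section found")
    return names, atoms


def max_column_diff(lines_a, lines_b):
    """Largest absolute difference per column, matching atoms by id."""
    names_a, a = read_atoms(lines_a)
    names_b, b = read_atoms(lines_b)
    if names_a != names_b:
        raise ValueError("dump files have different columns")
    worst = {name: 0.0 for name in names_a}
    for atom_id in a.keys() & b.keys():  # only atoms present in both dumps
        for name, va, vb in zip(names_a, a[atom_id], b[atom_id]):
            worst[name] = max(worst[name], abs(va - vb))
    return worst


# Hypothetical trimmed rows (id, type, x, c_pe) taken from the two excerpts.
cpu = """ITEM: ATOMS id type x c_pe
152929 1 53.3147 -1.55187
155140 1 46.1257 -3.03569
"""
cuda = """ITEM: ATOMS id type x c_pe
152929 1 53.3147 -4004.76
155140 1 46.1257 -4754.93
"""
for name, diff in max_column_diff(cpu.splitlines(), cuda.splitlines()).items():
    print(name, diff)
```

Matching by id rather than by row position is deliberate, since the order of atoms in a dump is not guaranteed to be the same between runs.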

Thermo output lines on CPUs are

Step Time Temp TotEng PotEng KinEng Press Pxx Pyy
       0 0 1439.1933 -65725907 -67206865 1480957.7 -10.231257 -32.342408 8.3894784
     100 0.0046555808 1439.2932 -65725777 -67206838 1481060.5 -9.7726442 -31.659336 8.960548
     200 0.0093107682 1439.6932 -65725417 -67206889 1481472.1 -9.5194832 -31.020299 9.5275881

and with cuda

Step Time Temp TotEng PotEng KinEng Press Pxx Pyy
       0 0 1439.1933 -65725907 -67206865 1480957.7 -10.231257 -32.342408 8.3894784
     100 0.0046555808 1439.2932 -65725777 -67206838 1481060.5 -9.7726382 -31.659332 8.9605703
     200 0.0093107682 1439.6932 -65725417 -67206889 1481472.2 -9.5195495 -31.020297 9.5276006
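
As a quick sanity check on the numbers above, the thermo PotEng divided by the atom count gives the mean per-atom energy, which can be set against the per-atom values in the dumps. This is only a back-of-the-envelope Python sketch: it assumes energies in eV, and it extrapolates from one of the six listed CUDA rows, which need not be representative of all atoms.

```python
pot_eng = -67206889.0  # PotEng at step 200, from the thermo output
n_atoms = 7960841      # NUMBER OF ATOMS, from the dump header

# Mean potential energy per atom implied by the thermo output.
mean_pe = pot_eng / n_atoms
print(f"mean PE per atom: {mean_pe:.4f}")

# Least negative of the six CUDA c_pe values listed in the dump excerpt.
cuda_sample = -4004.76

# Total the system would have if every atom sat at that value; compare to PotEng.
implied_total = cuda_sample * n_atoms
print(f"implied total / thermo PotEng: {implied_total / pot_eng:.0f}")
```

The mean comes out at roughly -8.44 per atom, in line with the CPU dump values, whereas per-atom values in the thousands could not add up to the reported total.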

greets,
Peter