KOKKOS version of lj/cut/coul/long pair style inconsistent with other versions

Hi,

I recently tested the KOKKOS package (LAMMPS version 17Nov16), and the KOKKOS version of the lj/cut/coul/long pair style seems to be inconsistent with the non-KOKKOS versions. I compared the individual energies in the thermo output: with the KOKKOS version of lj/cut/coul/long, the Coulombic part is significantly different from the non-KOKKOS versions (by significantly I mean 2-5 times higher). The pressure is also of a completely different order of magnitude compared to the non-KOKKOS versions. The remaining energies (LJ, kspace, and all bonded types) are the same between the KOKKOS and non-KOKKOS versions.

The completely different Coulombic energy (and pressure) with the KOKKOS lj/cut/coul/long style is seen from step 0, where different versions of the same pair style should give nearly identical energies. The simulation then crashes with "pppm out of range atoms" after a few tens of steps.

The initial configuration is unlikely to be the problem: it has been minimized, and the simulations are very stable (they have run for millions of timesteps without ever crashing) with both the GPU and the original versions of the pair style, which report almost identical energies for the first few thousand steps.

I have performed tests with both the CUDA and OpenMP versions of KOKKOS (built with Makefile.kokkos_cuda_openmpi and Makefile.kokkos_omp), with exactly the same results.

In addition, I have tested the KOKKOS version of the lj/charmm/coul/long pair style on the same system; in this case both the behavior and the reported energies are identical between the KOKKOS and non-KOKKOS versions. So I believe the issue is specific to the KOKKOS version of the lj/cut/coul/long pair style.
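
A step-0 comparison like the one described above can be scripted rather than read off by eye. The following is only a minimal sketch: it assumes the thermo block in each log starts with a header line whose first field is "Step" and ends at the first line that is not a full row of numbers. The function names read_thermo and compare_first_step are hypothetical helpers, not part of LAMMPS.

```python
def read_thermo(log_text):
    """Return the first thermo block of a log as {column name: [values]}.

    Assumption: the block starts at a header line whose first field is
    "Step" and ends at the first line that is not a full numeric row.
    """
    columns, data = None, {}
    for line in log_text.splitlines():
        fields = line.split()
        if columns is None:
            # Skip everything before the thermo header.
            if fields and fields[0] == "Step":
                columns = fields
                data = {name: [] for name in columns}
            continue
        if len(fields) != len(columns):
            break  # e.g. the "Loop time of ..." line after the block
        try:
            values = [float(f) for f in fields]
        except ValueError:
            break  # non-numeric row: the block has ended
        for name, value in zip(columns, values):
            data[name].append(value)
    return data


def compare_first_step(ref, test, rel_tol=1e-6):
    """Return {column: (ref, test)} for shared columns whose first-row
    values differ by more than rel_tol (relative difference)."""
    mismatches = {}
    for name in ref:
        if name in test and ref[name] and test[name]:
            a, b = ref[name][0], test[name][0]
            if abs(a - b) > rel_tol * max(abs(a), abs(b), 1e-300):
                mismatches[name] = (a, b)
    return mismatches
```

To use it, read the two log files into strings (for example a log from the plain run and one from the KOKKOS run), pass each through read_thermo, and call compare_first_step on the results. Any column it returns disagrees already at step 0, before the dynamics can amplify small differences.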

Regards,

JS

juri,

thanks for your detailed feedback and careful checking. however, it is
missing the most important part: proof.
please provide (small, simple) inputs, outputs and command line flags
matching the logs, so that somebody can check this out for real.
also, please have a look at bundled example input decks and see, if
any of them reproduce the same behavior.

axel.