KOKKOS version of lj/cut/coul/long pair style inconsistent with other versions

FYI, you will get better performance on the GPU with many Kokkos styles if you use 12 or fewer atom types. With 12 or fewer types, the i-j parameters are kept in stack memory; with more than 12, they are read from global memory, which is generally slower.
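
To make the threshold concrete, here is a rough plain-C++ sketch of that kind of dispatch. It is not the actual Kokkos package code: `kMaxTypesStack`, `PairParams`, `ParamStorage`, `choose_storage`, and `ParamTable` are hypothetical names invented for illustration, and an ordinary `std::vector` stands in for GPU global memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical threshold mirroring the "12 atom types" limit mentioned above.
constexpr int kMaxTypesStack = 12;

// Hypothetical per-pair (i-j) coefficients; the real styles store more fields.
struct PairParams {
    double lj1 = 0.0;
    double lj2 = 0.0;
    double cutsq = 0.0;
};

enum class ParamStorage { Stack, Global };

// Decide where the i-j table lives based on the number of atom types.
inline ParamStorage choose_storage(int ntypes) {
    return ntypes <= kMaxTypesStack ? ParamStorage::Stack : ParamStorage::Global;
}

// Holds the i-j table either in a fixed-size member array (small type counts)
// or in a heap-allocated buffer (stand-in for global memory in this sketch).
struct ParamTable {
    PairParams fixed[kMaxTypesStack + 1][kMaxTypesStack + 1];  // types are 1-based
    std::vector<PairParams> dynamic;
    int ntypes;
    ParamStorage storage;

    explicit ParamTable(int n)
        : fixed{}, ntypes(n), storage(choose_storage(n)) {
        if (storage == ParamStorage::Global) {
            const std::size_t dim = static_cast<std::size_t>(n) + 1;
            dynamic.resize(dim * dim);
        }
    }

    // Look up the parameters for the type pair (i, j), with 1 <= i, j <= ntypes.
    PairParams& at(int i, int j) {
        if (storage == ParamStorage::Stack) {
            return fixed[i][j];
        }
        const std::size_t dim = static_cast<std::size_t>(ntypes) + 1;
        return dynamic[static_cast<std::size_t>(i) * dim + static_cast<std::size_t>(j)];
    }
};
```

In this sketch `ParamTable(12)` stays on the fixed-size path, while `ParamTable(13)` falls back to the dynamically allocated buffer. The lookup interface is the same in both cases, so only the storage location (and therefore the access cost) changes.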