kokkos cuda segfault

I’m trying to run kokkos with gpu support on the NASA pleiades cluster. I am able to compile lammps (5 Aug 2016) and get an executable using the python make command:
Make.py -j 6 -p none user-reaxc asphere molecule kspace rigid kokkos orig -kokkos cuda archgpu=35 archcpu=SNB -o kokkos_cuda -a file clean exe -m kokkos_cuda -v

I’m running on Tesla Kepler k40 SandyBridge nodes (with CUDA compute capability 3.5)

I’m compiling using:
gcc 4.9.3
cuda 7.5
SGI MPT

Upon running the in.lj example in the accelerate folder I get a segfault. My lammps command is:
mpiexec lmp_kokkos_cuda_20160804 -k on t 1 g 1 -sf kk -in in.lj -echo both

The cluster automatically outputs some debuging info from gdb which is included below. I’m not familiar with debuging segfaults but I have access to gdb and valgrind on the cluster and would be happy to run them with some help.

Anybody have any suggestions? My makefile and make output are attached.

Thanks.

-Ben

LOG FILE------------

makeoutput.txt (794 KB)

Makefile.kokkos_cuda (3.36 KB)