USER_CUDA hangs and errors with thermo_style stress output

Hi there

1) With the test input below, the calculation hangs at step 30.
2) With the newton off option, I get 2250 steps and then this error:
2250 -5.4499641e+12 5.4747937e+10 2.5939344e+11
Cuda error: Cuda_PairEAMCuda: pair Kernel 1 execution failed in file ‘pair_eam_cuda.cu’ in line 275 : unspecified launch failure.

  1. If I disable the thermo_style output of the stress calculation, everything seems normal.

  2. The number and configuration of processors and the size of the system affect the step at which the error happens.

package cuda gpu/node 3
processors 1 3 2
units metal
atom_style atomic
boundary m m m
#newton off
lattice fcc 3.51
region box block -300 300 -200 200 -200 200 units box
create_box 5 box

region top block 250 290 -190 200 -180 180 units box
create_atoms 3 region top

region left block -100 -0 -100 50 0 100 units box
create_atoms 1 region left

region right block 5 100 -100 50 0 100 units box
create_atoms 2 region right

group left region left
group right region right
pair_style eam/alloy
pair_coeff * * Fe.set Fe Fe Fe Fe Fe

thermo 10
timestep 0.001

compute strs all stress/atom
compute p left reduce sum c_strs[1] c_strs[2] c_strs[3]

fix 1 all nve

velocity left set 10 0 0 sum yes units box
velocity right set -25 0 0 sum yes units box

thermo_style custom step c_p[1] c_p[2] c_p[3]
reset_timestep 0
dump 1 all atom 100 lam.lammpstrj
run 10000
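
To help read the thermo line at step 2250 quoted above: compute stress/atom returns stress times volume (bar*Angstrom^3 in metal units), so the summed components have to be divided by a volume to give a pressure. Here is a small Python sketch of that conversion. The function name is hypothetical, and the volume is assumed to be the initial volume of region left, which is only approximate once the atoms start moving.

```python
# Convert summed per-atom stress (stress * volume) into a pressure.
# Assumption: the group volume is the initial volume of "region left".

def pressure_from_summed_stress(sxx, syy, szz, volume):
    """Return P = -(sxx + syy + szz) / (3 * V); bar if inputs are in metal units."""
    return -(sxx + syy + szz) / (3.0 * volume)

# region left block -100 -0 -100 50 0 100 units box  (lengths in Angstrom)
volume_left = (0 - (-100)) * (50 - (-100)) * (100 - 0)

# c_p[1] c_p[2] c_p[3] from the thermo line at step 2250
sxx, syy, szz = -5.4499641e12, 5.4747937e10, 2.5939344e11

pressure_bar = pressure_from_summed_stress(sxx, syy, szz, volume_left)
print(f"approximate pressure in group left: {pressure_bar:.4e} bar")
```

The same expression could be evaluated inside LAMMPS with an equal-style variable, but doing it offline avoids touching the input that triggers the problem.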


With 2 or more processes, reneighboring seems to cause something to go wrong; a simple small input is below. Can anyone confirm this?

Christian can look into this.

Steve