Does it work if you run the same executable CPU-only, i.e. without the GPUs and Kokkos, using 2 or more MPI ranks? Can you also try with OpenMPI? I can’t see anything wrong.
Stan