Hi,
I am trying to build LAMMPS (15May15) to run on a cluster with NVIDIA
Kepler GPUs. The cluster is running CentOS 6 with Intel CPUs.
I have been able to successfully compile and run LAMMPS with the GPU
and USER-OMP packages enabled. I now want to build with the Kokkos
package so I can compare the performance on the cluster. When I use
OpenMPI 1.8.5, GCC 4.8.4 and CUDA 7.0.28 to compile the attached
Makefile, the compilation succeeds with no warnings or errors. However,
when I try to execute ./lmp_kokkos_cuda, the only output is
"Segmentation Fault". The same thing happens with GCC 4.9.2. This is
the output from GDB:
(gdb) backtrace
#0 __exchange_and_add_dispatch (this=0x0, __a=...) at /data/opt/gcc-4.9.2/include/c++/4.9.2/ext/atomicity.h:84
#1 std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_dispose (this=0x0, __a=...) at /data/opt/gcc-4.9.2/include/c++/4.9.2/bits/basic_string.h:245
I also ran it through Valgrind. Here is the output:
==90365== Invalid read of size 4
==90365== at 0x3657D4: std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone .part.3] (atomicity.h:67)
==90365== Address 0x10 is not stack'd, malloc'd or (recently) free'd
Can anyone help with this problem? Is there any other debugging info
that I can provide?
Makefile.kokkos_cuda (2.89 KB)
KOKKOS is still under heavy development and considered experimental.
unless you want to do KOKKOS development yourself, you should not
install it.
before compiling with MPI, i would suggest compiling without it first,
using the internal MPI STUBS library.
axel.
I understand, and we will primarily use GPU or USER-CUDA. I would
still like to try to get KOKKOS working, though.
I recompiled it with MPI_STUBS and got the exact same error.
Stan can probably advise.
Steve
why do you use -shared under LINKFLAGS in your makefile?
that flag is used to create a shared library, not a regular
executable. thus you get the segmentation fault.
this is not a KOKKOS problem after all.
axel.
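A minimal sketch of how one might check a link-flag string for -shared before building. The helper name has_shared_flag and the example LINKFLAGS value are hypothetical; they are not taken from the attached Makefile, whose contents are not shown in this thread.

```shell
# Hypothetical helper: succeed (exit status 0) if the given link-flag
# string contains the -shared option as a separate word.
has_shared_flag() {
  # $1 is deliberately unquoted so it splits into individual flags.
  for flag in $1; do
    case "$flag" in
      -shared) return 0 ;;
    esac
  done
  return 1
}

# Example value only (an assumption, not the real Makefile setting).
LINKFLAGS="-g -O3 -shared"

if has_shared_flag "$LINKFLAGS"; then
  echo "LINKFLAGS contains -shared: the linker will produce a shared library, not an executable"
fi
```

With -shared present, the linker produces a shared object rather than a normal program, which matches the explanation above for why running the result directly fails.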
If I remove -shared I get this:
# mpicxx -std=c++11 -D__CUDA_ARCH__=350 -E -x c++
-DCUDA_DOUBLE_MATH_FUNCTIONS --std=c++11 -fopenmp -O3
-D__CUDA_PREC_DIV -D__CUDA_PREC_SQRT -I"../../lib/kokkos/core/src"
-I"../../lib/kokkos/containers/src"
-I"../../lib/kokkos/algorithms/src" -I"../../lib/kokkos/linalg/src"
-I"../" "-I/opt/cuda-7.0/bin/..//include" -m64 -g -gdwarf-2
"/tmp/tmpxft_0001ae4d_00000000-4_kokkos_depend.cudafe1.cpp" >
"/tmp/tmpxft_0001ae4d_00000000-14_kokkos_depend.ii"
# mpicxx -std=c++11 -c -x c++ --std=c++11 -fopenmp -O3
-I"../../lib/kokkos/core/src" -I"../../lib/kokkos/containers/src"
-I"../../lib/kokkos/algorithms/src" -I"../../lib/kokkos/linalg/src"
-I"../" "-I/opt/cuda-7.0/bin/..//include" -fpreprocessed -m64 -g
-gdwarf-2 -o "kokkos_depend.o"
"/tmp/tmpxft_0001ae4d_00000000-14_kokkos_depend.ii"
/usr/local/bin/ld: -f may not be used without -shared
collect2: error: ld returned 1 exit status
which is why I tried adding it.