kokkos_cuda_system_error

Dear all,

I’m trying to run the peptide example using the kokkos_cuda package.

After initialisation and step 0, the following error occurs:

terminate called after throwing an instance of 'std::runtime_error'

what(): Kokkos::Experimental::Impl::SharedAllocationRecord< Kokkos::CudaHostPinnedSpace , void >::get_record ERROR

Traceback functionality not available

Aborted

I have tested this issue on NVIDIA Tesla K80 and NVIDIA Jetson TX1.

The same error occurs when I try to simulate another molecular system with the OPLS-AA force field. However, if the Coulomb interactions are disabled, the simulation doesn’t crash.

Best wishes,

Nikolay Kondratyuk.

thanks for the feedback, but you left out the most
important piece of information: which version of LAMMPS did you use?
a new stable release of LAMMPS was made just a few days
ago. have you tried that?

also, which host compiler do you use? which cuda toolkit version?
which nvidia driver version?
which variant of KOKKOS support did you compile (CUDA only or CUDA
with OpenMP or CUDA with PThreads, Other)?
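The details asked for above can be collected in one go. The following is only a minimal sketch: it assumes that `gcc`, `nvcc` and `nvidia-smi` are the relevant tools and are on the PATH, which may not hold on every machine (some embedded boards ship without `nvidia-smi`). The function name `collect_versions` and the labels are made up for illustration.

```python
import shutil
import subprocess

# Assumed tool names and flags; adjust them to the actual installation.
DEFAULT_QUERIES = {
    "host compiler": ["gcc", "--version"],
    "CUDA toolkit": ["nvcc", "--version"],
    "NVIDIA driver": ["nvidia-smi", "--query-gpu=driver_version",
                      "--format=csv,noheader"],
}


def collect_versions(queries=None):
    """Run each query and return {label: tool output or a short note}."""
    report = {}
    for label, argv in (queries or DEFAULT_QUERIES).items():
        # Skip tools that are not installed instead of raising.
        if shutil.which(argv[0]) is None:
            report[label] = "not found"
            continue
        try:
            done = subprocess.run(argv, stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT,
                                  universal_newlines=True, timeout=30)
        except (OSError, subprocess.TimeoutExpired) as err:
            report[label] = "failed: {}".format(err)
            continue
        report[label] = done.stdout.strip() or "no output"
    return report


if __name__ == "__main__":
    for label, text in collect_versions().items():
        print("== {} ==\n{}\n".format(label, text))
```

Pasting the printed report together with the LAMMPS version line from the log and the KOKKOS build variant covers the questions in one message.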

axel.

Axel,

thank you for such a prompt response! I use KOKKOS with CUDA only. The version of LAMMPS is 12-May-2016. The system parameters are as follows:

NVIDIA Tesla K80:

  1. gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
  2. CUDA Toolkit 7.5.18
  3. CUDA Driver Version: 7.50

NVIDIA Jetson TX1:

  1. gcc 4.8.4 (Ubuntu/Linaro 4.8.4-2ubuntu1~14.04.3)
  2. CUDA Toolkit 7.0
  3. CUDA Driver Version: 7.0

The newest stable version of LAMMPS gives the same error on the Jetson.

The Makefile is attached for reference.

Best regards,
Nikolay.

Makefile.kokkos_cuda (3.15 KB)

ok, this is going to be much more helpful to the KOKKOS developers.

there is one more thing that you could do to help track down the cause:
try running the benchmark/example inputs in LAMMPS, e.g. rhodo or peptide.
if they do not work, let us know. if they do work, compare them to
your input and try to identify which part of your input (e.g. which
style) is triggering the issue. with that information, you could try
and build a minimal test input, that will very easily reproduce the
problem (doesn't have to make sense as a simulation at this point) and
post it here, so that people doing the KOKKOS programming in LAMMPS
can use it to debug and hopefully correct what is going wrong here.
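One way to narrow down which style triggers the problem is to switch the kk suffix off for a single command at a time, using the LAMMPS `suffix off` / `suffix on` input commands, and rerun each variant. The following is only a sketch under assumptions: the helper names are hypothetical, the binary name `lmp_kokkos_cuda` is taken from this thread, it is assumed that the build tolerates mixing accelerated and plain styles, and commands continued over several lines with `&` are not handled.

```python
def wrap_without_suffix(script, keyword):
    """Return a copy of a LAMMPS input script in which every command that
    starts with `keyword` is bracketed by `suffix off` / `suffix on`, so
    the -sf kk suffix is not applied to that one command."""
    out = []
    for line in script.splitlines():
        words = line.split()
        if words and words[0] == keyword:
            out += ["suffix off", line, "suffix on"]
        else:
            out.append(line)
    return "\n".join(out) + "\n"


def make_variants(script, keywords):
    """Map each keyword to a variant with that command left un-accelerated."""
    return {kw: wrap_without_suffix(script, kw) for kw in keywords}


def kokkos_command(input_file, lmp="lmp_kokkos_cuda"):
    """Command line as used in this thread: one GPU, kk suffix everywhere."""
    return [lmp, "-k", "on", "g", "1", "-sf", "kk", "-in", input_file]


if __name__ == "__main__":
    # Illustrative two-line fragment, not a complete input script.
    demo = ("pair_style lj/charmm/coul/long 8.0 10.0 10.0\n"
            "kspace_style pppm 0.0001\n")
    for kw, text in make_variants(demo, ["pair_style", "kspace_style"]).items():
        print("# variant with plain {}:".format(kw))
        print(text)
    print(" ".join(kokkos_command("in.peptide")))
```

If the crash disappears for exactly one variant, the command that was wrapped in that variant is a good candidate for the minimal test input.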

axel.

Axel,

Below is the output for the original in.peptide (lammps-30Jul16/examples/peptide/in.peptide):

…/…/src/lmp_kokkos_cuda -k on g 1 -sf kk -in in.peptide

LAMMPS (30 Jul 2016)

KOKKOS mode is enabled (…/kokkos.cpp:38)

using 1 GPU(s)

Reading data file …

orthogonal box = (36.8402 41.0137 29.7681) to (64.2116 68.3851 57.1395)

1 by 1 by 1 MPI processor grid

reading atoms …

2004 atoms

reading velocities …

2004 velocities

scanning bonds …

4 = max bonds/atom

scanning angles …

13 = max angles/atom

scanning dihedrals …

30 = max dihedrals/atom

scanning impropers …

2 = max impropers/atom

reading bonds …

1365 bonds

reading angles …

786 angles

reading dihedrals …

207 dihedrals

reading impropers …

12 impropers

Finding 1-2 1-3 1-4 neighbors …

Special bond factors lj: 0 0 0

Special bond factors coul: 0 0 0

4 = max # of 1-2 neighbors

7 = max # of 1-3 neighbors

14 = max # of 1-4 neighbors

18 = max # of special neighbors

Finding SHAKE clusters …

19 = # of size 2 clusters

6 = # of size 3 clusters

3 = # of size 4 clusters

640 = # of frozen angles

84 atoms in group peptide

PPPM initialization …

WARNING: Using 12-bit tables for long-range coulomb (…/kspace.cpp:316)

G vector (1/distance) = 0.268725

grid = 15 15 15

stencil order = 5

estimated absolute RMS force accuracy = 0.0228209

estimated relative force accuracy = 6.87243e-05

using double precision FFTs

3d grid and FFT values/proc = 10648 3375

Neighbor list info …

1 neighbor list requests

update every 1 steps, delay 5 steps, check yes

max neighbors/atom: 2000, page size: 100000

master list distance cutoff = 12

ghost atom cutoff = 12

binsize = 6, bins = 5 5 5

Setting up Verlet run …

Unit style : real

Current step : 0

Time step : 2

WARNING: Fixes cannot send data in Kokkos communication, switching to classic communication (…/comm_kokkos.cpp:365)

SHAKE stats (type/ave/delta) on step 0

4 1.111 1.44264e-05 36

6 0.996998 7.26967e-06 12

8 1.08 1.32536e-05 18

10 1.111 1.22749e-05 24

12 1.08 1.11767e-05 18

14 0.96 0 2

18 0.957206 4.37979e-05 3840

31 104.519 0.00396029

Memory usage per processor = 30.2199 Mbytes

---------------- Step 0 ----- CPU = 0.0000 (sec) ----------------

TotEng = -5237.4580 KinEng = 1134.9186 Temp = 282.1005

PotEng = -6372.3766 E_bond = 16.5572 E_angle = 36.3726

E_dihed = 15.5190 E_impro = 1.9426 E_vdwl = 692.8945

E_coul = 26772.2646 E_long = -33907.9271 Press = -837.0112

terminate called after throwing an instance of 'std::runtime_error'

what(): Kokkos::Experimental::Impl::SharedAllocationRecord< Kokkos::CudaHostPinnedSpace , void >::get_record ERROR

Traceback functionality not available

Aborted

Thanks for the info. The Kokkos version of long-range Coulombics will be getting a major overhaul soon, with threaded FFTs and threaded PPPM. I’ll take a look.

Stan