Output with 1 GPU
LAMMPS (29 Oct 2020)
KOKKOS mode is enabled (…/kokkos.cpp:90)
will use up to 1 GPU(s) per node
WARNING: Detected MPICH. Disabling CUDA-aware MPI (…/kokkos.cpp:272)
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0.0000000 0.0000000 0.0000000) to (107.49416 107.49416 107.49416)
2 by 2 by 2 MPI processor grid
Created 1048576 atoms
create_atoms CPU = 0.036 seconds
Neighbor list info …
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 2.8
ghost atom cutoff = 2.8
binsize = 2.8, bins = 39 39 39
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut/kk, perpetual
attributes: full, newton off, kokkos_device
pair build: full/bin/kk/device
stencil: full/bin/3d
bin: kk/device
Setting up Verlet run …
Unit style : lj
Current step : 0
Time step : 0.005
Per MPI rank memory allocation (min/avg/max) = 24.45 | 24.45 | 24.45 Mbytes
Step Temp E_pair E_mol TotEng Press
0 1.44 -6.7733681 0 -4.6133701 -5.0196704
100000 0.69306427 -5.6721158 0 -4.6325204 0.71741445
Loop time of 1195.89 on 8 procs for 100000 steps with 1048576 atoms
Performance: 36123.863 tau/day, 83.620 timesteps/s
68.6% CPU use with 8 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
Pair | 230.94 | 250.65 | 286.46 | 120.3 | 20.96
Neigh | 27.562 | 47.007 | 62.883 | 165.7 | 3.93
Comm | 506.94 | 534.24 | 555.57 | 59.9 | 44.67
Output | 0.0014031 | 0.0043901 | 0.013426 | 6.7 | 0.00
Modify | 316.82 | 356.9 | 373.42 | 89.3 | 29.84
Other | | 7.075 | | | 0.59
Nlocal: 131072.0 ave 131128 max 130962 min
Histogram: 1 0 0 1 1 0 0 1 2 2
Nghost: 45405.1 ave 45449 max 45372 min
Histogram: 1 3 1 0 0 0 0 1 0 2
Neighs: 0.00000 ave 0 max 0 min
Histogram: 8 0 0 0 0 0 0 0 0 0
FullNghs: 9.83080e+06 ave 9.84135e+06 max 9.81286e+06 min
Histogram: 1 0 0 1 1 0 0 2 1 2
Total # of neighbors = 78646384
Ave neighs/atom = 75.003036
Neighbor list builds = 5000
Dangerous builds not checked
Total wall time: 0:19:59

Output with 2 GPUs
LAMMPS (29 Oct 2020)
KOKKOS mode is enabled (…/kokkos.cpp:90)
will use up to 2 GPU(s) per node
WARNING: Detected MPICH. Disabling CUDA-aware MPI (…/kokkos.cpp:272)
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0.0000000 0.0000000 0.0000000) to (107.49416 107.49416 107.49416)
2 by 2 by 2 MPI processor grid
Created 1048576 atoms
create_atoms CPU = 0.036 seconds
Neighbor list info …
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 2.8
ghost atom cutoff = 2.8
binsize = 2.8, bins = 39 39 39
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut/kk, perpetual
attributes: full, newton off, kokkos_device
pair build: full/bin/kk/device
stencil: full/bin/3d
bin: kk/device
Setting up Verlet run …
Unit style : lj
Current step : 0
Time step : 0.005
Per MPI rank memory allocation (min/avg/max) = 24.45 | 24.45 | 24.45 Mbytes
Step Temp E_pair E_mol TotEng Press
0 1.44 -6.7733681 0 -4.6133701 -5.0196704
100000 0.69293473 -5.6719934 0 -4.6325923 0.71721185
Loop time of 1193.63 on 8 procs for 100000 steps with 1048576 atoms
Performance: 36192.000 tau/day, 83.778 timesteps/s
68.5% CPU use with 8 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
Pair | 214.59 | 253.82 | 285.6 | 129.8 | 21.26
Neigh | 43.685 | 49.943 | 55.428 | 50.5 | 4.18
Comm | 487.38 | 529.66 | 568.44 | 110.0 | 44.37
Output | 0.001313 | 0.0047092 | 0.011212 | 5.5 | 0.00
Modify | 326.58 | 353.03 | 364.6 | 60.6 | 29.58
Other | | 7.179 | | | 0.60
Nlocal: 131072.0 ave 131218 max 130943 min
Histogram: 1 0 1 1 2 1 1 0 0 1
Nghost: 45387.2 ave 45502 max 45234 min
Histogram: 1 0 1 1 0 0 1 3 0 1
Neighs: 0.00000 ave 0 max 0 min
Histogram: 8 0 0 0 0 0 0 0 0 0
FullNghs: 9.82991e+06 ave 9.84944e+06 max 9.80783e+06 min
Histogram: 1 0 0 1 2 1 0 2 0 1
Total # of neighbors = 78639318
Ave neighs/atom = 74.996298
Neighbor list builds = 5000
Dangerous builds not checked
Total wall time: 0:19:58
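
To put a number on the difference between the two runs, here is a minimal sketch that pulls the "Loop time" line out of each log and compares them. The two input strings are copied from the logs above; the helper names (`parse_loop_line`, `steps_per_second`) and the regular expression are my own illustration, not part of LAMMPS.

```python
import re

# Pattern for the LAMMPS summary line, e.g.
# "Loop time of 1195.89 on 8 procs for 100000 steps with 1048576 atoms"
LOOP_RE = re.compile(
    r"Loop time of (?P<seconds>[\d.]+) on (?P<procs>\d+) procs "
    r"for (?P<steps>\d+) steps with (?P<atoms>\d+) atoms"
)


def parse_loop_line(line):
    """Return the numbers from a 'Loop time' line as a dict (hypothetical helper)."""
    match = LOOP_RE.search(line)
    if match is None:
        raise ValueError("not a LAMMPS 'Loop time' line: %r" % line)
    return {
        "seconds": float(match.group("seconds")),
        "procs": int(match.group("procs")),
        "steps": int(match.group("steps")),
        "atoms": int(match.group("atoms")),
    }


def steps_per_second(run):
    """Timesteps per second, computed as steps divided by loop time."""
    return run["steps"] / run["seconds"]


# Lines copied verbatim from the two logs above.
one_gpu = parse_loop_line(
    "Loop time of 1195.89 on 8 procs for 100000 steps with 1048576 atoms"
)
two_gpu = parse_loop_line(
    "Loop time of 1193.63 on 8 procs for 100000 steps with 1048576 atoms"
)

# Speedup of the 2-GPU run relative to the 1-GPU run (1.0 means no change).
speedup = one_gpu["seconds"] / two_gpu["seconds"]

print("1 GPU : %.3f timesteps/s" % steps_per_second(one_gpu))
print("2 GPUs: %.3f timesteps/s" % steps_per_second(two_gpu))
print("speedup: %.4f" % speedup)
```

For these two logs the ratio comes out at about 1.002, so the second GPU changes the loop time by roughly 0.2%. Both timing breakdowns also report Comm at about 44% of the total and the same 8 MPI tasks, which is the part of the output I would look at first.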