Problems with tersoff/kk potential in pair_style hybrid/kk using KOKKOS

Hi people,

I’m currently working on a project where I want to simulate a graphene sheet on top of a Si crystal substrate. I use the Tersoff potential for the interatomic forces in the graphene sheet, since this should be able to run with KOKKOS on a computer cluster available at my university. Unfortunately, it doesn’t work when the Tersoff potential is used with pair_style hybrid/kk. If I simulate the graphene sheet alone with the standard pair_style tersoff/kk, everything looks fine. I have written a simple LAMMPS script that reproduces the problem. Note that if I remove the “/kk” suffixes from the script, it runs perfectly fine on the CPU. The script creates two graphene sheets: the interatomic forces in the first sheet are modeled with the Tersoff potential, and those in the second with a dummy Lennard-Jones (LJ) potential. The interactions between the sheets are also modeled with an LJ potential. The script (simple_reproduce.in) reads as follows:

######################################
units metal
newton on 
boundary p p m  
atom_style atomic 

# Graphene lattice
lattice custom 2.419 &
        a1    0                 1.0     0     &
        a2    $(sqrt(3)/2)      0.5     0     &
        a3    $(1/(2*sqrt(3)))  0.5     0.83  &
        basis 0                 0       0     &
        basis $(1/3)            $(1/3)  0.0

# Simulation box
region simreg1 block 0 20 0 20 0 0.5  
region simreg2 block 0 20 0 20 4.5 5  
region merge union 2 simreg1 simreg2

create_box 2 merge 
create_atoms 1 region simreg1 basis 1 1
create_atoms 2 region simreg2 basis 2 2


# Dynamics            
variable temp equal 50.0 # Kelvin                  
mass 1 12.0107  
mass 2 12.0107 

velocity all create ${temp} 5432373 dist gaussian

pair_style hybrid/kk tersoff/kk lj/cut/kk 2.0 
pair_coeff * * tersoff/kk C.tersoff C NULL # <----The line that causes issues
pair_coeff 1 2 lj/cut/kk 1 1                 
pair_coeff 2 2 lj/cut/kk 1 1                 

timestep 0.001
fix nve all nve 

# Output
thermo 100
run 1000
######################################

I try to run the LAMMPS script with the following Slurm job script:

######################################
#!/bin/bash

#SBATCH --job-name=Debug
#
#SBATCH --partition=normal
#
#SBATCH --ntasks=1
#
#SBATCH --cpus-per-task=2
#
#SBATCH --gres=gpu:1
#
#SBATCH --output=slurm.out
#

mpirun -n 1 lmp -pk kokkos newton on neigh half -k on g 1 -sf kk -in simple_reproduce.in
######################################

It crashes immediately with the following message:

######################################
LAMMPS (10 Feb 2021)
KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:92)
will use up to 1 GPU(s) per node
using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 2.7932206 4.8380000 2.0077700
Created orthogonal box = (0.0000000 0.0000000 0.0000000) to (55.864412 96.760000 10.038850)
1 by 1 by 1 MPI processor grid
Created 2159 atoms
create_atoms CPU = 0.002 seconds
Created 2118 atoms
create_atoms CPU = 0.002 seconds
[bigfacet:515305] *** Process received signal ***
[bigfacet:515305] Signal: Segmentation fault (11)
[bigfacet:515305] Signal code: Address not mapped (1)
[bigfacet:515305] Failing at address: 0x21
[bigfacet:515305] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fccd3dca420]
[bigfacet:515305] [ 1] lmp(+0x13e9d27)[0x562d1d413d27]
[bigfacet:515305] [ 2] lmp(+0x93a91b)[0x562d1c96491b]
[bigfacet:515305] [ 3] lmp(+0x777803)[0x562d1c7a1803]
[bigfacet:515305] [ 4] lmp(+0x19e026)[0x562d1c1c8026]
[bigfacet:515305] [ 5] lmp(+0x1a5c8e)[0x562d1c1cfc8e]
[bigfacet:515305] [ 6] lmp(+0x1a5f05)[0x562d1c1cff05]
[bigfacet:515305] [ 7] lmp(+0x14170e)[0x562d1c16b70e]
[bigfacet:515305] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fccd3618083]
[bigfacet:515305] [ 9] lmp(+0x181f9e)[0x562d1c1abf9e]
[bigfacet:515305] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 515305 on node bigfacet exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
######################################

Judging from earlier posts, you are probably going to ask me for more information about my installation on the cluster, but I’m not really sure what to share here. While I wait for a response, I am going to try to build LAMMPS with the GPU package and see if I’m able to run it like that.

I hope that you might be able to give me some hints on this one.

Best regards
Mikkel

I have no problem running your example input with the current release version (3 November 2022). Since there is no C.tersoff file in the LAMMPS distribution, I substituted the SiC.tersoff file.

LAMMPS (3 Nov 2022)
KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:106)                                                                            
  will use up to 1 GPU(s) per node                                                                                            
  using 1 OpenMP thread(s) per MPI task                                                                                       
Lattice spacing in x,y,z = 2.7932206 4.838 2.00777                                                                            
Created orthogonal box = (0 0 0) to (55.864412 96.76 10.03885)                                                                
  1 by 1 by 1 MPI processor grid                                                                                              
Created 2159 atoms                                                                                                            
  using lattice units in orthogonal box = (0 0 0) to (55.864412 96.76 10.03885)                                               
  create_atoms CPU = 0.001 seconds                                                                                            
Created 2118 atoms                                                                                                            
  using lattice units in orthogonal box = (0 0 0) to (55.864412 96.76 10.03885)                                               
  create_atoms CPU = 0.002 seconds                                                                                            
Reading tersoff potential file SiC.tersoff with DATE: 2011-04-26                                                                
Neighbor list info ...                                                                                                        
  update: every = 1 steps, delay = 0 steps, check = yes                                                                       
  max neighbors/atom: 2000, page size: 100000                                                                                 
  master list distance cutoff = 4.1                                                                                           
  ghost atom cutoff = 4.1                                      
  binsize = 4.1, bins = 14 24 3        
  4 neighbor lists, perpetual/occasional/extra = 4 0 0                                                                        
  (1) pair tersoff/kk, perpetual, skip from (3)                                                                               
      attributes: full, newton on, kokkos_device                                                                              
      pair build: skip/kk/device                               
      stencil: none            
      bin: none                
  (2) pair lj/cut/kk, perpetual, skip from (4)                                                                                
      attributes: half, newton on, kokkos_device, cut 4                                                                       
      pair build: skip/kk/device                               
      stencil: none            
      bin: none                
  (3) neighbor class addition, perpetual                                                                                      
      attributes: full, newton on, kokkos_device                                                                              
      pair build: full/bin/kk/device                           
      stencil: full/bin/3d                                     
      bin: kk/device                                           
  (4) neighbor class addition, perpetual, half/full trim from (3)                                                             
      attributes: half, newton on, kokkos_device, cut 4                                                                       
      pair build: halffull/newton/trim/kk/device                                                                              
      stencil: none            
      bin: none                
Setting up Verlet run ...                                      
  Unit style    : metal          
  Current step  : 0
  Time step     : 0.001                                        
Per MPI rank memory allocation (min/avg/max) = 2.951 | 2.951 | 2.951 Mbytes                                                   
   Step          Temp          E_pair         E_mol          TotEng         Press          Volume                             
         0   50            -15781.513      0             -15753.877      120940.96      54275.259                             
       100   2722.2501     -17284.856      0             -15780.223      110334.14      62988.322                             
       200   3347.6603     -17691.839      0             -15841.533      39788.317      72508.016                             
       300   3882.6785     -18040.723      0             -15894.703      17862.831      89650.849                             
       400   4132.38       -18193.335      0             -15909.301      15471.372      117070.36                             
       500   4182.2313     -18232.791      0             -15921.203      10604.496      143749.57                             
       600   4268.5304     -18285.279      0             -15925.992      9026.3116      196315.02                             
       700   4295.2436     -18298.525      0             -15924.473      5091.3252      244618.38                             
       800   4320.3859     -18311.555      0             -15923.607      2902.4629      293430.19                             
       900   4423.2769     -18379.743      0             -15934.926      4111.35        347835.03                             
      1000   4444.7757     -18398.491      0             -15941.791      3837.3006      397333.09                             
Loop time of 0.446228 on 1 procs for 1000 steps with 4277 atoms                                                               

Performance: 193.623 ns/day, 0.124 hours/ns, 2241.005 timesteps/s, 9.585 Matom-step/s                                         
99.2% CPU use with 1 MPI tasks x 1 OpenMP threads                                                                             

MPI task timing breakdown:                                     
Section |  min time  |  avg time  |  max time  |%varavg| %total                                                               
---------------------------------------------------------------                                                               
Pair    | 0.22599    | 0.22599    | 0.22599    |   0.0 | 50.65                                                                
Neigh   | 0.082702   | 0.082702   | 0.082702   |   0.0 | 18.53                                                                
Comm    | 0.076283   | 0.076283   | 0.076283   |   0.0 | 17.10                                                                
Output  | 0.00046349 | 0.00046349 | 0.00046349 |   0.0 |  0.10                                                                
Modify  | 0.025467   | 0.025467   | 0.025467   |   0.0 |  5.71                                                                
Other   |            | 0.03532    |            |       |  7.92                                                                

Nlocal:           4277 ave        4277 max        4277 min                                                                    
Histogram: 1 0 0 0 0 0 0 0 0 0                                 
Nghost:           1138 ave        1138 max        1138 min                                                                    
Histogram: 1 0 0 0 0 0 0 0 0 0                                 
Neighs:              0 ave           0 max           0 min                                                                    
Histogram: 1 0 0 0 0 0 0 0 0 0                                 

Total # of neighbors = 0                                       
Ave neighs/atom = 0            
Neighbor list builds = 74                                      
Dangerous builds = 0                                           
Total wall time: 0:00:00         

Thanks for the fast reply. My C.tersoff file looks like this:

C C C 3.0 1.0 0.0 3.8049e4 4.3484 -0.57058 0.72751 1.5724e-7 2.2119 346.74 1.95 0.15 3.4879 1393.6

I think it might just be a subset of the SiC.tersoff file, though. If you cannot see any problems with this format, I guess I might need to rebuild LAMMPS on the cluster.

There are small numeric differences because your parameter B is set to 346.74, while the SiC.tersoff file has 346.70. The format is correct.
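
For anyone checking their own file against this, here is a sketch of how that single entry lines up with the column order documented for pair_style tersoff potential files. The two comment lines are added here for illustration and are not part of the original C.tersoff; the parameter values are the ones quoted above.

```
# element triplet, then 14 parameters in this order:
# m  gamma  lambda3  c  d  costheta0  n  beta  lambda2  B  R  D  lambda1  A
C C C 3.0 1.0 0.0 3.8049e4 4.3484 -0.57058 0.72751 1.5724e-7 2.2119 346.74 1.95 0.15 3.4879 1393.6
```

Counting across, the tenth number after the three element names is B (346.74 in this file), which is the value the reply refers to.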

Yes. It looks like you are running into a bug that has since been fixed.

One more recommendation: don’t use the /kk suffix in the input file if you are using the -sf kk command line flag anyway. That will make it easier to run the same input for testing without KOKKOS.
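
A minimal sketch of that recommendation applied to the pair section of simple_reproduce.in, assuming the rest of the script stays unchanged. The styles are written without the suffix, and the -sf kk flag already present in the job script selects the KOKKOS variants at run time.

```
# Same pair setup as in the original script, with the /kk suffixes removed
pair_style hybrid tersoff lj/cut 2.0
pair_coeff * * tersoff C.tersoff C NULL
pair_coeff 1 2 lj/cut 1 1
pair_coeff 2 2 lj/cut 1 1
```

Launched with -k on g 1 -sf kk as in the job script, this should pick up the /kk styles; leaving those flags off runs the identical file with the plain CPU styles, which makes it easy to compare the two.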

Thanks for the fast reply, akohlmey, and for the tip regarding the /kk suffix. I’ll post an update here when I have tried rebuilding and running again.

Should be fixed here: “Fix issues in Kokkos Tersoff and Stillinger-Weber pair styles” by stanmoore1 (Pull Request #3214, lammps/lammps on GitHub).

Yes, and updating to the newest version did indeed fix it. Thanks for the help, both of you.
