This document briefly describes how to port simple pair styles to KOKKOS. While KOKKOS looks very intimidating, I have come to realize that porting simple truncated pair styles is not that hard because it is pretty much a copy paste job. I am writing this working with pair_style yukawa as an example. !! THIS EXAMPLE WILL ONLY WORK WITH TRUNCATED !! !! PAIR STYLES THAT DO NOT NEED LONG RANGE SOLVERS !! 1. Copy the header of the pair style you want to port to a kokkos-named file: cp pair_yukawa.h pair_yukawa_kokkos.h Change the include guard name, and include pair_kokkos.h, neigh_list_kokkos.h and the pair style you are kokkos-ifying. 2. Your new class should a. inherit the original pair publicly b. be templated for a class DeviceType: template class PairYukawaKokkos : public PairYukawa { ... }; Don't forget to rename the constructor and destructor! 3. Make sure you "declare" three pair styles at the top: PairStyle(yukawa/kk, PairYukawaKokkos) PairStyle(yukawa/kk/device, PairYukawaKokkos) PairStyle(yukawa/kk/host, PairYukawaKokkos) 4. Make sure the original pair you are inheriting from has its destructor declared virtual! THIS IS THE ONLY CHANGE YOU SHOULD BE MAKING TO THE ORIGINAL PAIR STYLE! 5. Your Kokkos-style should override the following member functions: - void compute(int, int); - void settings(int, char**); - double init_one(int,int); 6. It helps if your style also has the following helper member functions: - void init_style(); 7. Add a struct publicly for the potential parameters with some KOKKOS macros: struct params_yukawa { KOKKOS_INLINE_FUNCTION params_yukawa(){ cutsq=0, a = 0, offset = 0; } KOKKOS_INLINE_FUNCTION params_yukawa(int i){ params_yukawa(); } F_FLOAT cutsq, a, offset; }; You should only add the pointer types here that store the coefficients for pair interactions, _not_ "global" values like kappa or cut_global that appear in the pair_style command instead. ------ Now we get to the KOKKOS magic ------- 8. Add two public enums and a typedef: - enum {EnabledNeighFlags=FULL|HALFTHREAD|HALF|N2}; - enum {COUL_FLAG=0}; - typedef DeviceType device_type; 9. Add the following protected member functions. compute_ecoul return 0 since this pair style only has a short-ranged "Van der Waals" (VDWL) part. - void cleanup_copy(); - template KOKKOS_INLINE_FUNCTION F_FLOAT compute_fpair(const F_FLOAT& rsq, const int& i, const int&j, const int& itype, const int& jtype) const; - template KOKKOS_INLINE_FUNCTION F_FLOAT compute_evdwl(const F_FLOAT& rsq, const int& i, const int&j, const int& itype, const int& jtype) const; - template KOKKOS_INLINE_FUNCTION F_FLOAT compute_ecoul(const F_FLOAT& rsq, const int& i, const int&j, const int& itype, const int& jtype) const { return 0; } 10. Copy-paste this whole snippet, making sure you replace params_yukawa with the name of your param struct and PairYukawaKokkos with the name of your KOKKOS pair style: " Kokkos::DualView k_params; typename Kokkos::DualView::t_dev_const_um params; params_yukawa m_params[MAX_TYPES_STACKPARAMS+1][MAX_TYPES_STACKPARAMS+1]; F_FLOAT m_cutsq[MAX_TYPES_STACKPARAMS+1][MAX_TYPES_STACKPARAMS+1]; typename ArrayTypes::t_x_array_randomread x; typename ArrayTypes::t_x_array c_x; typename ArrayTypes::t_f_array f; typename ArrayTypes::t_int_1d_randomread type; DAT::tdual_efloat_1d k_eatom; DAT::tdual_virial_array k_vatom; typename ArrayTypes::t_efloat_1d d_eatom; typename ArrayTypes::t_virial_array d_vatom; typename ArrayTypes::t_tagint_1d tag; int newton_pair; double special_lj[4]; typename ArrayTypes::tdual_ffloat_2d k_cutsq; typename ArrayTypes::t_ffloat_2d d_cutsq; int neighflag; int nlocal,nall,eflag,vflag; void allocate(); friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend class PairComputeFunctor; friend EV_FLOAT pair_compute_neighlist(PairYukawaKokkos*,NeighListKokkos*); friend EV_FLOAT pair_compute_neighlist(PairYukawaKokkos*,NeighListKokkos*); friend EV_FLOAT pair_compute_neighlist(PairYukawaKokkos*,NeighListKokkos*); friend EV_FLOAT pair_compute_neighlist(PairYukawaKokkos*,NeighListKokkos*); friend EV_FLOAT pair_compute(PairYukawaKokkos*,NeighListKokkos*); friend void pair_virial_fdotr_compute(PairYukawaKokkos*); " ------ That was it for the header. Don't worry, ------- ------ it doesn't get more difficult than this. ------- 11. Add a contributing author section to get credit for your hard work. 12. Include your new kokkos header as well as some other required headers: #include "pair_yukawa_kokkos.h" #include "kokkos.h" #include "atom_kokkos.h" 13. Use the namespace MathConst and define the following macros: using namespace MathConst; #define KOKKOS_CUDA_MAX_THREADS 256 #define KOKKOS_CUDA_MIN_BLOCKS 8 14. Implement the constructor and destructor: template PairYukawaKokkos::PairYukawaKokkos(LAMMPS *lmp) : PairYukawa(lmp) { respa_enable = 0; atomKK = (AtomKokkos *) atom; execution_space = ExecutionSpaceFromDevice::space; datamask_read = X_MASK | F_MASK | TYPE_MASK | ENERGY_MASK | VIRIAL_MASK; datamask_modify = F_MASK | ENERGY_MASK | VIRIAL_MASK; cutsq = NULL; } /* ---------------------------------------------------------------------- */ template PairYukawaKokkos::~PairYukawaKokkos() { if (allocated) { memory->destroy_kokkos(k_eatom,eatom); memory->destroy_kokkos(k_vatom,vatom); k_cutsq = DAT::tdual_ffloat_2d(); memory->sfree(cutsq); eatom = NULL; vatom = NULL; cutsq = NULL; } } 15. Implement the following "boiler-plate" for allocation and cleanup of copies: /* ---------------------------------------------------------------------- */ template void PairYukawaKokkos::cleanup_copy() { // WHY needed: this prevents parent copy from deallocating any arrays allocated = 0; cutsq = NULL; eatom = NULL; vatom = NULL; } /* ---------------------------------------------------------------------- allocate all arrays ------------------------------------------------------------------------- */ template void PairYukawaKokkos::allocate() { PairYukawa::allocate(); int n = atom->ntypes; memory->destroy(cutsq); memory->create_kokkos(k_cutsq,cutsq,n+1,n+1,"pair:cutsq"); d_cutsq = k_cutsq.template view(); k_params = Kokkos::DualView( "PairYukawa::params",n+1,n+1); params = k_params.template view(); } 16. Implement member functions that set up the style and coefficients. Pay special attention to how init_one accesses the parameters, and make sure settings checks for the right number of args (it should match the normal style, so check the normal PairStyle to make sure the numbers match. /* ---------------------------------------------------------------------- global settings ------------------------------------------------------------------------- */ template void PairYukawaKokkos::settings(int narg, char **arg) { if (narg > 2) error->all(FLERR,"Illegal pair_style command"); PairYukawa::settings(1,arg); } ------ As you can see, most of the work we did is simply replacing ------ ------ one pair name for the other and changing the names of the ------ ------ pair parameters. It is a lot of typing but not "hard". The ------ ------ last part is porting the compute member function. ------ 17. Port the compute member function. In KOKKOS, this function delegates most of the calculations to the compute_fpair, compute_evdwl and compute_coul functors (the last one returning 0 because there is no long-range force). For compute itself you really only need to rename the pair name. template void PairYukawaKokkos::compute(int eflag_in, int vflag_in) { eflag = eflag_in; vflag = vflag_in; if (neighflag == FULL) no_virial_fdotr_compute = 1; if (eflag || vflag) ev_setup(eflag,vflag,0); else evflag = vflag_fdotr = 0; // reallocate per-atom arrays if necessary if (eflag_atom) { memory->destroy_kokkos(k_eatom,eatom); memory->create_kokkos(k_eatom,eatom,maxeatom,"pair:eatom"); d_eatom = k_eatom.view(); } if (vflag_atom) { memory->destroy_kokkos(k_vatom,vatom); memory->create_kokkos(k_vatom,vatom,maxvatom,6,"pair:vatom"); d_vatom = k_vatom.view(); } atomKK->sync(execution_space,datamask_read); k_cutsq.template sync(); k_params.template sync(); if (eflag || vflag) atomKK->modified(execution_space,datamask_modify); else atomKK->modified(execution_space,F_MASK); x = atomKK->k_x.view(); c_x = atomKK->k_x.view(); f = atomKK->k_f.view(); type = atomKK->k_type.view(); tag = atomKK->k_tag.view(); nlocal = atom->nlocal; nall = atom->nlocal + atom->nghost; newton_pair = force->newton_pair; special_lj[0] = force->special_lj[0]; special_lj[1] = force->special_lj[1]; special_lj[2] = force->special_lj[2]; special_lj[3] = force->special_lj[3]; // loop over neighbors of my atoms EV_FLOAT ev = pair_compute,void >( this,(NeighListKokkos*)list); if (eflag_global) eng_vdwl += ev.evdwl; if (vflag_global) { virial[0] += ev.v[0]; virial[1] += ev.v[1]; virial[2] += ev.v[2]; virial[3] += ev.v[3]; virial[4] += ev.v[4]; virial[5] += ev.v[5]; } if (vflag_fdotr) pair_virial_fdotr_compute(this); if (eflag_atom) { k_eatom.template modify(); k_eatom.template sync(); } if (vflag_atom) { k_vatom.template modify(); k_vatom.template sync(); } } 18. Port compute_fpair, compute_evdwl and compute_coul, when/if necessary. Again, rename the pair and parameter names, but _do not_ delegate the computation to the original pair style! Just "copy-paste" the formula as much as possible. !! NOTE THAT compute_fpair DOES _NOT_ RETURN THE FORCE BUT FORCE/R !! 19. The final thing is to define the template specialization in your source file by adding the following at the bottom (note the LAMMPS_NS namespace!) namespace LAMMPS_NS { template class PairYukawaKokkos; #ifdef KOKKOS_HAVE_CUDA template class PairYukawaKokkos; #endif } 20. That, in principle, is it. Try to compile LAMMPS and see if it works. If you get compilation errors, something is wrong but it is a real pain to sift through all the compiler warning messages. Silly things like a typo in a variable inm compute_evdwl can lead to 400 pages of compiler output, but the actual kokkos errors are fairly explanatory. Just compile like this to filter out all the cuda warnings: make kokkos_cuda_mpi 2>&1 | grep error can be useful, and keep in mind that the chances of making mistakes in copying the boiler-plate should be fairly limited as the only real changes are in the parameter and class names. Some general pointers: - If you get segfaults after a run, it is probably because some of the member functions you implement in the KOKKOS pair style are not virutal. Notorious ones are the destructor and void allocate(). They _need_ to be virtual in the _base_ class! If allocate is not virtual in the base class, you tend to get segfaults during init(). Notes to self/ask Stan: - Why does Pair*Kokkos check narg before calling settings? - A lot of it is just boiler-plating and renaming. Can we template more? I know compile times will suck but... - There is no factor_lj in compute_fpair/compute_evdwl. Are these applied later? - Which headers are really required?