Hello,
I am trying to set up a number of LAMMPS test cases to measure performance on our GPU systems. I have built LAMMPS with GPU support and tried to run the bench_lj benchmark, which I downloaded some time ago from the Benchmarks website. I run it like this:
mpiexec -np 1 lmp_nas.v100 -sf gpu -pk gpu 1 -v x 4 -v y 4 -v z 8 -v t 1000
and get the error above with the following stacktrace:
MPT: #5 0x000000000084318d in LAMMPS_AL::Neighbor::init(LAMMPS_AL::NeighborShared*, int, int, int, int, ucl_cudadr::UCL_Device&, int, int, bool, int, int, int, int, int, bool, std::string const&, bool) ()
MPT: #6 0x000000000082ec7c in LAMMPS_AL::Device<float, double>::init_nbor(LAMMPS_AL::Neighbor*, int, int, int, int, int, int, double, bool, int, bool) ()
MPT: #7 0x0000000000860eb3 in LAMMPS_AL::BaseAtomic<float, double>::init_atomic(int, int, int, int, double, double, _IO_FILE*, void const*, char const*, int) ()
MPT: #8 0x000000000084cce9 in LAMMPS_AL::LJ<float, double>::init(int, double**, double**, double**, double**, double**, double**, double*, int, int, int, int, double, double, _IO_FILE*) ()
MPT: #9 0x00000000008399ea in ljl_gpu_init(int, double**, double**, double**, double**, double**, double**, double*, int, int, int, int, double, int&, _IO_FILE*) ()
MPT: #10 0x0000000000760e32 in LAMMPS_NS::PairLJCutGPU::init_style (this=0x1e2888e0)
I searched online and found a report of a similar problem; the advice there was to consult the LAMMPS developers. My questions are:
- Am I running the test correctly? Maybe my flags are incorrect?
- Have you seen similar problems reported in the past? Maybe there is some sort of workaround?
- Could you advise on some basic sanity tests to check if my installation is ok?
Many thanks in advance for any guidance on this issue.
Gabriele