Stack trace from Segmentation Fault

Dear LAMMPS users,

I am experiencing a segmentation fault when trying to run my simulation.

My setup is as follows: Ubuntu 12.04 32-bit, two NVIDIA GTX 580 cards, CUDA 4.0, and the most recent version of LAMMPS (10Feb2015), built using make fftw.
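For reference, the GPU library and package were installed along these lines before the main build (a rough sketch from memory; the exact makefile name and settings in my tree may differ):

  cd lib/gpu
  # GTX 580 is Fermi (compute capability 2.0), hence sm_20
  make -f Makefile.linux CUDA_ARCH="-arch=sm_20"
  cd ../../src
  make yes-gpu      # install the GPU package sources
  make fftw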

The stack trace is as follows:

Initializing Device and compiling on process 0…
Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) where
#0 0x00000000 in ?? ()
#1 0x0936fa86 in allocScanStorage ()
#2 0x09328a1a in CUDPPScanPlan::CUDPPScanPlan(CUDPPConfiguration, unsigned int, unsigned int, unsigned int) ()
#3 0x09328b1f in CUDPPRadixSortPlan::CUDPPRadixSortPlan(CUDPPConfiguration, unsigned int) ()
#4 0x09328c7e in cudppPlan ()
#5 0x0924ce28 in LAMMPS_AL::Atom<float, double>::add_fields(bool, bool, int, bool, bool) ()
#6 0x09226120 in LAMMPS_AL::Device<float, double>::init(LAMMPS_AL::Answer<float, double>&, bool, bool, int, int, int, LAMMPS_AL::Neighbor*, int, int, int, double, bool, int, bool) ()
#7 0x09264a39 in LAMMPS_AL::BaseCharge<float, double>::init_atomic(int, int, int, int, double, double, _IO_FILE*, void const*, char const*) ()
#8 0x092b762b in LAMMPS_AL::CHARMMLong<float, double>::init(int, double, double**, double**, double**, double**, double**, double*, int, int, int, int, double, double, _IO_FILE*, double, double, double*, double, double, double, double, double**, double**, bool) ()
#9 0x0924123f in crml_gpu_init(int, double, double**, double**, double**, double**, double**, double*, int, int, int, int, double, int&, _IO_FILE*, double, double, double*, double, double, double, double, double**, double**, bool) ()
#10 0x089697d1 in LAMMPS_NS::PairLJCharmmCoulLongGPU::init_style (this=0x6b4e7d60) at …/pair_lj_charmm_coul_long_gpu.cpp:200
#11 0x088747ee in LAMMPS_NS::Pair::init (this=0x6b4e7d60) at …/pair.cpp:221
#12 0x085d7bee in LAMMPS_NS::Force::init (this=0x6b428da8) at …/force.cpp:119
#13 0x086687ae in LAMMPS_NS::LAMMPS::init (this=0x6b412218) at …/lammps.cpp:696
#14 0x08deae06 in LAMMPS_NS::Run::command (this=0xbfffec20, narg=1, arg=0x6b433a18) at …/run.cpp:169
#15 0x086604f0 in LAMMPS_NS::Input::command_creator<LAMMPS_NS::Run> (lmp=0x6b412218, narg=1, arg=0x6b433a18) at …/input.cpp:631
#16 0x0865b8d7 in LAMMPS_NS::Input::execute_command (this=0x6b412820) at …/input.cpp:614
#17 0x0865cdcf in LAMMPS_NS::Input::file (this=0x6b412820) at …/input.cpp:225
#18 0x0826d153 in main (argc=3, argv=0xbfffee24) at …/main.cpp:31

If anyone has any suggestions as to what might be going wrong here, I would be most grateful.

I’ve attached my script and input. The config (data) file is quite big, so let me know if you need it.

Thank you in advance.

Sarah-Beth Amos

MRes student in Biophysics
King’s College London

equil.sh (132 Bytes)

pleu4_lipids.equil (2.1 KB)

Hi Sarah-Beth,

Does the segfault persist if you run with a single MPI process and only one GPU? Next, can you run examples/accelerate/in.rhodo successfully using the same number of MPI processes and GPUs as you are now (i.e., 2 MPI procs on 2 GPUs)?
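For example, something like this (the binary name is a guess — substitute your own executable; if your input script already selects the gpu pair style and has a package gpu command, drop the -sf/-pk switches):

  # one MPI process on one GPU, with your own input
  mpirun -np 1 ./lmp_linux -sf gpu -pk gpu 1 -in pleu4_lipids.equil

  # the rhodopsin example on 2 procs / 2 GPUs
  mpirun -np 2 ./lmp_linux -sf gpu -pk gpu 2 -in in.rhodo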

Also, can you show the whole screen output, including the number of atoms, bonds, etc.? In any case, it would help debugging if you could generate a small config (data file) that reproduces the segfault.
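One way to carve a small test system out of the full one is something like the sketch below (region bounds and file names are placeholders, and it needs the same units/atom_style/style header as your equil script; the mol yes option, if your version supports it, removes whole molecules so no bonds are cut):

  # header (units, atom_style, pair/bond/angle styles) as in your equil script
  read_data pleu4_lipids.data
  region keep block 0 20 0 20 0 20 units box
  group inside region keep
  group outside subtract all inside
  delete_atoms group outside mol yes
  write_data small_test.data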

-Trung