Hi LAMMPS Users
I am trying to install the GPU version of LAMMPS. The build with the GPU package succeeded and produced the executable lmp_mesabi. Details of the machine architecture and LAMMPS version are at the end of this email.
However, when I try to run it, it fails with the error:
error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory
From reading previous threads, I understand that this is a dynamic linking problem.
When I run ldd on the executable, I get:
vsethura@…7239… [~/mylammps/src] % ldd lmp_mesabi
linux-vdso.so.1 => (0x00007ffdf1bfa000)
/lib64/snoopy.so (0x00007f2941afd000)
libmpi.so.12 => /panfs/roc/intel/x86_64/2016/parallel_studio_xe_msi/compilers_and_libraries_2016.3.210/linux/mpi/intel64/lib/libmpi.so.12 (0x00007f294132e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2941111000)
libjpeg.so.62 => /usr/lib64/libjpeg.so.62 (0x00007f2940ec1000)
libcudart.so.8.0 => /panfs/roc/msisoft/cuda/8.0/lib64/libcudart.so.8.0 (0x00007f2940c5b000)
libcuda.so.1 => not found
libdl.so.2 => /lib64/libdl.so.2 (0x00007f2940a57000)
libstdc++.so.6 => /panfs/roc/msisoft/gcc/4.9.2_2/lib64/libstdc++.so.6 (0x00007f2940745000)
libmkl_intel_lp64.so => /panfs/roc/intel/x86_64/2016/parallel_studio_xe_msi/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f293fc35000)
libmkl_core.so => /panfs/roc/intel/x86_64/2016/parallel_studio_xe_msi/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/libmkl_core.so (0x00007f293e224000)
libmkl_sequential.so => /panfs/roc/intel/x86_64/2016/parallel_studio_xe_msi/compilers_and_libraries_2016.3.210/linux/mkl/lib/intel64/libmkl_sequential.so (0x00007f293d54b000)
libm.so.6 => /lib64/libm.so.6 (0x00007f293d2c7000)
libmpifort.so.12 => /panfs/roc/intel/x86_64/2016/parallel_studio_xe_msi/compilers_and_libraries_2016.3.210/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f293cf29000)
librt.so.1 => /lib64/librt.so.1 (0x00007f293cd21000)
libgcc_s.so.1 => /panfs/roc/msisoft/gcc/4.9.2_2/lib64/libgcc_s.so.1 (0x00007f293cb0b000)
libc.so.6 => /lib64/libc.so.6 (0x00007f293c777000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2941cfe000)
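For reference, the unresolved entries can be pulled out of a listing like this with a small filter. This is only a sketch: the function name missing_libs is my own, and it assumes the usual "name => not found" format that ldd prints.

```shell
# Hypothetical helper: read ldd-style output on stdin and print the names
# of the libraries that the dynamic loader could not resolve.
missing_libs() {
    # Lines for unresolved libraries look like "libfoo.so.1 => not found";
    # the first whitespace-separated field is the library name.
    awk '/=> not found/ { print $1 }'
}

# Usage (assumes lmp_mesabi is in the current directory):
#   ldd lmp_mesabi | missing_libs
```

For the listing here, this would report libcuda.so.1 only.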
So libcuda.so.1 is the one library that is not found.
However, I did create a symlink with that name pointing to the library file. For instance, when I execute,
vsethura@…7239… [~/mylammps/lib/gpu] % ls -l libcuda.so.1
lrwxrwxrwx 1 vsethura dorfmank 50 Nov 15 11:30 libcuda.so.1 -> /panfs/roc/msisoft/cuda/8.0/lib64/stubs/libcuda.so
which I would expect to mean that it is correctly linked.
Further, following Axel’s suggestion in one of the previous posts, I also added the -Wl,-rpath flag to Makefile.lammps. The following is my Makefile.lammps from the lib/gpu folder:
gpu_SYSINC =
gpu_SYSLIB = -lcudart -lcuda
gpu_SYSPATH = -L/panfs/roc/msisoft/cuda/8.0/lib64/stubs -Wl,-rpath,/path/panfs/roc/msisoft/cuda/8.0/lib64/stubs
I also added the stubs directory to LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/panfs/roc/msisoft/cuda/8.0/lib64/stubs
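To check whether any directory on LD_LIBRARY_PATH actually contains a file with the exact name the loader asks for (libcuda.so.1), something like the following sketch could be used. The helper name find_lib_in_path is hypothetical, and the code assumes a POSIX sh or bash shell.

```shell
# Hypothetical helper: print every directory in a colon-separated search
# path that contains a file with the given name.
# The body runs in a subshell "( ... )" so the IFS change does not leak
# into the calling shell.
find_lib_in_path() (
    lib_name=$1
    search_path=$2
    IFS=:
    for dir in $search_path; do
        # Skip empty entries produced by leading, trailing, or doubled colons
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir"
        fi
    done
)

# Look for the exact name reported as "not found" by ldd
find_lib_in_path libcuda.so.1 "${LD_LIBRARY_PATH:-}"
```

If this prints nothing, no directory on the search path holds a file called libcuda.so.1.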
It would be great if anyone could point out what I am doing wrong, or whether I am missing something obvious.
More details:
ARCHITECTURE: MESABI SUPERCOMPUTER - KEPLER K40 nodes sm_35
LAMMPS Version: LAMMPS (22 Sep 2017)
Thanks in advance
Vaidyanathan M S
Postdoctoral Research Assistant
University of Minnesota, Twin Cities