[lammps-users] kokkos linker error (Nvidia): libdl.a

jewettaij · March 11, 2022, 7:26pm

Hello lammps users

I’m running into a linker error compiling the KOKKOS/CUDA version of LAMMPS using cmake:

nvlink fatal : Could not open input file '/usr/lib/x86_64-linux-gnu/libdl.a'

The error may be specific to the way the NVIDIA compiler and drivers were installed on my computer, but I’d love to have that confirmed. I don’t want to burden everyone with figuring out how to fix it. Once that’s confirmed, I can direct my efforts to fixing that problem (instead of continuing to play around with my cmake settings).

Incidentally, I have no trouble compiling LAMMPS with the GPU package (using “cmake -D GPU_API=cuda -D GPU_PREC=single -D GPU_ARCH=sm_80 -D PKG_GPU=yes …/cmake”).
I only run into problems compiling with KOKKOS.

Here is the procedure I’m using to compile KOKKOS enabled LAMMPS:


mkdir build-kokkos-cuda

cd build-kokkos-cuda/

cmake \
-C ../cmake/presets/basic.cmake \
-C ../cmake/presets/kokkos-cuda.cmake \
../cmake

cmake --build .

…and here is an excerpt of the log from the build process:

[ 0%] Building CXX object lib/kokkos/core/src/CMakeFiles/kokkoscore.dir/impl/Kokkos_CPUDiscovery.cpp.o

...
[100%] Built target lammps
[100%] Linking CXX executable lmp

nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvlink warning : Skipping incompatible '/usr/lib/x86_64-linux-gnu/libdl.a' when searching for -ldl
nvlink fatal : Could not open input file '/usr/lib/x86_64-linux-gnu/libdl.a'
gmake[2]: *** [CMakeFiles/lmp.dir/build.make:118: lmp] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:1026: CMakeFiles/lmp.dir/all] Error 2
gmake: *** [Makefile:149: all] Error 2

More background information:
I am using Nvidia RTX 3060 (laptop) hardware.
I am using the default NVIDIA binaries and libraries bundled with Ubuntu 21.10, so other people using this environment are likely to run into the same problem. Here is the “nvcc” binary I am using:

$ which nvcc
/usr/bin/nvcc

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0

Thanks in advance!
Andrew

akohlmey · March 11, 2022, 7:58pm

You didn’t say which LAMMPS version you are trying to compile.
Please also provide the entire output of the CMake configuration run.

jewettaij · March 11, 2022, 9:19pm

Hi Axel

Update: This might be relevant. Recent changes in libc. (Apparently libc v2.34 incorporates the features of libdl now.)

“In order to support smoother in-place-upgrades and to simplify
the implementation of the runtime all functionality formerly
implemented in the libraries libpthread, libdl, libutil, libanl has
been integrated into libc. New applications do not need to link with
-lpthread, -ldl, -lutil, -lanl anymore. For backwards compatibility,
empty static archives libpthread.a, libdl.a, libutil.a, libanl.a are
provided, so that the linker options keep working.”
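
Whether a given system falls under this change can be checked from its glibc version. The following is a minimal sketch, not part of the original report: the function name glibc_has_merged_libdl is made up for illustration, and it assumes that "getconf GNU_LIBC_VERSION" is available, as it is on glibc-based systems.

```shell
# Succeeds if the glibc version passed as $1 (e.g. "2.34") is 2.34 or newer,
# i.e. a release in which libdl was merged into libc.
# (Hypothetical helper name, for illustration only.)
glibc_has_merged_libdl() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 34 ]; }
}

# getconf prints something like "glibc 2.34"; keep only the version number.
version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')

if [ -n "$version" ] && glibc_has_merged_libdl "$version"; then
    echo "glibc $version: libdl.a is expected to be an empty placeholder"
else
    echo "glibc ${version:-unknown}: libdl is probably still a separate library"
fi
```

The comparison is done on the major and minor numbers separately, so a string comparison mistake such as treating "2.4" as newer than "2.34" is avoided.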

I noticed that the “/usr/lib/x86_64-linux-gnu/libdl.a” file exists on my system, but it is suspiciously short (8 bytes long).
$ ls -l /usr/lib/x86_64-linux-gnu/libdl.a

-rw-r--r-- 1 root root 8 Feb 24 11:45 /usr/lib/x86_64-linux-gnu/libdl.a

(When the file is viewed in a text editor, it only contains: “!<arch>”)

In earlier versions of ubuntu (v20.04), this file was 15kb long.
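
An "ar" archive with no members consists solely of the magic string "!<arch>" followed by a newline, which is exactly 8 bytes and matches the size reported above. Here is a small sketch of how such a file can be recognized. It runs against a stand-in file in a temporary directory rather than the system library, the variable names are illustrative, and "ar" is assumed to be installed.

```shell
# Create a stand-in for the placeholder libdl.a: just the ar magic, no members.
workdir=$(mktemp -d)
printf '!<arch>\n' > "$workdir/libdl.a"

# Size in bytes, and the number of object files the archive lists.
size=$(wc -c < "$workdir/libdl.a" | tr -d ' ')
members=$(ar t "$workdir/libdl.a" | wc -l | tr -d ' ')

echo "size=$size bytes, members=$members"
```

Pointing the same two commands at /usr/lib/x86_64-linux-gnu/libdl.a should show whether the installed file is such a placeholder.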

For completeness, I attached the log for the configuration step. (Let me know if you also want the log of the compilation step.)

Regarding the version of LAMMPS I am using, I ran into this problem when compiling the version of LAMMPS pulled from github on 2022-3-03.

If I run across a way to get around this issue with libdl, I will post it.
Cheers and thanks!

Andrew

kokkos_cmake_log_2022-3-11.txt (4.11 KB)

akohlmey · March 12, 2022, 7:58am

This sounds more like a packaging problem on Ubuntu then. I have a desktop running Fedora 35 and it also contains glibc-2.34 and the almost empty libdl.a file, but in addition also a corresponding libdl.so file and thus I don’t see any linker command issues about incompatible libraries.

Axel.

akohlmey · March 12, 2022, 8:13am

[…]

For completeness, I attached the log for the configuration step. (Let me know if you also want the log of the compilation step.)

thanks. please note that you will have to edit the kokkos-cuda.cmake preset file to match your GPU architecture.
you are currently compiling for MAXWELL50. Unlike the GPU package (which can use the JIT compiler to build a version of the kernel at runtime since all required data is included), KOKKOS requires providing the exact architecture.
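
As an alternative to editing the preset file, the architecture can in principle be overridden on the command line. The sketch below only assembles and prints such a configure command so it can be reviewed before running it. Several things here are assumptions rather than facts from this thread: that the RTX 3060 corresponds to the Kokkos option Kokkos_ARCH_AMPERE86 (compute capability 8.6), that the preset enables Kokkos_ARCH_MAXWELL50 under exactly that name, and that -D options given after the -C presets take precedence. If the override does not take effect, editing the preset file as described above is the safer route.

```shell
# Dry run: build the configure command as a string and print it
# instead of executing it.
gpu_arch=AMPERE86        # assumption: RTX 3060 is compute capability 8.6
preset_arch=MAXWELL50    # architecture named in the reply above

configure_cmd="cmake \
-C ../cmake/presets/basic.cmake \
-C ../cmake/presets/kokkos-cuda.cmake \
-D Kokkos_ARCH_${preset_arch}=off \
-D Kokkos_ARCH_${gpu_arch}=on \
../cmake"

echo "$configure_cmd"
```

The old architecture is switched off explicitly because only one GPU architecture is meant to be selected at a time.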

Regarding the version of LAMMPS I am using, I ran into this problem when compiling the version of LAMMPS pulled from github on 2022-3-03.

If I run across a way to get around this issue with libdl, I will post it.

you can create a dummy libdl.so simply by doing:

touch empty.c
gcc -o libdl.so -shared -fpic empty.c

and then put it where the linker can find it. it might be sufficient to do this in the build folder.

axel.

jewettaij · March 13, 2022, 9:45am

you can create a dummy libdl.so simply by doing:

touch empty.c
gcc -o libdl.so -shared -fpic empty.c

Thanks. That’s useful to know about.

I don’t know if it’s relevant, but there is a non-empty file in the same directory where the original “libdl.a” file was located named “libdl.so.2”. It appears to be the correct size (14kb):

$ ls -al /usr/lib/x86_64-linux-gnu/libdl*
-rw-r--r-- 1 root root 8 Feb 24 11:45 /usr/lib/x86_64-linux-gnu/libdl.a
-rw-r--r-- 1 root root 14432 Feb 24 11:45 /usr/lib/x86_64-linux-gnu/libdl.so.2

and then put it where the linker can find it. it might be sufficient to do this in the build folder.

Thanks for the help, Axel!

Alas, compiling this version of “libdl.so” and putting that file in my “build-kokkos-cuda/” directory (and then re-running “cmake --build .”) didn’t work; I still get the same error message. Nor did adding that directory to my LD_LIBRARY_PATH (using: export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/lammps_2022-3-03/build-kokkos-cuda").

I’m not in a hurry to get kokkos working. A new stable long-term version of ubuntu will be released in a month. I’ll try upgrading my OS and see if the issue persists. (And I’ll report back when I do.) If you have other suggestions, I’m also happy to try them.

Andrew

akohlmey · March 13, 2022, 11:26am

[…]

I don’t know if it’s relevant, but there is a non-empty file in the same directory where the original “libdl.a” file was located named “libdl.so.2”. It appears to be the correct size (14kb):

$ ls -al /usr/lib/x86_64-linux-gnu/libdl*
-rw-r--r-- 1 root root 8 Feb 24 11:45 /usr/lib/x86_64-linux-gnu/libdl.a
-rw-r--r-- 1 root root 14432 Feb 24 11:45 /usr/lib/x86_64-linux-gnu/libdl.so.2

that file is needed for backward compatibility. executables linked on older distributions require that file to be present.
it is likely empty or just a wrapper since the relevant functions are already present in libc.so.6. The larger size stems from the fact that any shared library needs to have that kind of size. an empty executable would be the same size, if not larger.

I’m not in a hurry to get kokkos working. A new stable long-term version of ubuntu will be released in a month. I’ll try upgrading my OS and see if the issue persists. (And I’ll report back when I do.) If you have other suggestions, I’m also happy to try them.

Here is an alternate suggestion to work around this issue. Inside your build folder, please do the following:

rm -f libdl.so empty.c
touch empty.c
gcc -fpic -c empty.c
ar rcsv libdl.a empty.o
cmake -DLIBDL_LIBRARY=$PWD/libdl.a .
make

Axel.

jewettaij · March 13, 2022, 10:14pm

Yes! That worked.
Thank you very much Axel.

Andrew

P.S. Here’s a transcript of the process:

$ cmake -DLIBDL_LIBRARY=$PWD/libdl.a .

touch empty.c
gcc -fpic -c empty.c
ar rcsv libdl.a empty.o
cmake -DLIBDL_LIBRARY=$PWD/libdl.a .
a - empty.o
-- Running check for auto-generated files from make-based build system
-- KOKKOS: Enabling CUDA LAMBDA function support
-- Setting default Kokkos CXX standard to 14
-- Setting policy CMP0074 to use _ROOT variables
-- The project name is: Kokkos
-- Compiler Version: 11.3.109
-- Using -std=c++14 for C++14 standard as feature
-- Built-in Execution Spaces:
-- Device Parallel: Kokkos::Cuda
-- Host Parallel: Kokkos::OpenMP
-- Host Serial: SERIAL