Fatal error: EGL/egl.h: No such file or directory

mdigennaro · January 13, 2022, 4:54pm

Dear everyone,
I am trying to compile LAMMPS with the GPU package using CMake.

I have the following versions:

$ gmake --version
GNU Make 4.3
$ cmake --version
cmake version 3.22.1

The command I am using is the following:
cmake -D PKG_GPU=on ../cmake && cmake --build .
which produces the following error:

[  1%] Performing build step for 'opencl_loader'
CMake Error at .../lammps-29Sep2021/build_GPU/opencl_loader-prefix/src/opencl_loader-stamp/opencl_loader-build-RelWithDebInfo.cmake:49 (message):
  Command failed: 2

   '/usr/bin/gmake'

  See also

    .../lammps-29Sep2021/build_GPU/opencl_loader-prefix/src/opencl_loader-stamp/opencl_loader-build-*.log

which contains this error:

 fatal error: EGL/egl.h: No such file or directory
 #include <EGL/egl.h>

I have found this solution on the web, which works (maybe) only for windows:

Can anyone help me here?
PS: the plain “cmake --build .” works fine, while enabling KOKKOS produces the same error.
Thank you

Marco

akohlmey · January 13, 2022, 4:57pm

What platform are you compiling on?
What are your exact CMake commands?

akohlmey · January 13, 2022, 6:57pm

I am having trouble making sense of the error. Nothing in the OpenCL loader code includes this header file directly. It usually comes bundled with the OpenGL development packages.
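
A quick way to see whether the header exists at all on the build node is to look for it in the usual include directories. The following is a minimal sketch, not something posted in the thread: the helper name find_header is hypothetical, and /usr/include and /usr/local/include are assumed defaults that may not match a cluster with module-managed software.

```shell
# Report which of the given include directories contain a header.
# Usage: find_header <relative/header.h> <dir>...
# (find_header is a hypothetical helper name used only for illustration.)
find_header() {
    header="$1"
    shift
    found=1                      # exit status: 1 until a match is seen
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            echo "$dir/$header"
            found=0
        fi
    done
    return "$found"
}

# The directories below are common defaults, not paths taken from the thread.
find_header EGL/egl.h /usr/include /usr/local/include \
    || echo "EGL/egl.h not found in the listed directories"
```

If the header is missing everywhere, the next question is which file in the failing build asked for it; the opencl_loader-build-*.log named in the error message is the place to look.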

mdigennaro · January 14, 2022, 8:16am

I am compiling on a Linux cluster, specifically on an NVIDIA GPU node with CUDA 11. More information below.
Regarding the cmake command, I have the following:

cmake ../cmake 
cmake --build .    #everything works

while when using the GPU option:

 $  cmake -D PKG_GPU=on ../cmake
-- Appending /cm/local/apps/cuda/libs/current/lib64:/cm/shared/apps/cuda11.1/toolkit/11.1.1/targets/x86_64-linux/lib:/cm/shared/apps/slurm/18.08.9/lib64/slurm:/cm/shared/apps/slurm/18.08.9/lib64 to CMAKE_LIBRARY_PATH: /cm/local/apps/cuda/libs/current/lib64:/cm/shared/apps/cuda11.1/toolkit/11.1.1/targets/x86_64-linux/lib:/cm/shared/apps/slurm/18.08.9/lib64/slurm:/cm/shared/apps/slurm/18.08.9/lib64
-- Running check for auto-generated files from make-based build system
-- Checking for module 'mpi-cxx'
--   No package 'mpi-cxx' found
-- Downloading and building OpenCL loader library
-- Generating style headers...
-- Generating package headers...
-- Generating lmpinstalledpkgs.h...
-- Could NOT find ClangFormat (missing: ClangFormat_EXECUTABLE) (Required is at least version "8.0")
-- The following tools and libraries have been found and configured:
 * Git
 * OpenMP

-- <<< Build configuration >>>
   Operating System: Linux CentOS Linux 7
   Build type:       RelWithDebInfo
   Install path:     /home/mdi0316/.local
   Generator:        Unix Makefiles using /usr/bin/gmake
-- Enabled packages: GPU
-- <<< Compilers and Flags: >>>
-- C++ Compiler:     /cm/local/apps/gcc/8.2.0/bin/c++
      Type:          GNU
      Version:       8.2.0
      C++ Flags:     -O2 -g -DNDEBUG
      Defines:       LAMMPS_SMALLBIG;LAMMPS_MEMALIGN=64;LAMMPS_OMP_COMPAT=3;LAMMPS_GZIP;LAMMPS_FFMPEG;LMP_GPU
-- <<< Linker flags: >>>
-- Executable name:  lmp
-- Static library flags:
-- <<< GPU package settings >>>
-- GPU API:                  OPENCL
-- GPU precision:            MIXED
-- Configuring done
-- Generating done
-- Build files have been written to: /home/mdi0316/CODES/lammps-29Sep2021/build_GPU

then building fails:

$  cmake --build .
[  1%] Built target variable.h
Consolidate compiler generated dependencies of target mpi_stubs
[  1%] Built target mpi_stubs
[  1%] Performing build step for 'opencl_loader'
CMake Error at /home/mdi0316/CODES/lammps-29Sep2021/build_GPU/opencl_loader-prefix/src/opencl_loader-stamp/opencl_loader-build-RelWithDebInfo.cmake:49 (message):
  Command failed: 2

   '/usr/bin/gmake'

  See also

    /home/mdi0316/CODES/lammps-29Sep2021/build_GPU/opencl_loader-prefix/src/opencl_loader-stamp/opencl_loader-build-*.log


gmake[2]: *** [opencl_loader-prefix/src/opencl_loader-stamp/opencl_loader-build] Error 1
gmake[1]: *** [CMakeFiles/opencl_loader.dir/all] Error 2
gmake: *** [all] Error 2

Here’s more information on the gpu card.

$ nvidia-smi
Fri Jan 14 09:06:46 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         On   | 00000000:04:00.0 Off |                    0 |
| N/A   31C    P8    29W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         On   | 00000000:05:00.0 Off |                    0 |
| N/A   31C    P8    31W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K20Xm         On   | 00000000:08:00.0 Off |                    0 |
| N/A   31C    P8    30W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K20Xm         On   | 00000000:09:00.0 Off |                    0 |
| N/A   33C    P8    30W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla K20Xm         On   | 00000000:84:00.0 Off |                    0 |
| N/A   30C    P8    30W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla K20Xm         On   | 00000000:85:00.0 Off |                    0 |
| N/A   31C    P8    31W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla K20Xm         On   | 00000000:88:00.0 Off |                    0 |
| N/A   32C    P8    30W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla K20Xm         On   | 00000000:89:00.0 Off |                    0 |
| N/A   32C    P8    30W / 235W |      0MiB /  5700MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

could the problem be the CMAKE_LIBRARY_PATH?
I have installed it locally and there is no global one.

Thank you
Marco

akohlmey · January 14, 2022, 9:30am

You are not compiling for CUDA but for OpenCL, which is the default for the GPU package. Please see the details in the LAMMPS manual.

mdigennaro · January 14, 2022, 12:42pm

Hello, thank you for spotting my mistake.

I am now trying:

cmake -D PKG_GPU=on -D GPU_API=cuda  ../cmake
...
-- Configuring done
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDA_LIBRARY (ADVANCED)
    linked by target "nvc_get_devices" in directory /home/mdi0316/CODES/lammps-29Sep2021/cmake
    linked by target "gpu" in directory /home/mdi0316/CODES/lammps-29Sep2021/cmake

-- Generating done
CMake Generate step failed.  Build files cannot be regenerated correctly.

Therefore I am trying to set CUDA_CUDA_LIBRARY explicitly, as follows:

 cmake -D PKG_GPU=on -D GPU_API=cuda -D CUDA_CUDA_LIBRARY=/cm/shared/apps/cuda11.1/toolkit/11.1.1/lib64  ../cmake
...
-- <<< GPU package settings >>>
-- GPU API:                  CUDA
-- CUDA Compiler:            /cm/shared/apps/cuda11.1/toolkit/11.1.1/bin/nvcc
-- GPU default architecture: sm_50
-- GPU binning with CUDPP:   OFF
-- CUDA MPS support:         OFF
-- GPU precision:            MIXED
-- Configuring done
WARNING: Target "gpu" requests linking to directory "/cm/shared/apps/cuda11.1/toolkit/11.1.1/lib64".  Targets may link only to libraries.  CMake is dropping the item.
WARNING: Target "lammps" requests linking to directory "/cm/shared/apps/cuda11.1/toolkit/11.1.1/lib64".  Targets may link only to libraries.  CMake is dropping the item.
WARNING: Target "lmp" requests linking to directory "/cm/shared/apps/cuda11.1/toolkit/11.1.1/lib64".  Targets may link only to libraries.  CMake is dropping the item.
WARNING: Target "nvc_get_devices" requests linking to directory "/cm/shared/apps/cuda11.1/toolkit/11.1.1/lib64".  Targets may link only to libraries.  CMake is dropping the item.
-- Generating done
-- Build files have been written to: /home/mdi0316/CODES/lammps-29Sep2021/build_GPU_CUDA

Can you tell me how I can properly link these libraries?
Moreover, I cannot find the Tesla architecture explicitly listed in the documentation. Would that be a problem?

Once again, thank you
MDG

akohlmey · January 14, 2022, 12:55pm

That is only the second best option. You should rather fix your CUDA toolkit installation to properly set the necessary environment variables (CUDA_HOME, PATH, LD_LIBRARY_PATH etc.)

You are obviously not using -DCUDA_CUDA_LIBRARY correctly. The error message tells you what the issue is. Just use a bit of common sense here.

A summary of the supported Nvidia architectures is given in section 3.7, “Packages with extra build options”, of the LAMMPS documentation.

When compiling with CMake LAMMPS will attempt to build so-called “fat” executables with PTX code included for all architectures supported by the CUDA toolkit in use, so the choice of the correct architecture is less crucial. Please note that your hardware is about to be phased out by the non-legacy nvidia drivers and not going to be supported by future CUDA toolkits.
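
For the Tesla K20Xm cards shown in the nvidia-smi output (Kepler generation, compute capability 3.5), the primary architecture can be set with the GPU_ARCH option instead of relying on the sm_50 default reported by the configure step. The sketch below only prints the configure line rather than running it; the function name lammps_cuda_configure is hypothetical, and sm_35 is an assumption based on the card model, not something stated in the thread.

```shell
# Print (do not run) a configure line for the CUDA backend of the GPU package.
# Usage: lammps_cuda_configure <arch> <path-to-lammps-cmake-dir>
# (lammps_cuda_configure is a hypothetical helper name.)
lammps_cuda_configure() {
    arch="$1"      # e.g. sm_35, assumed here for Kepler cards such as the K20Xm
    srcdir="$2"    # path to the cmake/ directory of the LAMMPS source tree
    echo "cmake -D PKG_GPU=on -D GPU_API=cuda -D GPU_ARCH=$arch $srcdir"
}

lammps_cuda_configure sm_35 ../cmake
```

Printing the command first makes it easy to review before pasting it into a fresh build directory.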

mdigennaro · January 14, 2022, 2:06pm

Dear Axel,

I now used this, which made the compilation work:

-DCUDA_CUDA_LIBRARY=/cm/shared/apps/cuda11.1/toolkit/11.1.1/targets/x86_64-linux/lib/stubs/libcuda.so

Nevertheless, I’d like to understand what was wrong with my environment variables.
Could something be interfering? (cuda current = cuda 11.1.1, so that should not be the problem)

$ echo $CUDA_PATH
/cm/shared/apps/cuda11.1/toolkit/11.1.1
$ echo $CUDA_HOME
/cm/shared/apps/cuda11.1/toolkit/11.1.1
$ echo $LD_LIBRARY_PATH
/cm/local/apps/cuda/libs/current/lib64:/cm/shared/apps/cuda11.1/toolkit/11.1.1/targets/x86_64-linux/lib:/cm/local/apps/gcc/8.2.0/lib:/cm/local/apps/gcc/8.2.0/lib64:/cm/shared/apps/slurm/18.08.9/lib64/slurm:/cm/shared/apps/slurm/18.08.9/lib64:/home/mdi0316/lib:/cm/shared/apps/hdf5/1.10.1/lib/:/cm/shared/apps/mvapich2/gcc/64/2.3.2/lib
$ echo $PATH
/cm/shared/apps/abinit-8.10.3/src/98_main:/home/mdi0316/anaconda3/condabin:/home/mdi0316/bin:/cm/shared/apps/TURBOMOLE/bin/em64t-unknown-linux-gnu_smp:/cm/shared/apps/TURBOMOLE/scripts:/cm/local/apps/cuda/libs/current/bin:/cm/shared/apps/cuda11.1/sdk/11.1.1/bin/x86_64/linux/release:/cm/shared/apps/cuda11.1/toolkit/11.1.1/bin:/cm/local/apps/gcc/8.2.0/bin:/cm/shared/apps/slurm/18.08.9/sbin:/cm/shared/apps/slurm/18.08.9/bin:/cm/local/apps/environment-modules/4.2.1/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/sbin:/usr/sbin:/cm/local/apps/environment-modules/4.2.1/bin

Thank you for the support.
Marco
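
One thing visible in the output above: the working libcuda.so sits in a stubs/ subdirectory, while LD_LIBRARY_PATH lists only its parent lib directory. A minimal sketch for checking which directories on a colon-separated path actually contain libcuda.so follows; the helper name list_libcuda is hypothetical, and whether CMake consults these directories at all depends on the build setup.

```shell
# Print every libcuda.so found directly inside the directories of a
# colon-separated path list.
# Usage: list_libcuda "$LD_LIBRARY_PATH"
# (list_libcuda is a hypothetical helper name.)
list_libcuda() {
    old_ifs="$IFS"
    IFS=:                        # split the argument on colons
    for dir in $1; do
        [ -n "$dir" ] && [ -e "$dir/libcuda.so" ] && echo "$dir/libcuda.so"
    done
    IFS="$old_ifs"
    return 0
}

list_libcuda "${LD_LIBRARY_PATH:-}"
```

If this prints nothing, no directory on the path holds the library directly, which would be consistent with having to pass the stub file by hand.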

akohlmey · January 14, 2022, 2:15pm

It is extremely difficult to diagnose your issues remotely. I suggest you Google for “cmake findcuda.cmake” and see what CMake uses to locate the CUDA toolkit settings. It works properly on all machines that I am compiling on, and your original egl.h error for OpenCL also hints that there is something inconsistent in the setup of the machine you are compiling on. I did another check with a CentOS 7 container and did not see any reference to that header when compiling the OpenCL loader library.
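
CMake's FindCUDA module documents a CUDA_TOOLKIT_ROOT_DIR variable for pointing it at a specific toolkit. As a rough sketch, the toolkit root can be derived from the location of nvcc and passed along explicitly; the helper name toolkit_root is hypothetical, and it assumes the conventional <root>/bin/nvcc layout. The configure line is only printed, not executed.

```shell
# Derive the toolkit root from the nvcc location: <root>/bin/nvcc -> <root>
# (toolkit_root is a hypothetical helper name; assumes the usual bin/ layout.)
toolkit_root() {
    nvcc_path="$1"
    dirname "$(dirname "$nvcc_path")"
}

if nvcc_bin=$(command -v nvcc); then
    root=$(toolkit_root "$nvcc_bin")
    # Print the configure line for review instead of running it.
    echo "cmake -D CUDA_TOOLKIT_ROOT_DIR=$root -D PKG_GPU=on -D GPU_API=cuda ../cmake"
else
    echo "nvcc is not on PATH"
fi
```

Comparing the printed root with the CUDA_HOME value shown earlier is a cheap consistency check before digging into the FindCUDA source.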