LAMMPS produces memory error with ReaxFF and OpenMP

When I try to run the attached LAMMPS input with:
lmp -sf omp -pk omp 12 -in input.dat
I get a segmentation fault (LAMMPS 28Mar2023). Without OpenMP, or when running on a GPU, the calculation completes without problems.

There is nothing attached to your message.

Please try the KOKKOS package instead of the OPENMP package for OpenMP threading.
If it worked for a GPU, it will likely also work for threads.

I can’t attach files.

You can put the files on Dropbox, Google Drive, Microsoft OneDrive or similar and provide a link.

If there is a bug in the OpenMP ReaxFF, then we certainly want to fix it. That said, you may get better performance and better stability with the Kokkos version of ReaxFF on CPUs, using either the OpenMP or the Serial backend.
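For example, an invocation along these lines (a sketch, assuming your LAMMPS binary was built with the KOKKOS package and its OpenMP backend; the thread count and input file name are taken from your original command):

lmp -k on t 12 -sf kk -in input.dat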

Thanks for posting the input.

Unfortunately, I cannot reproduce any segfaults with this input deck.
I’ve run it on two different machines with thread counts between 1 and 20, and all runs completed without a segfault.

What are your compilation settings? Can you post the first part of the output from lmp -h?

Here is what I have tested with:

Large-scale Atomic/Molecular Massively Parallel Simulator - 28 Mar 2023 - Development
Git info (develop / patch_28Mar2023-226-g76afaef)

Usage example: /tmp/akohlmey/build-lammps/lmp -var t 300 -echo screen -in in.alloy

[...]

OS: Linux "CentOS Linux 7 (Core)" 3.10.0-1160.53.1.el7.x86_64 x86_64

Compiler: GNU C++ 11.2.0 with OpenMP 4.5
C++ standard: C++11
MPI v1.0: LAMMPS MPI STUBS for LAMMPS version 28 Mar 2023

Accelerator configuration:

OPENMP package API: OpenMP
OPENMP package precision: double

Active compile time flags:

-DLAMMPS_GZIP
-DLAMMPS_PNG
-DLAMMPS_JPEG
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint):   32-bit
sizeof(bigint):   64-bit

Available compression formats:

Extension: .gz     Command: gzip
Extension: .bz2    Command: bzip2
Extension: .zst    Command: zstd
Extension: .xz     Command: xz
Extension: .lzma   Command: xz
Extension: .lz4    Command: lz4


Installed packages:

KSPACE MANYBODY MOLECULE OPENMP REAXFF RIGID 

And:


Large-scale Atomic/Molecular Massively Parallel Simulator - 28 Mar 2023 - Development
Git info (develop / patch_28Mar2023-226-g76afaefe45-modified)

Usage example: ../lmp -var t 300 -echo screen -in in.alloy

[...]

OS: Linux "Fedora Linux 36 (Thirty Six)" 6.2.8-100.fc36.x86_64 x86_64

Compiler: GNU C++ 12.2.1 20221121 (Red Hat 12.2.1-4) with OpenMP 4.5
C++ standard: C++11
MPI v3.1: MPICH Version:	3.4.3
MPICH Release date:	Thu Dec 16 11:20:57 CST 2021
MPICH ABI:	13:12:1

Accelerator configuration:

GPU package API: CUDA
GPU package precision: double
OPENMP package API: OpenMP
OPENMP package precision: double

Compatible GPU present: yes

Active compile time flags:

-DLAMMPS_GZIP
-DLAMMPS_PNG
-DLAMMPS_JPEG
-DLAMMPS_FFMPEG
-DLAMMPS_EXCEPTIONS
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint):   32-bit
sizeof(bigint):   64-bit

Available compression formats:

Extension: .gz     Command: gzip
Extension: .bz2    Command: bzip2
Extension: .zst    Command: zstd
Extension: .xz     Command: xz
Extension: .lzma   Command: xz
Extension: .lz4    Command: lz4

Installed packages:

AMOEBA ASPHERE AWPMD BOCS BODY BPM BROWNIAN CG-DNA CG-SPICA CLASS2 COLLOID 
COLVARS COMPRESS CORESHELL DIELECTRIC DIFFRACTION DIPOLE DPD-BASIC DPD-MESO 
DPD-REACT DPD-SMOOTH DRUDE EFF ELECTRODE EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX 
EXTRA-MOLECULE EXTRA-PAIR FEP GPU GRANULAR INTERLAYER KSPACE LATBOLTZ LEPTON 
MACHDYN MANYBODY MC MDI MEAM MESONT MISC ML-HDNNP ML-IAP ML-PACE ML-POD 
ML-RANN ML-SNAP MOFFF MOLECULE MOLFILE MPIIO MSCG OPENMP OPT ORIENT PERI 
PHONON PLUGIN PLUMED POEMS PTM PYTHON QEQ QTB REACTION REAXFF REPLICA RIGID 
SHOCK SMTBQ SPH SPIN SRD TALLY UEF VORONOI YAFF 

As Stan already mentioned, the KOKKOS version is often a better choice for ReaxFF, particularly when using a larger number of threads. The OPENMP package was developed at a time when CPUs had few (1-6) cores per socket and it is optimized for that regime. The KOKKOS acceleration offers more choices in how the work is distributed across threads, and some of those scale much better to a larger number of threads.
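For instance, a possible CPU run with the Kokkos OpenMP backend and explicit package settings (a sketch only; the neighbor-list and newton settings shown are just one common starting point and should be benchmarked for your system):

lmp -k on t 12 -sf kk -pk kokkos neigh half newton on -in input.dat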

Here is the configuration I have used:

Thanks for the config info. Nothing stands out as unusual or problematic.

Since I cannot reproduce the segfault, it is difficult for me to debug this.
There are several possible causes, but none of them are very likely.

There are a few things you can do to try to narrow down the issue.

  • compile LAMMPS with debug info included (the default build type, or -D CMAKE_BUILD_TYPE=Debug with CMake, or adding '-g' to the CCFLAGS and LINKFLAGS with the traditional make), then run under a debugger and obtain a stack trace when the segfault happens; a possible configure line is sketched after the gdb commands below:
env OMP_NUM_THREADS=12 gdb --args lmp -in input.dat -sf omp
[...]
(gdb) run
[...]
(gdb) where
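
For reference, a possible CMake configure for such a debug-enabled build (a sketch only; the package selection mirrors the minimal set from the first lmp -h listing above and the build directory name is arbitrary):

mkdir build-debug && cd build-debug
cmake -D CMAKE_BUILD_TYPE=Debug -D PKG_KSPACE=on -D PKG_MANYBODY=on -D PKG_MOLECULE=on -D PKG_OPENMP=on -D PKG_REAXFF=on -D PKG_RIGID=on ../cmake
cmake --build . -j 12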

Here is the debug output:
Thread 4 "lmp" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffecc42000 (LWP 15188)]
0x00000000028c30c0 in ReaxFF::Calculate_dCos_ThetaOMP (dvec_ji=0x7fff7713f8f0, d_ji=1.132655905643456, dvec_jk=0x7fff7713fc80, d_jk=2.054405207768419, dcos_theta_di=0x7fe4a4a6d588, dcos_theta_dj=0x7fe4a4a6d5a0, dcos_theta_dk=0x7fe4a4a6d5b8) at /tmp/lammps-28Mar2023/src/OPENMP/reaxff_valence_angles_omp.cpp:65
65 (*dcos_theta_di)[0] = dinv_jk - cdev_ji;
Missing separate debuginfos, use: zypper install libgcc_s1-debuginfo-12.2.1+git416-150000.1.5.1.x86_64 libgomp1-debuginfo-12.2.1+git416-150000.1.5.1.x86_64 libnuma1-debuginfo-2.0.14.20.g4ee5e0c-10.1.x86_64 libpython3_9-1_0-debuginfo-3.9.15-150300.4.21.1.x86_64 libstdc++6-debuginfo-12.2.1+git416-150000.1.5.1.x86_64 libz1-debuginfo-1.2.11-150000.3.36.1.x86_64 nvidia-compute-G06-debuginfo-525.105.17-lp153.6.1.x86_64
(gdb) p *dcos_theta_di
Cannot access memory at address 0x7fe4a4a6d588
(gdb)
Cannot access memory at address 0x7fe4a4a6d588
(gdb) bt
#0 0x00000000028c30c0 in ReaxFF::Calculate_dCos_ThetaOMP (dvec_ji=0x7fff7713f8f0, d_ji=1.132655905643456, dvec_jk=0x7fff7713fc80,
d_jk=2.054405207768419, dcos_theta_di=0x7fe4a4a6d588, dcos_theta_dj=0x7fe4a4a6d5a0, dcos_theta_dk=0x7fe4a4a6d5b8)
at /tmp/lammps-28Mar2023/src/OPENMP/reaxff_valence_angles_omp.cpp:65
#1 0x00000000028c492d in ReaxFF::Valence_AnglesOMP (system=0x7ffff7ba9a66, control=0x1892e2a0, data=0x18935650, workspace=0x18935250,
lists=0xb4000000b4) at /tmp/lammps-28Mar2023/src/OPENMP/reaxff_valence_angles_omp.cpp:373
#2 0x00007ffff7ba7cfe in ?? () from /usr/lib64/libgomp.so.1
#3 0x00007ffff5c786ea in start_thread () from /lib64/libpthread.so.0
#4 0x00007fffeec2ea6f in clone () from /lib64/libc.so.6

Valgrind produces:
==15307== Warning: set address range perms: large range [0x3227c040, 0x4ab3fde0) (undefined)
==15307== Warning: set address range perms: large range [0x657de040, 0x92a880c0) (undefined)
==15307== Thread 3:
==15307== Invalid write of size 8
==15307== at 0x28C30C0: ReaxFF::Calculate_dCos_ThetaOMP(double*, double, double*, double, double (*) [3], double (*) [3], double (*) [3]) (reaxff_valence_angles_omp.cpp:65)
==15307== by 0x28C492C: ReaxFF::Valence_AnglesOMP(ReaxFF::reax_system*, ReaxFF::control_params*, ReaxFF::simulation_data*, ReaxFF::storage*, ReaxFF::reax_list**) [clone ._omp_fn.0] (reaxff_valence_angles_omp.cpp:373)
==15307== by 0x195EFCFD: ??? (in /usr/lib64/libgomp.so.1.0.0)
==15307== by 0x1B51D6E9: start_thread (in /lib64/libpthread-2.31.so)
==15307== by 0x225AFA6E: clone (in /lib64/libc-2.31.so)
==15307== Address 0x646efe58 is 824,104 bytes inside an unallocated block of size 984,240 in arena "client"

and a lot more of these messages, all referring to the same unallocated block. Finally:

==15307== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==15307== Access not within mapped region at address 0x1BBD82A058
==15307== at 0x28C30C0: ReaxFF::Calculate_dCos_ThetaOMP(double*, double, double*, double, double (*) [3], double (*) [3], double (*) [3]) (reaxff_valence_angles_omp.cpp:65)
==15307== by 0x28C492C: ReaxFF::Valence_AnglesOMP(ReaxFF::reax_system*, ReaxFF::control_params*, ReaxFF::simulation_data*, ReaxFF::storage*, ReaxFF::reax_list**) [clone ._omp_fn.0] (reaxff_valence_angles_omp.cpp:373)
==15307== by 0x195EFCFD: ??? (in /usr/lib64/libgomp.so.1.0.0)
==15307== by 0x1B51D6E9: start_thread (in /lib64/libpthread-2.31.so)
==15307== by 0x225AFA6E: clone (in /lib64/libc-2.31.so)
==15307== If you believe this happened as a result of a stack
==15307== overflow in your program's main thread (unlikely but
==15307== possible), you can try to increase the size of the
==15307== main thread stack using the --main-stacksize= flag.
==15307== The main thread stack size used in this run was 8388608.

The stack size is unlimited.

What happens if you use the pre-compiled, statically linked executable?

That seems to work; at least it now runs past the point where the segmentation fault happened before.

That would explain why I cannot reproduce the segfault.

The logical conclusion is that either your compiler or your compilation settings (e.g. optimization flags) cause the multi-threaded ReaxFF code in the OPENMP folder to be miscompiled, or that you have modifications or extra packages included that are not present on my machines.
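
One way to narrow this down further would be to rebuild with a different compiler and rerun the same input (a sketch only, assuming a CMake build and that clang++ is installed; add whatever other packages your input requires):

mkdir build-clang && cd build-clang
cmake -D CMAKE_CXX_COMPILER=clang++ -D CMAKE_BUILD_TYPE=RelWithDebInfo -D PKG_OPENMP=on -D PKG_REAXFF=on ../cmake
cmake --build . -j 12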