Abnormal Force Drop Between Two run Commands in Self-Compiled LAMMPS 20240829 – Suspected Compilation Issue

I compiled LAMMPS 20240829 Update 4 on my local HPC environment.

The simulation setup is similar to a modified fix smd approach:

  • A spring-like loading method is used

  • A plate is pulled at constant loading velocity

  • The force is calculated via variables

  • Force is applied using fix addforce

My script controlling the loading looks like this:

variable l0 equal c_1[1]
run 0
print "${l0}" file l0.txt screen no
variable cl0 equal ${l0}
variable dx equal " c_1[1] - (v_cl0 - (step-3000000)*1e-08*0.1) "
variable pull_force equal -110000/240*v_dx
fix pull balattice addforce v_pull_force 0.0 0.0
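
For readers who want to check the loading law outside LAMMPS, here is a minimal Python sketch of what these variables appear to compute. It assumes the factors after `(step-3000000)` are multiplied, that 1e-08 is the timestep and 0.1 the loading velocity; the function and constant names are hypothetical and not part of the posted input.

```python
# Sketch of the spring-type loading law from the input script.
# All names are hypothetical; the constants are copied from the script.
STEP0 = 3_000_000         # step at which the loading starts
DT = 1e-08                # assumed to be the timestep
V_LOAD = 0.1              # assumed to be the loading velocity
K_SPRING = 110000 / 240   # prefactor used in v_pull_force


def spring_extension(plate_x, plate_x0, step):
    """dx = c_1[1] - (v_cl0 - (step - STEP0) * DT * V_LOAD)."""
    anchor = plate_x0 - (step - STEP0) * DT * V_LOAD
    return plate_x - anchor


def pull_force(plate_x, plate_x0, step):
    """pull_force = -K_SPRING * dx, the x component passed to fix addforce."""
    return -K_SPRING * spring_extension(plate_x, plate_x0, step)
```

In this form the force depends only on the current step and the plate position, so nothing in the expression itself refers to where one run command ends and the next begins.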

Observed Phenomenon

At the transition between:

run 512100 
run 56900 

There is a clear sudden drop (discontinuity) in the shear force.

Important observations:

  • This force drop consistently appears in my self-compiled version.

  • The same input script executed with the platform-compiled LAMMPS version does NOT show this artifact.

  • The model, input script, and parameters are identical.

Compiler Information – Self-Compiled Version

Large-scale Atomic/Molecular Massively Parallel Simulator - 29 Aug 2024 - Update 4

Usage example: lmp -var t 300 -echo screen -in in.alloy

List of command line options supported by this LAMMPS executable:

-echo none/screen/log/both : echoing of input script (-e)
-help : print this help message (-h)
-in none/filename : read input from file or stdin (default) (-i)
-kokkos on/off … : turn KOKKOS mode on or off (-k)
-log none/filename : where to send log output (-l)
-mdi ‘’ : pass flags to the MolSSI Driver Interface
-mpicolor color : which exe in a multi-exe mpirun cmd (-m)
-cite : select citation reminder style (-c)
-nocite : disable citation reminder (-nc)
-nonbuf : disable screen/logfile buffering (-nb)
-package style … : invoke package command (-pk)
-partition size1 size2 … : assign partition sizes (-p)
-plog basename : basename for partition logs (-pl)
-pscreen basename : basename for partition screens (-ps)
-restart2data rfile dfile … : convert restart to data file (-r2data)
-restart2dump rfile dgroup dstyle dfile …
: convert restart to dump file (-r2dump)
-restart2info rfile : print info about restart rfile (-r2info)
-reorder topology-specs : processor reordering (-r)
-screen none/filename : where to send screen output (-sc)
-skiprun : skip loops in run and minimize (-sr)
-suffix gpu/intel/kk/opt/omp: style suffix to apply (-sf)
-var varname value : set index style variable (-v)

OS: Linux “Anolis OS 8.6” 4.19.91-26.an8.x86_64 x86_64

Compiler: GNU C++ 8.5.0 20210514 (Anolis 8.5.0-10.0.1) with OpenMP 4.5
C++ standard: C++11
MPI v3.1: Intel(R) MPI Library 2021.11 for Linux* OS

Accelerator configuration:

OPENMP package API: OpenMP
OPENMP package precision: double
OpenMP standard: OpenMP 4.5

FFT information:

FFT precision = double
FFT engine = mpiFFT
FFT library = FFTW3

Active compile time flags:

-DLAMMPS_GZIP
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint): 32-bit
sizeof(bigint): 64-bit

Available compression formats:

Extension: .gz Command: gzip
Extension: .bz2 Command: bzip2
Extension: .zst Command: zstd
Extension: .xz Command: xz
Extension: .lzma Command: xz
Extension: .lz4 Command: lz4

Installed packages:

MOLECULE KSPACE EXTRA-COMMAND EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX GRANULAR
MANYBODY OPENMP OPT RIGID

Compiler Information – Working Version (Platform Build)

Large-scale Atomic/Molecular Massively Parallel Simulator - 29 Aug 2024

Usage example: lmp -var t 300 -echo screen -in in.alloy

List of command line options supported by this LAMMPS executable:

-echo none/screen/log/both : echoing of input script (-e)
-help : print this help message (-h)
-in none/filename : read input from file or stdin (default) (-i)
-kokkos on/off … : turn KOKKOS mode on or off (-k)
-log none/filename : where to send log output (-l)
-mdi ‘’ : pass flags to the MolSSI Driver Interface
-mpicolor color : which exe in a multi-exe mpirun cmd (-m)
-cite : select citation reminder style (-c)
-nocite : disable citation reminder (-nc)
-nonbuf : disable screen/logfile buffering (-nb)
-package style … : invoke package command (-pk)
-partition size1 size2 … : assign partition sizes (-p)
-plog basename : basename for partition logs (-pl)
-pscreen basename : basename for partition screens (-ps)
-restart2data rfile dfile … : convert restart to data file (-r2data)
-restart2dump rfile dgroup dstyle dfile …
: convert restart to dump file (-r2dump)
-restart2info rfile : print info about restart rfile (-r2info)
-reorder topology-specs : processor reordering (-r)
-screen none/filename : where to send screen output (-sc)
-skiprun : skip loops in run and minimize (-sr)
-suffix gpu/intel/kk/opt/omp: style suffix to apply (-sf)
-var varname value : set index style variable (-v)

OS: Linux “Rocky Linux 8.10 (Green Obsidian)” 4.18.0-553.el8_10.x86_64 x86_64

Compiler: Intel LLVM C++ 202402.0 / Intel(R) oneAPI DPC++/C++ Compiler 2024.2.1 (2024.2.1.20240711) with OpenMP 5.1
C++ standard: C++11
MPI v3.1: Intel(R) MPI Library 2021.13 for Linux* OS

Accelerator configuration:

OPENMP package API: OpenMP
OPENMP package precision: double
OpenMP standard: OpenMP 5.1

FFT information:

FFT precision = double
FFT engine = mpiFFT
FFT library = FFTW3

Active compile time flags:

-DLAMMPS_GZIP
-DLAMMPS_CURL
-DLAMMPS_SMALLBIG
sizeof(smallint): 32-bit
sizeof(imageint): 32-bit
sizeof(tagint): 32-bit
sizeof(bigint): 64-bit

Available compression formats:

Extension: .gz Command: gzip
Extension: .bz2 Command: bzip2
Extension: .zst Command: zstd
Extension: .xz Command: xz
Extension: .lzma Command: xz
Extension: .lz4 Command: lz4

Installed packages:

AMOEBA ASPHERE BOCS BODY BPM BROWNIAN CG-DNA CG-SPICA CLASS2 COLLOID COLVARS
CORESHELL DIELECTRIC DIFFRACTION DIPOLE DPD-BASIC DPD-MESO DPD-REACT
DPD-SMOOTH DRUDE EFF EXTRA-COMMAND EXTRA-COMPUTE EXTRA-DUMP EXTRA-FIX
EXTRA-MOLECULE EXTRA-PAIR FEP GRANULAR INTERLAYER KSPACE MANYBODY MC MEAM MISC
ML-IAP ML-POD ML-SNAP ML-UF3 MOFFF MOLECULE OPENMP OPT ORIENT PERI PHONON
PLUGIN POEMS QEQ REACTION REAXFF REPLICA RIGID SHOCK SPH SPIN SRD TALLY UEF
YAFF

Question

Since the model, input, and parameters are identical but the results differ significantly, and the artifact only appears in my self-compiled version:

What compilation-related issues could cause this type of force discontinuity between successive run commands?

I would appreciate guidance on what to check.

Is this dip deterministic, i.e. if you run the simulation again, does it have exactly the same magnitude?

You haven’t posted your files (see Please Read This First: Guidelines and Suggestions for posting LAMMPS questions to enhance your chances to get answers), so I cannot reproduce your simulation. The only other advice I have is to output the value of v_pull_force through the thermo_style custom command and check whether it changes abruptly between runs.
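
One way to do that check after the fact is to read the thermo blocks back from the log file. The following is a rough Python sketch, assuming the default one-line thermo format where each run prints a header line starting with `Step` and ends with a `Loop time of` line; the function names are hypothetical.

```python
def read_thermo_blocks(lines):
    """Return one (header, rows) pair per run found in LAMMPS log lines."""
    blocks, header, rows = [], None, []
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "Step":
            header, rows = fields, []          # a new thermo block starts
        elif header is not None and line.lstrip().startswith("Loop time of"):
            blocks.append((header, rows))      # the block ends with the run
            header = None
        elif header is not None and len(fields) == len(header):
            try:
                rows.append([float(f) for f in fields])
            except ValueError:
                pass                           # skip interleaved warnings


    return blocks


def boundary_jumps(blocks, column):
    """Change of `column` from the last row of one run to the first row of the next."""
    jumps = []
    for (h0, r0), (h1, r1) in zip(blocks, blocks[1:]):
        if r0 and r1:
            jumps.append(r1[0][h1.index(column)] - r0[-1][h0.index(column)])
    return jumps
```

Typically the last thermo line of one run and the first thermo line of the next refer to the same timestep, so a value such as `boundary_jumps(blocks, "v_pull_force")` that is clearly non-zero would point at the run boundary itself.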

Yes, after the run command, a stress/shear drop appears in the results. The behavior is reproducible within the same simulation under identical conditions, and it can also be observed across different confining pressures and shear rates. I have marked the location of the first run step in the input file to clearly indicate where the drop occurs.

Based on these observations, I believe this issue is likely related to the compiled version of the code rather than a physical or numerical setup difference, since the same input files do not exhibit this phenomenon when executed with builds compiled on this and other platforms.

I have attached the reproducible input and run files for reference. Additionally, the restart files need to be placed inside the restart directory to ensure the simulation can properly read and resume from them. Please let me know if you would like me to provide additional output data or diagnostic information to help trace the source of the discrepancy.

restart3000000.bishear (4.5 MB)
steady_slide.bishear (3.8 KB)

How did you get such “odd” numbers for the run?
They are different from the numbers in the input you posted, too.

Your input is too complex and convoluted to check.

What is the property from the input that is plotted in your graph?
There is no mention of a “mu” in your input.

Can you be certain that the “issue” is caused by the compiler, and that it does not just happen by chance to show up in one case and not in the others?

Please note that your two LAMMPS versions differ and they are outdated, too.

The best approach would be to set up a very simple test system with only a few hundred to a few thousand particles, remove everything that is not necessary to reproduce the issue, and make sure it reproduces within a few thousand MD steps.

As shown in the script, I inserted some modifications to the dump output frequency between the two run commands. In the final stage, I would like to obtain more detailed output. I have now corrected the run step numbers to match my submission file.

The property plotted in the figure is the shear force of the loading plate divided by the normal force. In the fix output, it corresponds to f_brlc[1] / f_brlc[3] from fix ave/data. The variable lx represents the displacement of the loading point, which corresponds to

variable dx equal "c_1[1] - (v_cl0 - (step-3000000) * 1.1e-08 * 0.3)"

where 0.3 is the loading velocity [m/s]. The total loading distance is fixed as 0.0002 m.

I simplified the input file to retain only the essential information. The model can be understood as a spring-driven granular shear layer, loaded at a velocity of 0.3 m/s. The average particle diameter is 0.0001 m, and the total loading distance is 0.002 m.

The friction coefficient, μ, is calculated by dividing the shear force by the normal force. I have also provided an additional post-processing Python script to obtain the results.
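
As a rough illustration of that post-processing step (this is not the attached plot.py, and the function name is hypothetical), the ratio can be computed per output sample like this:

```python
def friction_coefficient(shear_force, normal_force):
    """Element-wise mu = F_shear / F_normal for two equally long sequences."""
    if len(shear_force) != len(normal_force):
        raise ValueError("shear and normal force series differ in length")
    # Return None where the normal force is zero so the gap stays visible.
    return [fs / fn if fn != 0 else None
            for fs, fn in zip(shear_force, normal_force)]
```

Here `shear_force` and `normal_force` would hold the two force components read from the fix output, for example the columns corresponding to f_brlc[1] and f_brlc[3].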

Using 12 CPU cores, the total computation time for this case is approximately 15 minutes.

My point is that, from a physical point of view, there should not be any transition associated with the two run commands, since the only change between them is the output settings. Therefore, the shear curve is expected to remain continuous and smooth.

However, this abrupt drop is particularly evident in the averaged results over seven different initial samples, as shown in the figures I submitted.

My initial guess was that this issue might be caused by floating-point errors. However, changing the compiler optimization level from -O3 to -O2, as well as compiling with the FFTW3 library, did not lead to any noticeable change in the results.

For the same input file, when I use a version compiled by the supercomputing support team, this abrupt drop does not appear.
steady_slide.bishear (2.3 KB)
restart3000000.bishear (4.5 MB)
plot.py (2.3 KB)

This phenomenon becomes even more pronounced at lower loading velocities, for example when the system is loaded at 0.1 m/s.
Output result of the self-compiled version:


Output result of the version compiled by the platform support team:

This is far too long for proper debugging. Ideally, you prepare a case that reproduces the issue in seconds. Some advanced debugging tools slow down execution by a factor of 100 or more.
Also binary restart files are not always portable.

If you want me to take a serious look, you need to do what I ask. What you are doing is just poking into the dark and hoping to get lucky.

I have prepared a minimal test case that clearly demonstrates the issue I am encountering.

First, I shear the granular layer until it reaches a steady state. From that point, I restart the simulation and perform three different calculations:

  1. Self-compiled version, two separate runs (10000 steps each).
    The simulation is executed with two consecutive run 10000 commands.
    The output is shown below.


  2. Self-compiled version, single run (20000 steps total).
    The simulation is executed with one run 20000 command.
    The output is shown below.


  3. Platform-provided version, two separate runs (10000 steps each).
    The simulation is executed with two consecutive run 10000 commands using the platform-compiled executable.
    The output is shown below.


I have added the thickness of the sheared granular layer as an additional output to further illustrate the results.

From the results, it can be seen that:

  • For Run 1, both the restart and the second run command trigger a noticeable drop in shear force, resulting in two force drops.
  • For Run 2, the restart causes an initial drop in shear force.
  • For Run 3, neither the restart nor the additional run command leads to any physical change in the results.
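
To put a number on the drops described in the three cases above, one option is to compare the change across the restart or run boundary with the typical sample-to-sample change elsewhere in the series. This is only a sketch with a hypothetical function name; `values` would be the shear force (or μ) at consecutive output steps and `boundary` the index of the first sample after the boundary.

```python
from statistics import median


def jump_ratio(values, boundary):
    """Change across `boundary` divided by the median absolute change elsewhere."""
    if not 1 <= boundary < len(values):
        raise ValueError("boundary must index a sample that has a predecessor")
    diffs = [b - a for a, b in zip(values, values[1:])]
    jump = diffs[boundary - 1]      # values[boundary] - values[boundary - 1]
    others = [abs(d) for i, d in enumerate(diffs) if i != boundary - 1]
    if not others:
        raise ValueError("need at least three samples")
    typical = median(others)
    if typical == 0:
        raise ValueError("series is flat away from the boundary")
    return jump / typical
```

A ratio of order one means the boundary looks like any other output interval, while a large negative ratio corresponds to a drop of the kind reported for the self-compiled runs.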

Additionally, I would like to mention that my self-compiled version was built using a mixed toolchain (GNU compiler together with Intel MPI). I am not sure whether the issue could be related to this mixed compilation setup.

plot.py (2.6 KB)
steady_slide.bishear (2.4 KB)
restart3470900.bishear (4.7 MB)

Sorry, but this is all useless to me. Remember that I know nothing about your workflows, your research, or what you do differently between the different “runs”. I cannot watch over your shoulder and thus cannot see what you do and how. Thus you have to explain everything to me.

I also already mentioned that binary restart files are useless for testing with a different LAMMPS version.

You talk about 3 different runs, but there is only 1 input with only 1 run command.

I don’t have the time and patience to go over this so many times until you produce something that I can work with. I am not asking for something very complex or difficult, I just expected you to apply some common sense. So we are at a dead end here.

The only other recommendation I can make is to download the platform-independent serial Linux binary from: https://github.com/lammps/lammps/releases/download/patch_11Feb2026/lammps-linux-x86_64-11Feb2026.tar.gz
and test with that. It is compiled with a GCC compiler that is known to work correctly and should give you a reliable reference executable. If you’d rather use a binary matching your LAMMPS version, you can download it from: https://github.com/lammps/lammps/releases/download/stable_29Aug2024_update4/lammps-linux-x86_64-29Aug2024_update4.tar.gz

Thank you for your patient response — the issue has now been resolved.

As I mentioned earlier, I did not observe this non-physical sudden drop when running the same LAMMPS version on other platforms. After carefully rechecking, I realized that this was due to my own oversight, which also explains why I initially focused on the compilation options.

Based on my current tests, I summarize the results as follows:

For the 29 Aug 2024 and 29 Aug 2024 (update 4) versions, both GNU and Intel compilers reproduce the shear force discontinuity associated with the restart.

However, with the 4 Feb 2025 version, the results remain continuous and no such discontinuity appears.

Therefore, it seems that the difference is likely caused by changes in the source code between these versions.
