Segmentation fault: address not mapped to object at address 0xc2cfb87c

Dear all,
I have run into a segmentation fault. I don’t know whether it is caused by a lack of RAM on the Linux system or by the force field file not being the right one.
I’m grateful for any help. :smile:

Here are some details:

input files

# 3D LAMMPS simulation of oxygen and hydrogen combustion
units           real
dimension       3
boundary        p p p
neighbor 2.0 bin
neigh_modify	every 1 delay 0 check yes

# Set atom style and read data files
atom_style      full
read_data 6.data

# Set potential
pair_style   reaxff NULL 
pair_coeff      * * CHOFeAlNiCuSCrXNCl.ff  C H O Fe N
fix  8  all   qeq/reax 1 0.0 10.0 1e-6 reaxff

region mobile block INF INF INF INF 13 87 units box
group mobile region mobile
group C type 1
group H type 2
group O type 3
group N type 4
region boundary block INF INF INF INF 0 13 units box
group boundary region boundary
group Fe_boundary type 5
group O_boundary intersect boundary O
group O_boundary type 6

velocity boundary set 0.0 0.0 0.0
fix 5 Fe_boundary setforce 0.0 0.0 0.0
fix 6 O_boundary setforce 0.0 0.0 0.0

min_style cg
minimize 1.0e-4 1.0e-6 100 1000
reset_timestep 0

fix  9    all nvt  temp 1500.0 1500.0 100
fix  10  all reax/c/species 1 1 100 species.out element C H O Fe N
fix  11  all reaxff/bonds 1000 bonds.reaxff

# Set up simulation
timestep        0.1
thermo          1000
thermo_style    custom step temp press pe ke etotal cpu time
thermo_modify flush yes
dump         1 all custom 1000 dump.lammpstrj id mol type q x y z

restart 500000 restart.equil
run  5000000

error message

srun: ROUTE: split_hostlist: hl=b01r3n[0203-0204] tree_width 0
LAMMPS (3 Aug 2022)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
  using 1 OpenMP thread(s) per MPI task
Reading data file ...
  orthogonal box = (0 0 0) to (33.7759 33.7759 112.7503)
  2 by 3 by 8 MPI processor grid
  reading atoms ...
  1252 atoms
Finding 1-2 1-3 1-4 neighbors ...
  special bond factors lj:    0        0        0       
  special bond factors coul:  0        0        0       
     0 = max # of 1-2 neighbors
     0 = max # of 1-3 neighbors
     0 = max # of 1-4 neighbors
     1 = max # of special neighbors
  special bonds CPU = 0.011 seconds
  read_data CPU = 0.085 seconds
WARNING: Changed valency_val to valency_boc for X (src/REAXFF/reaxff_ffield.cpp:296)
576 atoms in group mobile
80 atoms in group C
360 atoms in group H
484 atoms in group O
288 atoms in group N
672 atoms in group boundary
40 atoms in group Fe_boundary
384 atoms in group O_boundary
384 atoms in group O_boundary

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:
- pair reaxff command:
- fix qeq/reaxff command:
The log file lists these citations in BibTeX format.

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Neighbor list info ...
  update every 1 steps, delay 0 steps, check yes
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 12
  ghost atom cutoff = 12
  binsize = 6, bins = 6 6 19
  2 neighbor lists, perpetual/occasional/extra = 2 0 0
  (1) pair reaxff, perpetual
      attributes: half, newton off, ghost
      pair build: half/bin/newtoff/ghost
      stencil: full/ghost/bin/3d
      bin: standard
  (2) fix qeq/reax, perpetual, copy from (1)
      attributes: half, newton off
      pair build: copy
      stencil: none
      bin: none
Setting up cg style minimization ...
  Unit style    : real
  Current step  : 0
Per MPI rank memory allocation (min/avg/max) = 11.67 | 18.07 | 23.56 Mbytes
   Step          Temp          E_pair         E_mol          TotEng         Press     
         0   0             -124476.98      0             -124476.98      1567.452     
        13   0             -128062.43      0             -128062.43     -4093.4537    
Loop time of 0.974932 on 48 procs for 13 steps with 1252 atoms

98.5% CPU use with 48 MPI tasks x 1 OpenMP threads

Minimization stats:
  Stopping criterion = energy tolerance
  Energy initial, next-to-last, final = 
     -124476.980803798  -128057.742224055  -128062.430651954
  Force two-norm initial, final = 2431.7458 362.80936
  Force max component initial, final = 264.74512 89.090493
  Final line search alpha, max atom move = 0.0017649401 0.15723938
  Iterations, force evaluations = 13 26

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 0.011336   | 0.1172     | 0.47499    |  36.4 | 12.02
Bond    | 1.9922e-05 | 2.6383e-05 | 3.866e-05  |   0.0 |  0.00
Neigh   | 0          | 0          | 0          |   0.0 |  0.00
Comm    | 0.0028405  | 0.062982   | 0.16893    |  22.2 |  6.46
Output  | 0          | 0          | 0          |   0.0 |  0.00
Modify  | 0.41541    | 0.41968    | 0.42008    |   0.1 | 43.05
Other   |            | 0.375      |            |       | 38.47

Nlocal:        26.0833 ave         124 max           0 min
Histogram: 24 8 5 4 1 0 0 0 4 2
Nghost:        537.146 ave         947 max         126 min
Histogram: 6 2 10 2 10 0 0 6 2 10
Neighs:        3691.46 ave       25325 max           0 min
Histogram: 37 5 0 0 0 0 0 0 4 2

Total # of neighbors = 177190
Ave neighs/atom = 141.52556
Ave special neighs/atom = 0
Neighbor list builds = 0
Dangerous builds = 0
Setting up Verlet run ...
  Unit style    : real
  Current step  : 0
  Time step     : 0.1
Per MPI rank memory allocation (min/avg/max) = 23.54 | 30.79 | 36.44 Mbytes
...
     72000   1486.503      -3404.3489     -124513.62      5543.16       -118970.46      1496.1602      7200
     73000   1485.241      -4173.5483     -124675.55      5538.4537     -119137.1       1516.427       7300         
     74000   1506.7218     -3471.485      -124643.93      5618.5558     -119025.37      1537.69        7400         
     75000   1480.0038     -3514.2526     -124364.7       5518.9243     -118845.78      1558.8846      7500         
     76000   1551.5321     -3857.8025     -124440.07      5785.653      -118654.42      1579.7286      7600         
     77000   1502.3437     -3884.5452     -124329.25      5602.2296     -118727.02      1600.5859      7700         
     78000   1472.3741     -4372.4408     -124384.48      5490.4731     -118894.01      1622.7955      7800         
     79000   1436.3456     -3704.0996     -124189.29      5356.1232     -118833.16      1644.811       7900         
     80000   1505.563      -3419.6018     -124377.02      5614.2344     -118762.78      1666.3343      8000         
     81000   1533.7328     -3532.3972     -124342.17      5719.2796     -118622.89      1688.3002      8100         
     82000   1489.3653     -3821.754      -124287.71      5553.8333     -118733.88      1709.5499      8200         
[b01r3n0203:14848:0:14848] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xc2cfb87c)
==== backtrace (tid:  14848) ====
 0 0x000000000004d455 ucs_debug_print_backtrace()  ???:0
 1 0x00000000011c090b _INTERNALbecf47cc::ReaxFF::Validate_Lists()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/REAXFF/reaxff_forces.cpp:112
 2 0x00000000011c090b _INTERNALbecf47cc::ReaxFF::Validate_Lists()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/REAXFF/reaxff_forces.cpp:112
 3 0x00000000011c0681 ReaxFF::Compute_Forces()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/REAXFF/reaxff_forces.cpp:374
 4 0x00000000011c0681 ReaxFF::Compute_Forces()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/REAXFF/reaxff_forces.cpp:377
 5 0x00000000011ad87a LAMMPS_NS::PairReaxFF::compute()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/REAXFF/pair_reaxff.cpp:482
 6 0x00000000009ff3f2 LAMMPS_NS::Verlet::run()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/verlet.cpp:316
 7 0x000000000098deec LAMMPS_NS::Run::command()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/run.cpp:176
 8 0x0000000000791748 LAMMPS_NS::Input::execute_command()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/input.cpp:845
 9 0x000000000078eb0b LAMMPS_NS::Input::file()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/input.cpp:301
10 0x000000000040416f main()  /work/home/jsyadmin/L4/wys/LAMMPS/shangcheng/intel/lammps-3Aug2022_install_intel/source/src/main.cpp:98
11 0x00000000000223d5 __libc_start_main()  ???:0
12 0x0000000000404029 _start()  ???:0
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 14842 RUNNING AT b01r3n0203
=   KILLED BY SIGNAL: 9 (Killed)

Thank you in advance. :grin:

  • Have you visualized the trajectory?
  • Have you researched other discussions on problems with ReaxFF in the forum and mailing list archive?
  • Have you tried the latest LAMMPS release (which has two years’ worth of bugfixes compared to the one you are using)?
  • Have you tried using the KOKKOS variant of ReaxFF (no GPU needed, if compiled accordingly) which is known to be more reliable?
  • Have you tried breaking your run into sections, i.e. using multiple run commands with fewer steps (see the sketch after this list)?
  • Have you tried running without fix reaxff/species and fix reaxff/bonds?
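
A minimal sketch of what splitting the run could look like, assuming everything above the run command stays as in the input file (the 500000-step section length is an arbitrary choice):

# hypothetical replacement for the single "run 5000000"
restart 500000 restart.equil
run 500000
run 500000
# ... repeated until the full 5000000 steps are reached; a crash
# then loses at most one section and is easier to localize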

Thank you very much for your patient and professional advice. I’ll answer the points you raised as follows:

  • I have visualized the trajectory in OVITO, and it seems fine.
  • I have also browsed the forum for answers to this kind of problem; most of the advice was that the node’s memory was full, so I switched to a single-core run on Windows, which seems to work reliably now, even if it is slow:
mpiexec -np 1 lmp -in in.--
  • I haven’t used the latest version of LAMMPS; we were taught that older versions of LAMMPS were probably more stable, so I didn’t pay much attention to updates.
  • As for the KOKKOS version of ReaxFF, I don’t know much about it yet; I’ll read up on it in the future.
  • I’ve tried reducing the timestep from 0.1 fs to 0.01 fs, but that doesn’t seem to help much.
  • I haven’t tried running without fix reaxff/species and fix reaxff/bonds, because I’m focused on the species output of the combustion of ammonia and ethanol on Fe3O4, and I’m not sure the run would still give me what I need. But I’ll try that later.
  • Limited by my computing resources, I usually test with the Windows version of LAMMPS first; only if the input seems to work do I run it on the supercomputer’s Linux system.

Thanks again for your valuable advice. I hope I’m not interrupting anything; I’ll keep practicing and try to add to my knowledge.

It is very unlikely that a segmentation fault is caused by an out-of-memory scenario these days, and the fact that you can run with a single process confirms that. This rather hints at a design choice of the original ReaxFF code (in C): it made guesses about how large certain arrays could become, at most, during a run. However, this assumption is only valid for pre-equilibrated bulk systems. When running in parallel and starting from a non-equilibrated configuration, the changes can be larger than that, and they become relatively larger the more processes are used in parallel. This is where the KOKKOS variant is preferable, because it uses a more robust memory management approach.

That statement is nonsense. If anything, LAMMPS has become better at spotting situations where there is memory corruption or where invalid settings are used, and thus it will terminate with an error instead of continuing and producing bogus results.
Making such statements without providing the LAMMPS developers with information about how more recent versions of LAMMPS are supposedly less stable is also very bad practice.

If the cause is the change of geometry, as I am speculating above, then reducing the timestep will only delay the segfault: it will simply take 10x more timesteps to reach a situation where the geometry change requires data storage beyond the initially guessed thresholds.


I highly suggest using the KOKKOS package for ReaxFF; it works in serial on CPUs too.
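
A minimal sketch of how to invoke it, assuming a LAMMPS binary built with the KOKKOS package (a CPU-only build is sufficient); the input script itself should not need any changes:

mpiexec -np 4 lmp -k on -sf kk -in in.combustion
# -k on activates KOKKOS, and -sf kk appends the /kk suffix so that
# pair_style reaxff and fix qeq/reaxff are switched to their KOKKOS
# variants automatically; "in.combustion" is a placeholder file name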


A general comment – yes, computer software does generally degenerate with time, as Wirth’s Law states.

But that’s because most computer software is the hard way to do easy things. Microsoft Word is a gigabyte-sized typewriter; it’s not particularly obvious how you can make a better typewriter. (After forty years, presumably wandering in the wilderness, they now have a shortcut for pasting plaintext.)

Scientific software, on the other hand, is the easy way to do hard things. MD packages are the only way to numerically integrate equations of motion to sample the dynamics of a particle system of any sensible size. Simulation fidelity to physical models and particle-steps per CPU-hour (through strong and weak scaling) are objective metrics of improvement.

As such, for most scientific software, the latest version is usually the fastest and most stable version (yes, even when you Get Really Ornery Mastering Advanced Chemical Simulations). LAMMPS in particular houses many niche techniques for which the initial code is generally robust and performant, but as the technique gains wider acceptance, more people test it for more diverse systems and reveal edge case bugs that are then patched. That’s when using the latest version is particularly important to make use of these improvements.

Indeed, if you or your seniors have concrete cases of a new version of LAMMPS “breaking” older simulations, you should let us know. (Please do the same and inform developers of bugs for any scientific software you use!!) Software correctness is fundamental to reproducible computational science. A minus turned into a plus can warrant paper retractions. Reporting software bugs is vital community service – sadly, it is not rewarded with papers and citations, but that’s academia’s fault, not ours.


I do appreciate your timely help. Everything is looking up. :smiling_face_with_three_hearts: :smiling_face_with_three_hearts:

I’m very much indebted to you for your friendly advice. This is very inspiring to me. :smiling_face_with_three_hearts: :smiling_face_with_three_hearts:

In order to help other LAMMPS users like you, could you please elaborate on what the problem was and how you solved or worked around it?

The forum is archived (including the discussions from the mailing list we used before that), so that people can search it for help with all those little things that are too specific to be written into the manual.

Thanks in advance.