LAMMPS stuck without any error

ljpss43 · February 2, 2023, 2:53am

ffield.Al&RDX&HMX.ReaxFF (15.5 KB)
Dear LAMMPS users:
I am running a ReaxFF MD simulation and want to analyze the products with the fix reaxff/species command. The timestep is 0.1 fs.
RUN.in (4.3 KB)

If I remove the fix reaxff/species command, the simulation runs without any problem. However, once I activate the fix reaxff/species command, the simulation gets stuck at a certain step without any error or warning. I also tried running the same input file with different numbers of processors, but the only difference is that the simulation gets stuck at a different step.
I know that the fix reaxff/species command changes the neighbor settings, so I chose conservative values: 1 10 5000 for Nevery, Nrepeat, and Nfreq, respectively.
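
For reference, a minimal sketch of which timesteps these three values select. It assumes the convention described in the fix reaxff/species documentation: the species analysis runs on timesteps that are multiples of Nfreq and averages Nrepeat bond-order samples taken every Nevery steps just before that. The function name sample_steps is hypothetical and not part of LAMMPS.

```python
def sample_steps(nevery, nrepeat, nfreq, output_step):
    """Return the timesteps whose bond orders are averaged for one output.

    Assumes the documented convention: the analysis runs on multiples of
    Nfreq and averages the Nrepeat samples taken every Nevery steps up to
    and including that step.
    """
    if nevery <= 0 or nrepeat <= 0 or nfreq <= 0:
        raise ValueError("Nevery, Nrepeat and Nfreq must be positive")
    if nfreq % nevery != 0:
        raise ValueError("Nfreq must be a multiple of Nevery")
    if nfreq <= (nrepeat - 1) * nevery:
        raise ValueError("sampling windows would overlap")
    if output_step <= 0 or output_step % nfreq != 0:
        raise ValueError("output_step must be a positive multiple of Nfreq")
    first = output_step - (nrepeat - 1) * nevery
    return list(range(first, output_step + 1, nevery))


# Settings from this post: Nevery=1, Nrepeat=10, Nfreq=5000
print(sample_steps(1, 10, 5000, 5000))
```

Under that assumption, 1 10 5000 samples bond orders on steps 4991 through 5000 and performs the species analysis on step 5000, then again on steps 9991 through 10000, and so on.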
Could anyone explain this problem? Why would the fix reaxff/species command affect whether the simulation keeps running?
HTPBsys.data (1.7 MB)

PS: I ran the simulations with both the LAMMPS/3Aug2022 and LAMMPS/23Jun2022_update2 versions, and the problem is the same.

akohlmey · February 2, 2023, 3:28am

Which step? Is it always the same step?

ljpss43 · February 2, 2023, 4:20am

Yes, exactly: it is always step 12000 with 64 processors. If I change from 64 to 48 processors, it stops at step 13000.

akohlmey · February 2, 2023, 4:27am

That is too many steps and too many processors for me to be able to reproduce and debug it on my desktop machine.

If you can run it again and get it stuck, can you then log into the compute node(s), look up the process IDs, attach the gdb debugger to the running processes with “gdb -p <PID>”, and get a few representative stack traces with “where” once inside gdb? Most are likely very similar, so I only need to see one or two of them, but the one for the “master process” (the one where lmp->comm->me == 0) may be different. This information could help build a hypothesis about what is going wrong.
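
The attach-and-trace step can also be scripted. This is only a minimal sketch, assuming gdb is installed on the compute node and that you are permitted to attach to your own processes. The helper names gdb_where_command and collect_stack_trace are hypothetical, and the process IDs still have to be looked up separately.

```python
import subprocess


def gdb_where_command(pid):
    """Build a non-interactive gdb command that prints one stack trace."""
    if int(pid) <= 0:
        raise ValueError("pid must be a positive integer")
    # -p attaches to the running process, -batch exits after the commands,
    # and -ex runs the same "where" command one would type interactively.
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "where"]


def collect_stack_trace(pid, timeout=60):
    """Run gdb against one process and return its combined output as text."""
    result = subprocess.run(
        gdb_where_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout + result.stderr
```

Calling collect_stack_trace(pid) for one or two of the stuck LAMMPS processes, plus the rank 0 process, would produce the traces asked for above as plain text that can be pasted into a reply.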