I am running a free energy calculation on the radius of gyration (Rg) of a polymer in water in LAMMPS using the COLVARS package. It is an NPT simulation with Intel acceleration, with an ABF bias acting on Rg.
I am seeing an error that seems to occur AFTER the simulation has finished running, and I don’t understand why. I have attached my simulation output. This is the final output message:
Ave neighs/atom = 373.35589
Ave special neighs/atom = 2.1235554
Neighbor list builds = 509
Dangerous builds = 0
colvars: Resetting the Collective Variables module.
Total wall time: 0:00:49
srun: error: stellar-i10n4: tasks 1-10,12,15,17-20,25,27,29,31-32,34-35,37-40,42-43,46-58,60-65,67-71,73-85,88-89,92-95: Segmentation fault (core dumped)
srun: Terminating StepId=968088.0
slurmstepd: error: *** STEP 968088.0 ON stellar-i10n4 CANCELLED AT 2023-09-19T17:37:58 ***
srun: error: stellar-i10n4: tasks 0,11,13-14,16,21-24,26,28,30,33,36,41,44-45,59,66,72,86-87,90-91: Terminated
srun: Force Terminated StepId=968088.0
You can see this in the file npt.out.
As you can see, LAMMPS has also reported the total wall time, so I assume the simulation has run its course, but it then crashes right afterwards. What could be causing this?
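To make the srun messages easier to read, here is a small Python sketch that expands the two task lists from the output above and counts the ranks in each. It is not part of my simulation setup, and the helper name expand_tasks is only for illustration. It suggests that 72 of the 96 ranks segfaulted and the other 24 were terminated afterwards, so every rank is accounted for.

```python
def expand_tasks(spec: str) -> list[int]:
    """Expand a Slurm-style task list such as "1-3,5" into [1, 2, 3, 5]."""
    ranks = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            ranks.extend(range(int(lo), int(hi) + 1))
        else:
            ranks.append(int(part))
    return ranks


# Task lists copied verbatim from the two srun error lines in npt.out.
segfault = expand_tasks(
    "1-10,12,15,17-20,25,27,29,31-32,34-35,37-40,42-43,"
    "46-58,60-65,67-71,73-85,88-89,92-95"
)
terminated = expand_tasks(
    "0,11,13-14,16,21-24,26,28,30,33,36,41,44-45,59,66,72,86-87,90-91"
)

# Number of ranks that segfaulted vs. were terminated by Slurm.
print(len(segfault), len(terminated))

# Check that the two lists together cover ranks 0..95 exactly once.
print(sorted(segfault + terminated) == list(range(96)))
```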
I am running the following command on my cluster:
srun --ntasks=96 --nodes=1 --cpus-per-task=1 --exclusive lmp_colvar -sf intel -in npt.in > npt.out 2>&1
Here, sys.npt.data is my data file, sys.pnipam.water.settings is my settings file, colvars.inp is my colvars input file, and npt.in is my LAMMPS input file. I have attached all my input files to this message.
I would appreciate any advice you have for me.
npt.in (3.8 KB)
sys.npt.data (7.1 MB)
sys.pnipam.water.data (4.4 MB)
sys.pnipam.water.settings (6.4 KB)
colvars.inp (444 Bytes)