I am having trouble getting some of my LAMMPS jobs to complete when I run biased simulations with Colvars. The MD run itself finishes in a timely manner, but for some reason the job does not terminate. Instead, it keeps running until it reaches the wall time and is then killed. As a result, I do not receive the dump configurations or the trajectory files. All I get is the output file from my supercomputer (.o file), in which the thermo_style quantities (timestep, temperature, and whatever else I request) are printed. That is how I know the simulation reached its final step and that the allotted wall time was adequate.
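The check described above (reading the thermo output in the .o file to confirm that the run reached its final step) could be automated roughly as follows. This is only a minimal sketch: the function names `last_thermo_step` and `run_completed` are hypothetical, and it assumes the default one-line thermo format, where each thermo line starts with an integer step followed by numeric columns.

```python
def last_thermo_step(log_text):
    """Return the step number on the last thermo-like line, or None if none is found."""
    last = None
    for line in log_text.splitlines():
        fields = line.split()
        # Assumption: a thermo line starts with an integer step
        # and every remaining column parses as a number.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        try:
            for field in fields[1:]:
                float(field)
        except ValueError:
            continue
        last = int(fields[0])
    return last


def run_completed(log_text, expected_final_step):
    """True if the thermo output reached at least the expected final step."""
    step = last_thermo_step(log_text)
    return step is not None and step >= expected_final_step
```

Reading the .o file into a string and calling `run_completed(text, n_steps)` would then separate jobs that ran out of wall time mid-run from jobs that reached the last step but never exited. Other purely numeric lines in the output could be mistaken for thermo lines, so the result should be treated as a hint rather than proof.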
Surprisingly, this happens in only ~15% of my Colvars job submissions: if I run the same job over and over again, it finishes successfully ~85% of the time and fails to finish ~15% of the time. The issue is also not specific to any particular simulation system. I have seen it in different systems with different collective variables, e.g. distanceZ and coordNum. I am attaching a sample of unfinished jobs of both kinds (I have not attached the data files because of space constraints).
I tried different versions of LAMMPS from 2018 through 2020 and faced this problem in all of them. I do not face this issue in plain MD simulations, i.e. simulations that do not use Colvars. I asked my supercomputer helpdesk about possible installation bugs, and they have checked and verified that there is no fault in their LAMMPS deployment.
Any help is much appreciated!
sample_unfinished_jobs.rar (277 KB)