MSD computation w/ msd/chunk and restart

Hello,

At NIST we are experiencing a problem: due to a pending HPC upgrade, the existing system can only carry a simulation for ~12-15 hours before crashing. This is a transitory problem that will eventually be remedied, but in the meantime it is creating some hiccups.

For equilibrium simulations we can use restarts to continue jobs without issue, but for diffusive calculations based on the equilibrium output this creates a problem. On the diffusive restart, the chain MSD computed with msd/chunk is reset to zero, in contrast to the monomer MSD, which remains continuous. We are using the same compute IDs etc. in the restart file, as stated in the compute msd/chunk documentation, so our restart procedure seems to be sound.

What we have concluded is that the problem stems from computing the chain MSD as the average of a compute inside a variable:

# Compute for molecules as chunks
compute cc1 all chunk/atom molecule

# compute average msd/chunk (for all molecules)
compute msd_molecule all msd/chunk cc1
variable msd_mol equal ave(c_msd_molecule[4])

We have tried modifying this for the restart in several ways using fix store/state.

  1. In the first approach we tried to store the average of all chain MSDs in a variable that stays in the restart file:

compute msd_molecule all msd/chunk cc1
variable msd_mol equal ave(c_msd_molecule[4])
fix store_msd all store/state 200 v_msd_mol

The calculation crashes with the following error:

ERROR: Fix store/state variable msd_mol is not atom-style variable (…/fix_store_state.cpp:288)

OK, we realized the error of our ways there.

  2. In a second approach we used a compute to store each chain’s MSD with its associated atoms and keep this in the restart file:

compute msd_molecule all msd/chunk cc1
compute spread_chains all chunk/spread/atom cc1 c_msd_molecule[4]
compute msd_chain all reduce ave c_spread_chains
fix store_msd all store/state 200 c_spread_chains

The calculation runs, but c_msd_chain still resets to zero after the restart, according to the thermo output.

Is there a method for doing a continuous chain MSD calculation with msd/chunk using restarts, or is that currently impossible?

Thanks,

~Fred

I am not sure what version of LAMMPS you are using. The documentation for compute msd/chunk says that using the same compute name should smoothly resume the MSD calculation from a restart file, so if it does not, that counts as a bug.

Having said that, it is easy to patch MSDs together “in post”. With ∆r1 and ∆r2 being the displacements before and after the restart point, we have:

|∆r1 + ∆r2|² = |∆r1|² + 2(∆r1)•(∆r2) + |∆r2|².

These quantities are all easily available during postprocessing and the result is cheap to calculate.
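As a quick sanity check of that identity, here is a minimal pure-Python sketch with made-up displacement vectors (not taken from any actual simulation data), where d1 and d2 stand in for a single chain's ∆r1 and ∆r2:

```python
# Patching squared displacements across a restart:
# |d1 + d2|^2 = |d1|^2 + 2 (d1 . d2) + |d2|^2

def dot(a, b):
    """Plain dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

d1 = (1.0, -2.0, 0.5)   # hypothetical displacement up to the restart point
d2 = (0.3, 1.0, -1.5)   # hypothetical displacement after the restart point

total = tuple(x + y for x, y in zip(d1, d2))
lhs = dot(total, total)                              # |d1 + d2|^2
rhs = dot(d1, d1) + 2 * dot(d1, d2) + dot(d2, d2)    # expanded form

print(lhs, rhs)  # prints 3.69 3.69
```

Averaging either side over all chains gives the patched MSD; note that the cross term needs the displacement vectors themselves, not just their squared magnitudes.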

Thanks for your response.

Our version of LAMMPS from the log files: LAMMPS (2 Aug 2023 - Update 1)

To recap, we do a compute for molecules as chunks:

compute cc1 all chunk/atom molecule

Then, compute the chain msd using msd/chunk for all molecules:

compute msd_molecule all msd/chunk cc1

In our first pass at computing the average chain MSD vs. time, we created a variable that is the average of the chain MSD, but this variable IS NOT continuous on restart:

variable msd_mol equal ave(c_msd_molecule[4])

So, we have now tried just putting out all the chain MSD data into its own file:

fix chain_msd all ave/time 10000 1 10000 c_msd_molecule[4] file ${FILENAME}.lammpscomp mode vector

But we were surprised to find that these values also reset on restart. Should they?! From the documentation it seems they should be continuous given the correct fix IDs, which of course we have.

Interestingly, we then tried copying this procedure from something we use during equilibration for chain Rg. We compute the average chain Rg vs. time by averaging the rows in the following output, and it IS continuous on restart:

compute chain_rg all gyration/chunk cc1
fix 4 all ave/time 10000 1 10000 c_chain_rg file ${FILENAME}_Rg.lammpscomp mode vector

Why the different behavior for Rg and MSD wrt molecular chunks?

Finally, I also derived a formula similar to yours relating the old MSD data to the new, reset data. I suppose that could be used, but we would really prefer to use LAMMPS natively if at all possible.

If this is a bug, it feels like an important issue for the entire LAMMPS community, but I never like to jump to that conclusion, as it is always possible that we are missing some important part of the procedure.

Thanks.

If you want somebody to look into this, you have to make it a lot easier to debug:

  • set up a very small, simple, and easy to run test system.
  • provide complete input decks and a lucid and easy to follow description for how to reproduce the issue
  • provide a detailed and easy to comprehend explanation of what output you expect (and why) and how this differs from what you get.
  • make certain that the issue can be reproduced with the latest LAMMPS feature release and the latest bugfix update to the stable release. Nobody wants to fix a bug that has already been fixed.

Post this all as a Bug Report issue on GitHub at: Issues · lammps/lammps · GitHub

What you have currently is fine for a chat with colleagues, but it is too vague and too verbose for a bug report.

Hi - this was indeed a bug in LAMMPS. The fix is posted as PR #4270 on the LAMMPS GitHub page. It is just a couple of lines of code in src/fix_store_global.cpp, so if you like you can make those changes and try the two attached test files, which are based on the info in your messages here. Thanks for calling our attention to the issue.
in.chain.restart (592 Bytes)
in.chain (820 Bytes)

PS: compute gyration/chunk (like other compute chunk commands) does not store a time=0 reference state the way the msd/chunk command requires. Hence there was no such info that needed to be restored when restarting, which is why your Rg output was continuous.
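To illustrate that distinction, here is a minimal pure-Python sketch (hypothetical helper functions, not LAMMPS code): Rg can be evaluated from the current coordinates alone, whereas MSD also needs the stored time=0 reference positions.

```python
# Rg needs only the *current* per-chain coordinates; MSD needs a stored
# reference snapshot as well. Unit masses assumed for simplicity.

def rg_squared(coords):
    """Rg^2 = mean |r_i - r_com|^2, computed from current positions only."""
    n = len(coords)
    com = [sum(c[d] for c in coords) / n for d in range(3)]
    return sum(sum((c[d] - com[d]) ** 2 for d in range(3))
               for c in coords) / n

def msd(coords, ref):
    """MSD = mean |r_i(t) - r_i(0)|^2; requires the reference positions
    `ref`, which is the extra state that must survive a restart."""
    n = len(coords)
    return sum(sum((c[d] - r[d]) ** 2 for d in range(3))
               for c, r in zip(coords, ref)) / n
```

For example, for two atoms at (0,0,0) and (2,0,0), rg_squared returns 1.0 with no history needed, while msd against a reference of both atoms at the origin returns 2.0 and is meaningless without that reference.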