Setting seed with NVT simulation (Reproducibility)

Hi everyone,

I’m currently attempting to replicate an NVT simulation. I’ve set the seed for the initial velocities and confirmed that they are identical between runs. However, I’ve noticed that after a certain number of time steps the positions and velocities start to deviate. I understand that the default thermostat is Nose-Hoover, which is deterministic, so I would expect my simulation to be identical when run on the same machine.

My question is: where is the randomness coming from, and is there a way to address it? It’s a mystery because the positions and velocities remain the same up to a certain point. Any insights would be appreciated.

"""
units metal #Use these units for DynaPhoPy
atom_style atomic

boundary p p p #Periodic boundary conditions

read_data …/…/data/structure_SbSe.in #Structure file

pair_style deepmd …/…/data/pot_3.pb #trained potential
pair_coeff * * #Empirical potential

variable t equal 500 #Define the temperature

neighbor 2 bin
neigh_modify delay 0

timestep 0.002 #Define time step

thermo_style custom step etotal temp vol press
thermo 100

velocity all create $t 3627941 dist gaussian mom yes
velocity all scale $t

fix 1 all nvt temp $t $t 0.1 #Canonical ensemble

dump 1 all custom 1 nvtpositions.lammpstrj id xu yu zu
dump_modify 1 sort id
dump_modify 1 format line "%d %.16f %.16f %.16f"

dump 2 all custom 1 velocity.lammpstrj id vx vy vz
dump_modify 2 sort id
dump_modify 2 format line "%d %.16f %.16f %.16f"

run 40000 #Number of timesteps
"""

MD simulations solve coupled ordinary differential equations and are thus essentially chaotic systems, subject to the so-called Lyapunov instability, more commonly known as the “butterfly effect”. In other words, the tiniest difference will eventually grow exponentially, and after some time the trajectories will be decorrelated.
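One way to see this in practice (just a sketch, reusing the setup from your input above; the seed and displacement size are arbitrary) is to run the same input twice, but perturb the second run by a vanishingly small random displacement before the run command and then compare the two dump files:

"""
# identical setup as above, then a ~1e-10 length-unit random kick before running
displace_atoms all random 1.0e-10 1.0e-10 1.0e-10 12345 units box
run 40000
"""

The positions in nvtpositions.lammpstrj will agree for a while and then drift apart at an exponentially growing rate.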

This is facilitated by the fact that we are using double-precision floating-point math, where the order of operations matters because floating-point addition is not associative. Thus anything that changes the order of atoms, or the order in which data (like forces) is summed, will eventually lead to the kind of divergence you are noticing. This is particularly likely with GPUs, but also when changing the number of processors.
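A minimal way to convince yourself of the non-associativity (a sketch using LAMMPS equal-style variables; the numbers are only for illustration):

"""
variable a equal (1.0e16+1.0)-1.0e16
variable b equal 1.0e16+(1.0-1.0e16)
print "sum1 = ${a}  sum2 = ${b}"   # mathematically identical, numerically different
"""

The same effect appears whenever per-atom contributions like forces are accumulated in a different order.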

True reproducibility, and trajectories that can be fully reversed, require fixed-point math, no accelerators, the exact same number of processors, and fix nve.

Also, the LAMMPS developers have no knowledge of the implementation of deepmd, since it is developed and distributed independently of LAMMPS.

Difficult to say, since there are several unknowns, e.g. whether you are using the exact same number of processors and the exact same executable. Your velocity command, for example, will create the same velocity distribution, but not the exact same individual velocities, when run with a different number of MPI ranks. For that you have to add the keywords “loop geom” (see the docs for details), as shown below. Yet the slow, creeping divergence cannot be avoided for different processor counts.
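In the input above that would mean changing the velocity command to something like this (same seed, only the extra keywords added):

"""
velocity all create $t 3627941 dist gaussian mom yes loop geom
"""

With loop geom the velocity assigned to an atom depends on its coordinates rather than on how the atoms happen to be distributed across MPI ranks.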

Not really without using a completely different software implementation.

It really doesn’t matter, since the distributions of the observables remain the same once enough data is accumulated, even if the trajectories diverge, because you are sampling the same phase space. In fact, there are methods like PRD (parallel replica dynamics) that take advantage of this by running multiple decorrelated trajectory segments concurrently on different processor groups.


Hi akohlmey,

Thank you so much for your prompt reply.

To provide more information about my simulation setup: I’m using a serial (non-MPI) LAMMPS executable (‘lmp_serial’), which means I’m running on a single processor. I’m not very familiar with MPI, but I don’t think it matters in this case since I’m only using a single processor.

With that in mind, do you have any guesses as to what could be happening? I suspect compounding round-off errors in the system, but I just can’t pinpoint where they might be occurring. Nonetheless, your explanation has been very detailed and insightful, so thank you!!

Also, when you mention “fix nve”, how does that relate to the NVT simulation? I came across a thread on “Difference between nve+langevin and nvt dynamics”. Do you mean running an “alternative” NVT simulation, or adding an additional fix command?

Ariana

As a trained scientist, I don’t like to guess, but rather prefer to draw conclusions from data that I can see or can reproduce.

To re-iterate: to get the exact same results for an input file, you need to use the exact same executable, with the exact same external libraries and packages, with the exact same command line, on the exact same computer, and with the exact same input deck. Any difference (e.g. an update of the math library or the C library) can introduce the tiny differences required for this exponential divergence.
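If it helps to check whether these conditions are actually met between two runs, LAMMPS can write its build and runtime configuration to a file; something along these lines (the exact keywords are listed in the info command documentation, and the file name here is just an example):

"""
info configuration out overwrite config_run1.txt
"""

Comparing those files between runs quickly shows whether the executable version, the installed packages, and so on are really identical.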

Since your pair style is not part of LAMMPS, I cannot say anything about it; you need to contact its developers. But if you can reproduce the same kind of divergence with any of the example inputs bundled with LAMMPS, e.g. the input decks in the “bench” folder, please let us know how to reproduce it and what exact differences you see and where.

If you comply with the conditions I have listed here, round-off should not matter, since it will be the same round-off every time. After all, you are using a computer, and computers always compute exactly the same thing if they are fed the same input. The only exception is when your computer has a defect or misconfiguration (e.g. it gets too hot so that individual bits can flip randomly), but most of those problems lead to random crashes and not to these small, exponentially growing divergences.

What I mean is: only an NVE ensemble simulation with a fixed-point MD code (i.e. not LAMMPS) can be perfectly reversed. That means that if you flip the sign of the velocities and then run the same number of steps a second time, you will arrive exactly back at the initial state. Any “manipulation” of forces or velocities, e.g. by a Nose-Hoover or Langevin thermostat, will interfere with that.
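As a sketch of what such a reversal test looks like in a LAMMPS input (with LAMMPS’ floating-point math the final state will be close to, but not bit-for-bit identical to, the starting one; step counts are arbitrary):

"""
fix 1 all nve
run 20000
# flip the sign of every velocity component via atom-style variables
variable mvx atom -vx
variable mvy atom -vy
variable mvz atom -vz
velocity all set v_mvx v_mvy v_mvz
run 20000   # with exact (fixed-point) arithmetic this would recover the initial state
"""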