First of all, I apologize for asking this question, as it is more of a conceptual concern than a concrete software-related problem.
I am working with a system consisting of solid walls and a confined fluid between them, which I have carefully validated against a published paper. Out of curiosity, I ran several simulations using different random seeds for the Langevin thermostat and for the initial velocity assignment (I also slightly changed the numerical value of the initial temperature from 300 K to 299, 301, 298, and 302 K). Fortunately, the heat flux results from most of these simulations are very similar and close to each other.
However, in a few cases, the results fall outside the acceptable range (with differences of nearly 10%), which has raised some concerns for me.
I have reviewed several discussions in this forum regarding possible issues related to time step size, interatomic potentials, and neighbor lists, but all of my simulation parameters seem to be within commonly accepted ranges for research work :
SHAKE algorithm for water molecules
Time step: 1 fs
Neighbor list: 3.0 bin
Simulation protocol:
2 ns equilibration under NVT at 300 K
2 ns under NVE
6 ns with Langevin thermostat and applied heat source and sink
5 ns for averaging
To remove any remaining doubts and to ensure that a steady state is reached, I increased the Langevin thermostatting stage from 6 ns to 15 ns, but the results remain similar.
Do you have any recommendations regarding this issue? In some sources, I have read that reporting a result requires running multiple simulations with different initial conditions and averaging over them. Does changing the initial conditions simply mean changing the random seed and the initial particle velocities?
Am I facing an ambiguous or problematic situation, or is this type of behavior expected? How should I deal with those few simulations that show significantly different results?
First, you need to find out if this “fall outside the acceptable range” is simply statistical errors. After looking at your description this is the first thing comes to my mind (as you did not provide input files that allow others to reproduce your problem).
These are something I would consider to see if statistical errors are to be blamed:
plot the heat flux over time during the “averaging” part. Compare the curve between different runs and see if you observe systematic differences between them.
try again with longer time for averaging.
if computationally feasible, do e.g. 100 independent runs with different random seeds, and plot the distribution of results. Are you observing outliers?
I don’t know how large your system is, but maybe try increasing the system size.
If it is indeed not statistical errors (e.g., if you find several outliers in step 3), then the next step is to find the cause of the discrepancy. Again, it’s hard to make an educated guess without seeing the input files.
Your guidance was very helpful in clarifying this issue. I initially (and incorrectly) assumed that the heat flux itself was responsible for the discrepancies. In fact, I am calculating the Kapitza resistance (defined as the temperature jump divided by the heat flux). By plotting the heat flux as a function of time, I realized that the main source of the problem is not the heat flux, but rather the variability in the temperature of the last solid layers in contact with the fluid.
In other words, the random seed and the initial velocities affect the temperature of these interfacial solid layers, and it is this temperature difference that leads to out-of-range results, rather than fluctuations in the heat flux itself.
I sincerely appreciate your insightful guidance, as it helped me recognize a clear mistake in my earlier interpretation. I now need to address the issue of how to control or stabilize the layer temperatures.
I would also like to ask again: when reporting a final result, do you typically run multiple simulations with different initial conditions and average over them?
The basic idea is that your result must be physically meaningful. Usually it means that your result should not strongly dependent on things like random seeds, since they have no meaning / cannot be controlled in the real world. If my own simulation is strongly affected by these settings, the first thing I would consider is if the simulation setup makes sense at all. For example, maybe the equilibration time is not long enough, or unexpected events happened in the simulation. If the simulation setup is fine, then I would consider potential ways to reduce the discrepancy. For example, by using a larger system or longer simulation time.
Sometimes large differences between runs are unavoidable, e.g. measuring the average time needed to overcome an energy barrier. This is caused by the nature of the process studied, and cannot be really “solved”. In these cases you should have multiple runs and report the distribution of results.
I will carefully check and take into account all the points you mentioned. I sincerely appreciate your guidance and thank you for the time you took to respond to my questions.