Properly calculate average box volume

Hello, I am trying to calculate the average volume of my box during a NPT calculation, and compare the result from the output data file to be sure that I get what I want, but these two do not agree.

I use a fix all ave/time command with the same Nevery, Nrepeat, and Nfreq both to print the out on the file.dat and to calculate the average volume inside Lammps. The time is resetted to zero before the definition of the two fixes.

For instance, I am setting:

variable  Nevery         equal 10
variable  Nrepeat        equal 10
variable  Nfreq          equal 1e2
variable  print_thermo   equal 1e3
variable  nvt_nsteps     equal 1e4
variable  Vol            equal vol

thermo    ${print_thermo}

fix       0 all ave/time ${Nevery} ${Nrepeat} ${Nfreq} c_temper c_press v_Vol file equilibration.dat
fix       1 all npt temp 300 300 100.0 iso 1.0 1.0 1000.0
fix       2 all ave/time ${Nevery} ${Nrepeat} ${Nfreq} v_Vol #ave one
variable  mean_Vol equal f_2
run       ${nvt_nsteps}
print     "Average volume after pre-NPT = ${mean_Vol}"

Actually I think the results should be the same independently of the specific N chosen. I tried with the options ave one, ave window 1 or just none, and it seems the result isn’t the same.

For more context, I want to calculate the average volume in NPT so that I can rescale the box and fix the volume in NVT to calculate GK viscosity. If I use the one printed by the fix, the cell looks too big (and the average pressure slightly negative).

Thank you
Marco

tmp

Hi Marco,
I can’t comment on the use of a variable to call a fix ave/time (f_2), as I never use it this way. I can however confirm that measuring the average value from f_0 as you do is a valid approach, as long as you disregard the beginning of the simulation for which the volume did not converge to its equilibrium value…
Simon

edit: After looking closer at your input, the value of ${mean_Vol} from the print command makes sense, the variable mean_Vol is called at the very end of the simulation, so it returns its last value that was obtained by averaging over the last 100 steps.

There is not enough information here to make a meaningful assessment.
For example, have you established that the system is already fully equilibrated?
how large is the system and how large the volume fluctuations and what is the typical frequency of those?

Are you sure that 10000 MD steps is a long enough simulation to get a proper average? Specifically for volume fluctuations the characteristic times can be rather long. It is much easier to expand a dense system than to condense it (again).

Thank you for your answers. Indeed in the real calculation the volume is averaged after equilibration is met. The system contains 1039 atoms and the input files were generated with packmol/moltemplate with input files from the ATB.

The full simulation includes (1) energy minimisation, (2) NVT equilibration, (3) pre-NPT equilibration to reach the equilibrium volume, (4) NPT equilibration and finally (5) MD run (NVT to calculate GK). The time step is 1fs.

Please find below the full calculations. Note that I rescale the last volume before fixing NVT in (5) in order to meet the average volume obtained in step (4). In the figure, the average of the volume (blue) does not correspond to the output of the fix 2. Thus, the volume of the MD production run sits on the wrong value.

@simongravelle if you have other suggestions to calculate the average I am happy to consider them. I imagined defining a volume counter to increment at every step and divide by the number of steps, but this is deprecated in a number of lammps posts.

Thank you

# ----------------- Init Section -----------------

include        "system.in.init"
read_data      "system.data"
include        "system.in.settings"

# Thermodynamical variables
variable        TK equal 300
variable        PBAR equal 1.0
velocity        all create ${TK} 12345

# Box variable (used for rescaling)
variable        lx equal lx
variable        ly equal ly
variable        lz equal lz
variable        Vol equal vol

# Time settings
variable        dt           equal 1      # 1fs
variable        Nevery       equal 10
variable        Nrepeat      equal 100
variable        Nfreq        equal 1000
variable        print_thermo equal 1000
variable        nvt_nsteps   equal 1e5
variable        npt_nsteps   equal 1e6
variable        md_nsteps    equal 1e6

# Compute
compute         temper  all temp
compute         press   all pressure temper
compute         pot_ene all pe
compute         kin_ene all ke
variable        tot_ene equal c_kin_ene+c_pot_ene

# Print settings
neigh_modify    one 10000  page 120000
timestep        ${dt} # Timestep of in fs
thermo          ${print_thermo}
thermo_style    custom step time temp press vol density pe ke v_tot_ene ecoul epair ebond eangle edihed
variable        aver_freq equal (v_npt_nsteps)/(v_print_thermo)
print           "aver_freq = ${aver_freq} "

# Energy Minimization
print "--------------- EM starts ------------"
minimize        1e-4 1e-6 1000 10000
print "---------------  EM ends  ------------"

reset_timestep  0
fix             0 all ave/time ${Nevery} ${Nrepeat} ${Nfreq} &
                c_temper c_press v_Vol c_pot_ene c_kin_ene v_tot_ene file equilibration.dat
dump            0 all custom 1000 equilibration.lammpstrj id type x y z

# NVT Equilibration
print "--------------- NVT starts ------------"
fix             1 all nvt temp 300 300 100.0
run             ${nvt_nsteps}
unfix           1
print "--------------- NVT ends  ------------"

# NPT Equilibration
print "--------------- pre-NPT starts ------------"

# pre-npt equilibration to reach average volume value
fix             1 all npt temp 300 300 100.0 iso 1.0 1.0 1000.0
fix             2 all ave/time ${aver_freq} 1 ${nvt_nsteps} v_Vol ave one
variable        mean_Vol equal f_2
run             ${nvt_nsteps}
print           "Average volume after pre-NPT = ${mean_Vol}"
unfix           2
print "--------------- pre-NPT ends ------------"

print "--------------- NPT starts ------------"
# npt equilibration with volume averaging around equilibrium
fix             3 all ave/time 1 ${aver_freq} ${npt_nsteps} v_Vol ave one
variable        mean_Vol equal f_3
run             ${npt_nsteps}
print           "Average volume after NPT = ${mean_Vol}"
print           "Last volume after NPT = ${Vol}"

# Calculate the scale factors for each dimension
variable        scaleX equal (v_mean_Vol/v_Vol)^(1.0/3.0)
variable        scaleY equal (v_mean_Vol/v_Vol)^(1.0/3.0)
variable        scaleZ equal (v_mean_Vol/v_Vol)^(1.0/3.0)

print           "scaleX = ${scaleX}"
print           "scaleY = ${scaleY}"
print           "scaleZ = ${scaleZ}"

# Apply the scaling to fix the volume
change_box  all x scale ${scaleX}      &
                y scale ${scaleY}      &
                z scale ${scaleZ}      &
                remap

print           "New volume value after change_box = ${Vol}"
print "--------------- NPT ends  ------------"

undump          0

unfix           0
unfix           1
unfix           3

reset_timestep  0

# Production run
print "--------------- MD starts ------------"

dump            myDump all custom 1000 md_run.lammpstrj id type x y z vx vy vz fx fy fz
fix             1 all ave/time ${Nevery} ${Nrepeat} ${Nfreq} &
                c_temper c_press v_Vol c_pot_ene c_kin_ene v_tot_ene &
                file md_run.dat
fix             2 all nvt temp 300 300 100 # Nose-Hoover

# Diffusion coefficient for Lithium
# Average the msd compute over time and output to a file
# Using vector syntax (not array)

group           Li type 1
compute         msdLi Li msd
fix             msd all ave/time ${Nevery} ${Nrepeat} ${Nfreq} &
                c_msdLi file msd_Li.txt mode vector

variable        p equal 100     # correlation length
variable        s equal 5       # sample interval
variable        d equal $p*$s   # dump interval

# convert from LAMMPS real units to SI

variable        kB equal 1.3806504e-23    # [J/K] Boltzmann
variable        atm2Pa equal 101325.0
variable        A2m equal 1.0e-10
variable        fs2s equal 1.0e-15
variable        convert equal ${atm2Pa}*${atm2Pa}*${fs2s}*${A2m}*${A2m}*${A2m}

variable        pxy equal pxy
variable        pxz equal pxz
variable        pyz equal pyz
fix             SS all ave/correlate $s $p $d &
                v_pxy v_pxz v_pyz type auto ave running
variable        scale equal ${convert}/(${kB}*${TK})*${Vol}*$s*${dt}
variable        v11 equal trap(f_SS[3])*${scale}
variable        v22 equal trap(f_SS[4])*${scale}
variable        v33 equal trap(f_SS[5])*${scale}

variable        timestep equal step
compute         myStress all stress/atom NULL
dump            stress all custom 10000 stress.txt &
                c_myStress[1] c_myStress[2] c_myStress[3] &
                c_myStress[4] c_myStress[5] c_myStress[6]

run             ${md_nsteps} #0.1ns
variable        eta equal (v_v11+v_v22+v_v33)/3.0
variable        ndens equal count(all)/vol
print          "average viscosity @ ${TK} K = ${eta} (Pa.s) / ${ndens} (atoms/A^3)"

print "---------------  MD ends  ------------"

When I look at the data, it seems to me, that you need to do more equilibration and also need to average over more data to get a meaningful average of the volume. Only the second half of “step(4)” seems to be somewhat consistent (the first half may still have some memory of some initial setup non-equilibrium configuration) and in that second half you appear to have only about 5 periods of the characteristic vibration of the cell volume. That is clearly not enough to determine whether you have reached and maintained equilibrium. Of course you would need to average over the entire step(4) data starting from 500 time units to get a reliable average.

What is making the situation much more difficult is the extremely small size of the system with only about 1000 atoms. This will lead to rather large pressure fluctuations, and thus larger volume fluctuations and there are likely finite size issue. Thus for reliable statistics and data you would not only need to run for a longer time, but also over a significantly larger system (easily 10x larger, but 100x larger would not be excessive either).

Thank you for your suggestions. I tested the same structure with 10k atoms and with larger time - and it’s clearer that after 5M timesteps of 1fs the average volume is still decreasing. The numpy average of the pressure values in the MD phase is still 40 atm far off.

I increased the NPT pre equilibration to a total of 1e7 time steps, and I get an average pressure around 1 atm - I guess I can take this as a good result?

I stopped this calculation and resent it with a lower output frequency and waiting for its results (it is still in the pre-NVT phase), although having less points make the average probably worse.

However, my initial question was about the correctness of the analytical derivation. Shouldn’t the average calculated by the fix 2 be analytically the same as the average of the values printed out in equilibration.dat no matter the number of points/state of the equilibration?

Thank you

It is not easily possible to figure out what you are comparing to what here since your input file is complex and riddled with variables and thus extremely difficult to read and figure out from the outside.

You may notice that in every case you have investigated so far, the printed value of f_2 seems unrelated to the average of the f_0 printouts and suspiciously close to the very last value in the f_0 printout instead.

This may give you enough of a hint to attempt the scientific method. It always helps to have a short example to quickly iterate over hypotheses, so maybe you can experiment with a short run, like 100 steps?

Thank you @srtee, I believe you are right about try the method on a short run, and I have tried a very short run at the very beginning. What do you think is to be modified there?

Thanks