On using small simulation cells in LAMMPS

Dear all,

As a bit of background, I am working on water adsorption in a zeolite (monoclinic unit cell, lattice constants of about 10 Angstrom). I am describing water-zeolite interactions using pair style lj/cut/coul/long. My task is to tune the LJ parameters, since the published ones do not reproduce the experimental data I have available.
I have computed water-zeolite g(r)s using ab initio MD (AIMD), and my strategy is to tune the LJ parameters so that the classical g(r)s match the AIMD ones as closely as possible. For the AIMD runs I was obviously using only a single unit cell, so my first attempt was to use a single unit cell for the classical runs as well. Here I ran into some questions which I was not able to clarify from the manual.
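For context, the relevant part of my classical setup looks roughly like the sketch below (not my actual input; the data file name, atom type numbering, and all epsilon/sigma values are placeholders):

```
# --- sketch only: file names, atom types, and parameter values are placeholders ---
units           real
atom_style      full
read_data       zeolite_water.data        # hypothetical data file, 1 unit cell

pair_style      lj/cut/coul/long 12.0     # 12 Angstrom real-space cutoff
kspace_style    pppm 1.0e-4

# assume type 1 = O(water), 2 = H(water), 3 = Si, 4 = O(framework)
pair_coeff      1 1 0.1553 3.166          # water O-O (SPC/E-like placeholder)
pair_coeff      1 4 0.10   3.10           # water O - framework O: the knob being tuned
pair_coeff      2 2 0.0    0.0            # water H
# remaining cross terms left to LAMMPS mixing

# water-framework g(r) to compare against the AIMD one
compute         rdfOwOf all rdf 200 1 4
fix             rdfout  all ave/time 100 10 1000 c_rdfOwOf[*] file rdf_OwOf.dat mode vector
```

The water O - framework O cross term is the main parameter I am adjusting against the AIMD g(r).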

  1. Let’s say I am using a 12.0 Angstrom cutoff for all interactions. With codes that use the minimum image convention this would not be possible, but the author of the paper in this link claims that in LAMMPS one can use such a cutoff despite using small simulation cells (the same is stated in the docs: 4.4.2. Communication — LAMMPS documentation). But if I understand correctly, one will still see artefacts due to the PBCs?

  2. I also ran some simulations with 2x2x2, 3x3x3 and 4x4x4 supercells (a sketch of these runs follows below). Nonetheless, one could still argue that in terms of the number of atoms these simulation boxes are small (a 1x1x1 unit cell contains 72 atoms). I tried to run these in parallel using MPI with -np 2, -np 4 and -np 8, and I am seeing quite significant changes in g(r) for different np despite keeping the box size the same. Should I be using neighbor nsq instead of neighbor bin? The docs recommend that setting for “small and sparse” systems, and I was wondering if this could be the reason I am seeing such behavior. I am aware that in terms of efficiency parallel runs are useless for such small systems, but I also plan to run larger cells and wanted to clarify this behavior first.
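Concretely, the supercell runs look roughly like this (again a sketch with placeholder values, on top of the settings above):

```
# --- sketch of the supercell / neighbor-list variations (placeholder values) ---
read_data       zeolite_water.data        # single unit cell, ~10 Angstrom edges
replicate       2 2 2                     # also tried 3 3 3 and 4 4 4

neighbor        2.0 bin                   # default binned neighbor-list build
#neighbor       2.0 nsq                   # O(N^2) build, suggested by the docs for small/sparse systems

run             100000
```

and I then launch them with e.g. `mpirun -np 4 lmp -in in.supercell` (the executable name depends on the build).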

Thank you,
Anze

Yes, you have “finite size effects”. Those can be direct, i.e. the atom to the right is a periodic copy of the atom to the left, or indirect, i.e. the atom at the right edge of the box is directly interacting with the atom at the left edge. But there is also the limitation that the box size limits the wavelength of any phonons that can form. Since the Coulomb interactions in water are strong, those finite size effects must not be underestimated.
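To put numbers on the direct effect: with lattice constants of about 10 Angstrom and a 12 Angstrom cutoff, the cutoff is larger not only than half the box length (about 5 Angstrom) but than the full box length, so in a single unit cell every atom interacts with its own periodic images and with several copies of the same neighbor at the same time.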

With water/zeolite systems, you are also opening a particularly difficult can of worms. Because of the regular structure of the zeolite, you have a long-range structure “imprinted” on the water that is interacting with it, and thus it will be much more difficult to assess whether a system is properly equilibrated and whether you have good statistical sampling, compared to bulk water (shameless plug: I worked on long-range structure in bulk water as an undergrad: Long-range Structures in Bulk Water. A Molecular Dynamics Study). In addition, water-zeolite systems are notorious for being extremely difficult to equilibrate. It has been a long time since this was a “hot” topic and since I discussed it with people working on it, but it is my recollection that people tend to require hybrid MD/MC calculations to get meaningful results, or even resort to plain MC, since the time scales needed for MD to induce the necessary local fluctuations are too long.
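If you ever go down that route, one possible ingredient in LAMMPS is the gcmc fix from the MC package, which can attempt MC translations/rotations (and, optionally, insertions/deletions) of water molecules in between MD steps. A rough, untested sketch, with all numbers, type assignments, and file names as placeholders; check the fix gcmc documentation for the exact requirements regarding molecule templates, constraints (e.g. SHAKE), and long-range electrostatics:

```
# --- hedged sketch of a hybrid MD/MC setup; all values are placeholders ---
molecule        h2o H2O.molecule                 # hypothetical molecule template for water
group           water type 1 2                   # assume types 1,2 = water O,H

fix             nvt1 all nvt temp 300.0 300.0 100.0
# MC translation/rotation moves on the water, no insertions/deletions (X = 0),
# attempted every 250 MD steps; mu is required positionally but unused here
fix             mc water gcmc 250 0 100 0 48291 300.0 0.0 0.2 mol h2o full_energy
```

Whether this actually helps with your system is something you would have to test; it is only meant to illustrate the kind of hybrid MD/MC setup I mean.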

This choice should not affect the contents of the neighbor lists, only how much time it takes to assemble them.

No. When you run with a different number of processors (be it using MPI or OpenMP or a GPU), you are summing up forces in a different order, and because floating-point math is not associative, this can result in small differences in the resulting forces per atom (depending on the variation in magnitude of the individual contributions). Those will lead to exponentially diverging trajectories (since MD is chaotic), which is sometimes also referred to as Lyapunov instability. You can see similar effects from using different random number seeds for the initial velocity assignment. In the long run, and averaging over a sufficient number of trajectory frames, the results should converge, assuming that your simulation is able to properly sample the available phase space. But this is where the water-zeolite system comes back to haunt you: the potential energy hypersurface is extremely rugged in this case, and thus good sampling is almost impossible unless you “push” the system along using MC moves (which has the downside that your time scale and dynamics are no longer meaningful).
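One cheap way to check how much of the spread in your g(r) is just trajectory-to-trajectory noise is to average over several independent runs that differ only in the velocity seed. A minimal, untested sketch of such a loop (file names, atom types, seeds, and run lengths are placeholders):

```
# --- sketch: average g(r) over several independent trajectories ---
variable        seed index 12345 23456 34567 45678
label           seedloop

clear
units           real
atom_style      full
read_data       zeolite_water.data
include         pair_settings.in          # hypothetical file with pair_style/pair_coeff/kspace

velocity        all create 300.0 ${seed} dist gaussian
fix             nvt1 all nvt temp 300.0 300.0 100.0

compute         rdfOwOf all rdf 200 1 4
fix             rdfout all ave/time 100 10 1000 c_rdfOwOf[*] file rdf_seed_${seed}.dat mode vector

run             500000

next            seed
jump            SELF seedloop
```

If the averaged g(r)s from enough independent trajectories still differ between MPI rank counts, something else is wrong; if they agree, you are simply looking at the divergence described above.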

Please note that using larger systems will make it even harder to a) equilibrate the system and b) maintain ergodicity (i.e. have a sufficiently representative sampling of the available phase space), so you will have to run much longer trajectories. I would not be surprised if you need about an order of magnitude longer trajectories when doubling the system size. Of course, you must use larger systems to reduce the finite size effects.

Bottom line: you are between a rock and a hard place.

Thank you a lot, this was extremely informative.