[lammps-users] PPPM dispersion water model for highly inhomogeneous system - parameter setup advice

Hello,

I have successfully implemented a PPPM dispersion water model which is essentially identical to the approach taken in: R. E. Isele-Holder, W. Mitchell, J. R. Hammond, A. Kohlmeyer, and A. E. Ismail, “Reconsidering Dispersion Potentials: Reduced Cutoffs in Mesh-Based Ewald Solvers Can Be Faster Than Truncation,” J. Chem. Theory Comput., vol. 9, no. 12, pp. 5412–5420, Dec. 2013, doi: 10.1021/ct4004614.

The model works quite well. However, I am now trying to implement that model in a highly inhomogeneous system consisting of a 20-nm diameter water droplet in the center of a spherical vapor region with a diameter of 300 nm. The vapor is filled with N2 gas at 10 atm. So, there are around 1.1M water molecules and 2.2M N2 molecules.

I am trying to get acceptable performance out of this system. I first ran a homogeneous water system with the same number of water molecules (~1.1M) and compared that performance with the inhomogeneous system. I expect that the inhomogeneous system should (optimally) be only slightly more expensive than the homogeneous system because the bulk of the computation is for the water, not the nitrogen gas. However, I am finding the inhomogeneous system is upwards of 10x slower. This is after balancing the inhomogeneous system from an imbalance factor of 80 down to 1.6.
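For context, the processor-grid balancing I mean is LAMMPS's recursive-bisection load balancing; a minimal sketch of that kind of setup (not necessarily the exact thresholds I used, which are in the attached input) looks like:

  comm_style tiled
  fix bal all balance 1000 1.1 rcb   # rebalance every 1000 steps if imbalance exceeds 1.1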

It appears the issue is related to the dispersion grid / dispersion parameters. The inhomogeneous system is spending 80% of its computational time on kspace, and as I increase the Gewald parameter (0.27 to 0.4) and increase the dispersion grid count, the computational time increases. This makes sense because only a small fraction of the total volume is occupied by water.

I am seeking advice for how I can set up the dispersion parameters so the performance is optimal for the inhomogeneous system. Is there a method to “balance” the dispersion grid in an analogous way to the processor grid balance?

I have included some files for reference if anyone is interested in helping me solve this problem and wants more details. The .lmp file is the input script for a smaller system (5 nm droplet radius, 40 nm vapor-region radius) that I ran on 80 MPI tasks. Two log files are included: one with the vapor region (inhomogeneous system) and one with just the droplet (homogeneous). The template files are also included. For this smaller system, the inhomogeneous system is about 2x slower than the homogeneous one. The nitrogen computation is negligible, so optimally the runtimes should be the same. This system can easily be scaled down further by changing the variables “R” and “Rbig”.

Thank you,

Eric

code_droplet.lmp (6.04 KB)

log_homogeneous.txt (11 KB)

log_inhomogeneous.txt (11.2 KB)

T.Nitrogen.dat (281 Bytes)

T.Water_TIP4P2005.dat (509 Bytes)

Hello,

[…]

I am trying to get acceptable performance out of this system. I first ran a homogeneous water system with the same number of water molecules (~1.1M) and compared that performance with the inhomogeneous system. I expect that the inhomogeneous system should (optimally) be only slightly more expensive than the homogeneous system because the bulk of the computation is for the water, not the nitrogen gas.

this is where you are wrong. the computational cost of PPPM is determined by the volume, much less so by the number of particles.
this is easily inferred from the algorithm, which has three steps: 1) interpolating the charge density onto the grid, 2) a forward FFT to reciprocal space, the convolution, and a backward FFT to real space, 3) interpolating the field from the grid back onto the particles to get the forces. steps 1) and 3) are O(N) in the number of particles N; step 2) is O(M log(M)) in the number of grid points M. for a larger volume at the same grid spacing you have many more grid points.
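to put rough numbers on this (my own estimate, assuming the homogeneous box is at bulk liquid density, ~33 molecules/nm^3, and both runs end up with a comparable grid spacing):

  V_water ~ 1.1e6 molecules / 33 molecules/nm^3 ~ 3.3e4 nm^3
  V_box   >= (300 nm)^3 = 2.7e7 nm^3
  V_box / V_water ~ 800

so the inhomogeneous box carries roughly 800x more grid points than a bulk box with the same number of waters, and the FFT cost grows accordingly.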

…and this is disregarding the impact of communication. but that is a topic that is only relevant when using a large number of MPI ranks.

However, I am finding the inhomogeneous system is upwards of 10x slower. This is after balancing the inhomogeneous system from an imbalance factor of 80 down to 1.6.

It appears the issue is related to the dispersion grid / dispersion parameters. The inhomogeneous system is spending 80% of its computational time on kspace, and as I increase the Gewald parameter (0.27 to 0.4) and increase the dispersion grid count, the computational time increases. This makes sense because only a small fraction of the total volume is occupied by water.

I am seeking advice for how I can set up the dispersion parameters so the performance is optimal for the inhomogeneous system. Is there a method to “balance” the dispersion grid in an analogous way to the processor grid balance?

increasing gewald will do the opposite. the only way to reduce the relative cost of the kspace part is to increase the real-space cutoff and correspondingly reduce gewald, and to increase the interpolation order, so that you can get away with fewer grid points. however, doing all this while maintaining the desired accuracy is tricky.
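as an illustration only (the numbers below are placeholders, not values i have tested for your system, and in your actual input you would use the tip4p variants of the pair and kspace styles; verify the resulting accuracy, e.g. against the error estimates LAMMPS prints at setup):

  pair_style    lj/long/coul/long long long 10.0        # larger real-space cutoff
  kspace_style  pppm/disp 1.0e-4
  kspace_modify gewald 0.22 gewald/disp 0.18            # smaller splitting parameters
  kspace_modify order 7 order/disp 7                    # higher interpolation order
  kspace_modify mesh 160 160 160 mesh/disp 80 80 80     # hand-picked, coarser grids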

Plus, for the kind of system you are simulating, you have to ask yourself whether using a Kspace style is the best solution at all, or whether it would not be better to use no Kspace and a much larger cutoff.
For the Kspace you need to tune your accuracy so that the dense parts of the system are treated with the correct accuracy. the automatic heuristics of LAMMPS are not much help here, since they adjust the accuracy for the average density, assuming a homogeneous system.
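for pppm/disp there are also kspace_modify options that let you set the dispersion accuracy as absolute RMS force errors for the real-space and kspace parts, instead of relying on the density-based heuristic. a sketch (the values are illustrative, in the force units of your unit system, and have to be chosen for the accuracy you need inside the droplet):

  kspace_modify force/disp/real 0.0001 force/disp/kspace 0.002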

Also, please explain why you are using pppm/disp for this kind of system. that seems counterintuitive to me.

Hi Axel,

Thanks for your explanation. Originally I used an SPC/E Wolf water model with no kspace and a large cutoff, as you alluded to, which ran fine. However, I found that the PPPM dispersion TIP4P/2005 model ran much faster than my Wolf model on homogeneous systems and produced much better water properties / surface tension, so I am trying to see whether I can run this inhomogeneous system with the PPPM dispersion model and still get better performance than with the Wolf model.

I can try adjusting the parameters as you suggested, but I am now thinking I may not get better performance than my wolf model.

Regards,

Eric

there are two things to take into account here:
using pppm for coulomb only, versus using pppm/disp for both coulomb and the sixth-power (dispersion) LJ term.
the benefit of using pppm/disp over pppm is that you can lower the real-space cutoff without reducing accuracy.

if you have a lower average density, there are fewer interactions to compute in real space, while the cost of the reciprocal-space part remains the same.
by switching from pppm/disp to pppm you already reduce the number of 3d FFTs required, at only a moderate additional real-space cost, since most atoms have far fewer neighbors and you can therefore afford a longer cutoff. for LJ, a 12-14 Å cutoff is already very accurate.
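a sketch of what that would look like (the tip4p type indices and geometry arguments are placeholders for whatever your data files actually use):

  pair_style   lj/cut/tip4p/long 1 2 1 1 0.1546 12.0   # 12 Å LJ cutoff, long-range coulomb only
  kspace_style pppm/tip4p 1.0e-5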

with a wolf sum versus pppm, you have to deal with the much larger accuracy loss from cutting off coulomb than from cutting off LJ, owing to the long-range nature of coulomb.
using a wolf sum does not help much with that, because it is most suitable for homogeneous crystalline systems, where you can rely on error cancellation if you damp the coulomb in the right way.

nevertheless, with only a finite-size subsystem requiring coulomb interactions, you may be able to use a much longer than usual cutoff, since only the atoms in the core of your system will have a much larger number of neighbors. thus the error from using a cutoff coulomb, especially with a damping scheme, may not be so bad. on top of that, you end up with a computation that parallelizes much better than pppm, so you can counteract the added cost by just using more MPI processes. with pppm this is limited, since the parallel efficiency drops as you add MPI ranks and the communication cost grows steeply. not to mention that the FFT grids in pppm can only be distributed in 2d, not in 3d.
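if you want to try that route, a minimal sketch along the lines of your earlier SPC/E + wolf setup (alpha and the cutoffs here are guesses that you would have to validate, e.g. against the surface tension):

  pair_style   lj/cut/coul/wolf 0.2 12.0 20.0   # damped, cutoff coulomb: 12 Å LJ cutoff, 20 Å coulomb cutoff
  kspace_style none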