trivial PPPM optimization

Hello list,
I'm running simulations with long range electrostatics, where a large
fraction of my atoms are actually neutral. It occurred to me that in
PPPM::make_rho(), where the diffuse charge stencil is applied to the
FFT grid (with a triply nested loop) a lot of work is wasted on
neutral particles.
I propose adding a simple check along the lines of

--- a/src/KSPACE/pppm.cpp
+++ b/src/KSPACE/pppm.cpp
@@ -1923,6 +1923,8 @@ void PPPM::make_rho()

   for (int i = 0; i < nlocal; i++) {

+ if( q[i] == 0.0 ) continue;
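
To make the idea concrete outside of LAMMPS, here is a minimal sketch of the same early exit in a toy 1D charge-spreading loop. The function name, the 3-point stencil and its weights are made up for illustration and are not taken from pppm.cpp; the only point is that a particle with q == 0 adds nothing to the grid, so its stencil loop can be skipped.

```cpp
#include <cstddef>
#include <vector>

// Toy 1D stand-in for a charge-spreading loop (hypothetical, not pppm.cpp):
// every particle deposits its charge onto a 3-point stencil around its cell.
std::vector<double> spread_charge(const std::vector<double>& q,
                                  const std::vector<int>& cell,
                                  int ngrid,
                                  std::size_t* visited = nullptr) {
    std::vector<double> rho(ngrid, 0.0);
    const double w[3] = {0.25, 0.5, 0.25};  // illustrative stencil weights
    std::size_t count = 0;
    for (std::size_t i = 0; i < q.size(); ++i) {
        if (q[i] == 0.0) continue;  // neutral particle: nothing to deposit
        ++count;
        for (int k = -1; k <= 1; ++k) {
            int g = ((cell[i] + k) % ngrid + ngrid) % ngrid;  // periodic wrap
            rho[g] += q[i] * w[k + 1];
        }
    }
    if (visited) *visited = count;  // number of particles that ran the stencil
    return rho;
}
```

The grid comes out the same with or without the check; only the number of stencil loops executed changes.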

This already exists in an even better form: it is called pppm/cg.
Axel

Another optimization for such a system is to use pair style hybrid/overlay with coul/long and then crank up the real space cutoff. For maximum effect you also need the multi neighbor style and the multi communication style.

I see. Thanks for the pointer to pppm/cg. Indeed it is even better to
forgo the check by keeping a table of charged particles. However, it is
not the best choice for my system (I have a ceramic with non-neutral
atoms, and occasionally I insert a crap ton of neutral test particles
which are quickly removed). I'm already using hybrid coul/long, but I
assume cranking up the real space cutoff is only a good solution if
your charged particle density is low (which it is not in my case).
I'll just keep my patch in my local version. It seems that it would
not be of much help to anyone else (though I'm curious at which
ratio of charged to uncharged particles the simple if would
start giving a benefit; my guess is that one in a few thousand is
already enough).

I see. Thanks for the pointer to pppm/cg. Indeed it is even better to
forgo the check by keeping a table of charged particles. However, it is
not the best choice for my system (I have a ceramic with non-neutral
atoms, and occasionally I insert a crap ton of neutral test particles
which are quickly removed). I'm already using hybrid coul/long, but I

what has that to do with pppm/cg? it should work just fine. the table
is updated at the beginning of ::compute() so it doesn't make a
difference whether you add a ton of particles or not. you don't do it
*inside* of kspace->compute(), or do you?
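
As a rough illustration of the table approach described here, the sketch below rebuilds a list of charged-particle indices from scratch on every call, so particles inserted or removed between calls are picked up automatically. The helper names are hypothetical and this has not been checked against the actual pppm/cg source; it only shows why rebuilding at the start of compute() makes insertion and removal harmless.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: collect the indices of all charged particles.
// Meant to be called once at the start of each compute step.
std::vector<std::size_t> build_charged_list(const std::vector<double>& q) {
    std::vector<std::size_t> charged;
    for (std::size_t i = 0; i < q.size(); ++i)
        if (q[i] != 0.0) charged.push_back(i);
    return charged;
}

// Later loops iterate over the list only and never touch neutral particles.
double total_charge(const std::vector<double>& q,
                    const std::vector<std::size_t>& charged) {
    double sum = 0.0;
    for (std::size_t j : charged) sum += q[j];
    return sum;
}
```

Because the list is rebuilt each step, its cost is one pass over the charge array, after which every grid-related loop runs over charged particles only.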

assume cranking up the real space cutoff is only a good solution, if
your charged particle density is low (which it is not in my case).

this is an adjustable parameter that depends a lot on how many
processors you use. pppm scales worse the more processors you use and
the larger your system is, because it has to redistribute the density
grid across the entire system to be able to parallelize the FFT. so
now you have real space, which parallelizes very well but whose cost
grows with the cube of the cutoff, and kspace, which scales
O(N*log(N)) but doesn't parallelize so well. so this can always make
an impact. the hybrid/overlay method helps when you can identify the
charged particles by atom type and thus trim down the neighbor list
massively (hence the use of the multi neighborlist style and the multi
comm style).
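
To put a number on the real space side of this trade-off: at uniform density the neighbor count per atom grows with the volume of the cutoff sphere, i.e. with the cube of the cutoff. The function below is only a back-of-the-envelope estimate under that uniform-density assumption; its name and parameters are made up for illustration.

```cpp
#include <cmath>

// Estimated number of neighbors inside a spherical cutoff rc, assuming a
// uniform number density (particles per unit volume). Illustrative only.
double neighbors_in_cutoff(double number_density, double rc) {
    const double pi = std::acos(-1.0);
    return number_density * (4.0 / 3.0) * pi * rc * rc * rc;
}
```

Doubling the cutoff therefore means roughly eight times as many pairs per atom, which is why restricting the long cutoff to the charged atom types matters so much.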

I'll just keep my patch in my local version. It seems that it would
not be of too much help for anyone else (even though I'm curious with
which ratio of charged to uncharged particles the simple if would
start giving a benefit. My guess is that one in a few thousand is
already enough).

if you insist on rolling your own, then you should at least also add
the test to ::fieldforce_ik()/_ad(), since there you also loop over
all particles, but the resulting force only applies to charged
particles.
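
The same reasoning can be sketched for the force interpolation step: the field gathered from the grid is multiplied by the particle's charge, so for q == 0 the gather is wasted work. The code below is a toy 1D version with a made-up function name and stencil, not the actual fieldforce_ik()/_ad() code.

```cpp
#include <cstddef>
#include <vector>

// Toy 1D force interpolation (hypothetical, not pppm.cpp): gather the field
// from a 3-point stencil and add q * E to the particle's force.
void add_kspace_force(const std::vector<double>& q,
                      const std::vector<int>& cell,
                      const std::vector<double>& efield,
                      std::vector<double>& f) {
    const double w[3] = {0.25, 0.5, 0.25};  // illustrative stencil weights
    const int ngrid = static_cast<int>(efield.size());
    for (std::size_t i = 0; i < q.size(); ++i) {
        if (q[i] == 0.0) continue;  // force would be zero, skip the gather
        double ex = 0.0;
        for (int k = -1; k <= 1; ++k) {
            int g = ((cell[i] + k) % ngrid + ngrid) % ngrid;  // periodic wrap
            ex += w[k + 1] * efield[g];
        }
        f[i] += q[i] * ex;
    }
}
```

Forces already accumulated on neutral particles from other interactions are left untouched.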

axel.