[lammps-users] single precision fftw causes 'nan' in long range kspace energy


I am trying to run LAMMPS compiled with the single-precision FFTW (2.1.5) libraries on the Cray XT3. However, when I do this I get 'nan' for the long-range kspace energy on the first step and the simulation fails (please see the log file at the end of this email for simulation parameters and output). I am changing FFT_PRECISION to 1 in fft3d.h. Some things I have tried:

1. I have tried this with both the apoa1 benchmark system and my own protein system. In both cases when LAMMPS is compiled with double-precision fftw libraries it runs fine but with sfftw it fails.
2. I have tried compiling with both the pgi (6.1.4) and gcc (4.1.2) compilers, with all gcc optimization flags turned off, and with 'pair_modify table 0', all with the same result.
3. I also tried this on my Intel Core2 desktop with fftw and gcc; again, same result.

Anyone have any insights?


P.S. The reason I am interested in using single precision is that, in order to use grid spacings of 1 angstrom, the de facto standard for PME in biosimulations, I need to use a tolerance of 1e-6 for pppm. However, this results in nearly half of the calculation being devoted to kspace, and a significant portion of that is in the FFT:

Pair  time (%) = 79.8051 (51.279)
Bond  time (%) = 0.148467 (0.0953981)
Kspce time (%) = 65.5463 (42.1169)
Neigh time (%) = 4.3323 (2.78373)
Comm  time (%) = 2.11653 (1.35998)
Outpt time (%) = 1.08976 (0.700226)
Other time (%) = 2.5909 (1.66479)

FFT time (% of Kspce) = 41.7554 (63.7038)
FFT Gflps 3d 1d-only = 6.02324 45.9326

I notice that generally LAMMPS takes pppm 1e-4 as the accuracy standard, but in this case that results in a grid that is spaced at over 2 angstroms along each coordinate direction. Can pppm use a grid spacing that is 2 to 2.5 times more sparse than PME and achieve similar accuracy? I plan to look into this more closely, but I thought I would mention it here since, at a pppm tolerance of 1e-4, the FFT doesn't take up much of the calculation, hence there would be little benefit to using the single precision fft.
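
For reference, the "over 2 angstroms" figure can be checked from the box bounds and PPPM grid printed in the log at the end of this email (the pppm 1e-4 run). The following is only a small arithmetic sketch; the helper name grid_spacing is made up for illustration and is not part of LAMMPS.

```python
# Box bounds and PPPM grid copied from the log in this email (pppm 1e-4 run).
box_lo = (-114.746, -36.1536, -31.2315)
box_hi = (80.5355, 37.0646, 58.1215)
grid = (72, 32, 36)


def grid_spacing(lo, hi, n):
    """Return the mesh spacing along each axis: box length / grid points."""
    return tuple((h - l) / k for l, h, k in zip(lo, hi, n))


spacing = grid_spacing(box_lo, box_hi, grid)
for axis, h in zip("xyz", spacing):
    print(f"{axis}: {h:.2f} angstrom")
```

This gives roughly 2.7, 2.3 and 2.5 angstroms along x, y and z, i.e. 2 to 2.5 times (or more) coarser than a 1 angstrom PME grid.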

LAMMPS (22 Jun 2007)
# Created by charmm2lammps v1.8.1 on Mon Oct 15 13:17:02 EDT 2007

units real
neigh_modify delay 5 every 1

atom_style full
bond_style harmonic
angle_style charmm
dihedral_style charmm
improper_style harmonic

pair_style lj/charmm/coul/long 10 12
pair_modify mix arithmetic
kspace_style pppm 1e-4

read_data damp2_all22.data
  4 = max bonds/atom
  6 = max angles/atom
  18 = max dihedrals/atom
  2 = max impropers/atom
  orthogonal box = (-114.746 -36.1536 -31.2315) to (80.5355 37.0646 58.1215)
  8 by 2 by 2 processor grid
  128737 atoms
  88092 bonds
  53273 angles
  19378 dihedrals
  1276 impropers
  4 = max # of 1-2 neighbors
  9 = max # of 1-3 neighbors
  19 = max # of 1-4 neighbors
  21 = max # of special neighbors

special_bonds charmm
fix 1 all nve
# constrain all hydrogens and TIP3 angle (angle type 146)
#fix 2 all shake 1e-8 500 0 m 1.0 a 146
velocity all create 0.0 12345678 dist uniform

thermo 20
thermo_style custom cpu step etotal ke temp pe ebond eangle edihed eimp evdwl ecoul elong press
#thermo_style multi
timestep 1.0

restart 500 damp2_all22.restart1 damp2_all22.restart2
dump 1 all atom 500 damp2_all22.dump
dump_modify 1 image yes scale yes

run 500
PPPM initialization ...
  G vector = 0.214308
  grid = 72 32 36
  RMS precision = 8.25641e-05
  brick FFT buffer size/proc = 6762 4608 4347
Memory usage per processor = 39.2199 Mbytes
CPU Step TotEng KinEng Temp PotEng E_bond E_angle E_dihed E_impro E_vdwl E_coul E_long Press
           0 0 nan 0 0 nan 1440.6853 3966.9177 2064.8419 241.54002 39358.088 1259009.4 nan nan
ERROR: Out of range atoms - cannot compute PPPM

I've never tried to run/build LAMMPS with single-precision FFTs
so can't help you there. I don't think it's a good idea if
you want 1.0e-6 accuracy since single-precision only carries
7 or 8 digits of accuracy. So it seems like you'd be close to
the edge. Paul may want to comment on PPPM vs PME
accuracy. But the general rule-of-thumb is don't worry about
the grid spacing. Rather trust the accuracy criterion you
set. If you want 1.0e-4 then PPPM in LAMMPS will choose
an appropriate grid spacing, whatever it ends up to be.
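
To make the "7 or 8 digits" point concrete, here is a rough sketch that rounds one number to IEEE single precision and back. The value is the E_coul entry from the log above, used purely as an example magnitude; to_float32 is a hypothetical helper. It shows storage rounding only, not the error accumulated through an actual FFT.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


e_coul = 1259009.4             # E_coul from the log, as an example magnitude
e_single = to_float32(e_coul)  # same value after a trip through 32-bit storage
rel_err = abs(e_single - e_coul) / abs(e_coul)
eps32 = 2.0 ** -23             # spacing of single-precision numbers just above 1.0

print(e_single, rel_err, eps32)
```

The relative rounding error of a single stored value is bounded by about 6e-8, so a 1.0e-6 accuracy target leaves only one or two decimal digits of headroom before any arithmetic is done.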


Just thought of an issue with single-precision FFTs from LAMMPS.
The FFT package (fft3d.cpp) doesn't own the data, it just
provides an interface to the FFTW library. So the storage
of all the data is in LAMMPS and is still double precision,
including how it stores complex datums. So you would have
to change data storage in LAMMPS if you expect FFTs with
floats to work instead of doubles.
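
A minimal sketch of the kind of mismatch described above: a buffer written as doubles but read as if it held floats. This is not LAMMPS or FFTW code; the values are made up, and a little-endian layout is assumed so the result is deterministic. Whether this is what actually produces the 'nan' in this thread has not been confirmed.

```python
import struct

# Two complex numbers stored as interleaved (re, im) doubles, the way the
# caller is described as storing them: (1.0 + 0.0i) and (0.5 - 0.25i).
as_doubles = (1.0, 0.0, 0.5, -0.25)
raw = struct.pack("<4d", *as_doubles)   # 4 doubles = 32 bytes

# A single-precision library handed the same memory would interpret those
# 32 bytes as 8 floats instead of 4 doubles.
as_floats = struct.unpack("<8f", raw)

print(as_doubles)
print(as_floats)
```

The first double, 1.0, comes back as the two floats 0.0 and 1.875, so both the values and the element count are wrong. That is consistent with the point above that the data storage on the LAMMPS side would have to change before float FFTs could work.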


Thanks Steve. Your points are well-taken. Actually, the bigger picture is that I am trying to make an apples-to-apples comparison of LAMMPS to NAMD for biomolecular simulation, inasmuch as that is possible, and NAMD uses single-precision FFTs. So I am looking to compare performance for a given level of accuracy. NAMD does not report PME accuracy directly, but I believe that I will need close to 1e-6 accuracy to approach the accuracy of PME in NAMD with a 1 angstrom grid.

Incidentally, after I wrote my original email I discovered that if I use 'kspace_style ewald' instead of pppm then the single-precision FFT works. I will dig into the code as you suggest, but if anyone has experience or comments to share (on any of these matters) they are welcome.


Steve Plimpton wrote:

Ewald does not use FFTs, so that's not an indication
anything works, unfortunately.


For a good discussion on PPPM vs PME accuracy, see:

How to mesh up Ewald sums. I. A theoretical and numerical comparison
of various particle mesh routines (and part II), Markus Deserno and
Christian Holm,
Journal of Chemical Physics, Vol 109, Number 18, pages 7678 - 7701 (1998).

Basically, they argue that PPPM is superior to PME, but that the two are very similar.
The differences are subtle, and many parts of the algorithms are