I am trying to run LAMMPS compiled with the single-precision FFTW (2.1.5) libraries on the Cray XT3. However, when I do this I get 'nan' for the long-range kspace energy on the first step and the simulation fails (please see the log file at the end of this email for simulation parameters and output). To select single precision I am changing FFT_PRECISION to 1 in fft3d.h. Some things I have tried:
1. I have tried this with both the apoa1 benchmark system and my own protein system. In both cases when LAMMPS is compiled with double-precision fftw libraries it runs fine but with sfftw it fails.
2. I have tried compiling with both the pgi (6.1.4) and gcc (4.1.2) compilers, with all gcc optimization flags turned off, and with 'pair_modify table 0', all with the same result.
3. I also tried this on my Intel Core2 desktop with fftw and gcc; again, same result.
Anyone have any insights?
P.S. The reason I am interested in single precision is that, in order to use a grid spacing of 1 angstrom, the de facto standard for PME in biosimulations, I need to use a tolerance of 1e-6 for pppm. However, this results in about 42% of the run time being devoted to the kspace calculation, and a significant portion of that is in the FFT:
Pair time (%) = 79.8051 (51.279)
Bond time (%) = 0.148467 (0.0953981)
Kspce time (%) = 65.5463 (42.1169)
Neigh time (%) = 4.3323 (2.78373)
Comm time (%) = 2.11653 (1.35998)
Outpt time (%) = 1.08976 (0.700226)
Other time (%) = 2.5909 (1.66479)
FFT time (% of Kspce) = 41.7554 (63.7038)
FFT Gflps 3d 1d-only = 6.02324 45.9326
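To put the FFT cost in terms of the whole run, here is a small sketch of the arithmetic using only the numbers from the breakdown above. The variable names are mine, and the total run time is inferred from the Kspce line, not read from the log. It works out to roughly 27% of the total run time spent in the FFT.

```python
# Timings copied from the LAMMPS breakdown above (seconds and percent).
kspace_time = 65.5463          # "Kspce time"
kspace_pct_of_total = 42.1169  # Kspce as a percentage of total run time
fft_time = 41.7554             # "FFT time"

# Total run time is inferred from the Kspce line (an assumption; the
# loop time itself is not quoted in this email).
total_time = kspace_time / (kspace_pct_of_total / 100.0)

# FFT share of the whole run, not just of the kspace part.
fft_pct_of_total = 100.0 * fft_time / total_time

print(f"inferred total time = {total_time:.1f} s")
print(f"FFT share of total  = {fft_pct_of_total:.1f} %")
```

This is only an upper bound on what a faster FFT could recover; it says nothing about how much faster the single-precision FFT actually is.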
I notice that generally LAMMPS takes pppm 1e-4 as the accuracy standard, but in this case that results in a grid that is spaced at over 2 angstroms along each coordinate direction. Can pppm use a grid spacing that is 2 to 2.5 times more sparse than PME and achieve similar accuracy? I plan to look into this more closely, but I thought I would mention it here since, at a pppm tolerance of 1e-4, the FFT doesn't take up much of the calculation, hence there would be little benefit to using the single precision fft.
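A quick check of that spacing, using the box bounds and PPPM grid reported in the log below. This is just box length divided by grid points per dimension; the variable names are mine.

```python
# Values copied from the log below.
box_lo = (-114.746, -36.1536, -31.2315)  # "orthogonal box" lower corner (angstroms)
box_hi = (80.5355, 37.0646, 58.1215)     # "orthogonal box" upper corner (angstroms)
grid = (72, 32, 36)                      # "grid = 72 32 36" from PPPM initialization

# Grid spacing per dimension = box length / number of grid points.
spacing = [(hi - lo) / n for lo, hi, n in zip(box_lo, box_hi, grid)]

for axis, s in zip("xyz", spacing):
    print(f"{axis}: {s:.2f} angstroms")
```

This gives about 2.7, 2.3, and 2.5 angstroms in x, y, and z at the 1e-4 tolerance.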
LAMMPS (22 Jun 2007)
# Created by charmm2lammps v1.8.1 on Mon Oct 15 13:17:02 EDT 2007
neigh_modify delay 5 every 1
pair_style lj/charmm/coul/long 10 12
pair_modify mix arithmetic
kspace_style pppm 1e-4
4 = max bonds/atom
6 = max angles/atom
18 = max dihedrals/atom
2 = max impropers/atom
orthogonal box = (-114.746 -36.1536 -31.2315) to (80.5355 37.0646 58.1215)
8 by 2 by 2 processor grid
4 = max # of 1-2 neighbors
9 = max # of 1-3 neighbors
19 = max # of 1-4 neighbors
21 = max # of special neighbors
fix 1 all nve
# constrain all hydrogens and TIP3 angle (angle type 146)
#fix 2 all shake 1e-8 500 0 m 1.0 a 146
velocity all create 0.0 12345678 dist uniform
thermo_style custom cpu step etotal ke temp pe ebond eangle edihed eimp evdwl ecoul elong press
restart 500 damp2_all22.restart1 damp2_all22.restart2
dump 1 all atom 500 damp2_all22.dump
dump_modify 1 image yes scale yes
PPPM initialization ...
G vector = 0.214308
grid = 72 32 36
RMS precision = 8.25641e-05
brick FFT buffer size/proc = 6762 4608 4347
Memory usage per processor = 39.2199 Mbytes
CPU Step TotEng KinEng Temp PotEng E_bond E_angle E_dihed E_impro E_vdwl E_coul E_long Press
0 0 nan 0 0 nan 1440.6853 3966.9177 2064.8419 241.54002 39358.088 1259009.4 nan nan
ERROR: Out of range atoms - cannot compute PPPM