rob,
Hi, everybody. Thanks! Yes, obviously one could use compute rdf and
Fourier transform, but the short (pair + skin) cutoff inherent to this fix
would mean (as noted) one could only get large wavenumbers, and might indeed
produce "crappy" results given the nature of FTs.
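
For reference, the cutoff limitation can be written down with the standard relation between g(r) and S(q) for an isotropic, homogeneous system. The symbols \rho (number density) and r_c (the largest r for which g(r) is available) are introduced here for illustration and are not from the original message:

```latex
S(q) = 1 + 4\pi\rho \int_0^{\infty} r^2 \, [g(r) - 1] \, \frac{\sin(qr)}{qr} \, dr
\quad\longrightarrow\quad
S_{r_c}(q) = 1 + 4\pi\rho \int_0^{r_c} r^2 \, [g(r) - 1] \, \frac{\sin(qr)}{qr} \, dr
```

Truncating the integral at r_c amounts to multiplying g(r) - 1 by a step function, so the result is only trustworthy for wavenumbers of roughly q \gtrsim 2\pi / r_c, and the sharp cutoff adds ripples of about that period in q.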
I did look at dynsf. It looks nice. But I guess I should have been clearer
about our situation. We already have a parallel code that does a nice job
of calculating S(q) in postprocessing, but its utility is limited by the
N^2 scaling. Axel's point about using GPUs is a good suggestion - would that
we had someone who knew CUDA! 
it is easier than you think, and in my personal experience, spending
some time porting a code to GPUs (be it with CUDA, OpenCL or any other
current or future tool) also helps you write better code for
(multi-core) CPUs (with vector units).
Anyway, our parallel code works by loading _all the atoms_ (from LAMMPS
configuration dumps) into common memory, then calculating S(\vec{q}) for
different \vec{q} on different processors, and then gathering all the
S(\vec{q}) to produce S(q). The disadvantage of this approach is that a lot
of configuration dumps are necessary to get good statistics. So I was
wondering whether it would be worthwhile to implement this kind of algorithm
in LAMMPS to calculate and average S(q) on-the-fly as a compute. Once
i don't think it is worth it. your point about the statistics is
missing one important component: statistical relevance. simply
sampling data more often does not necessarily improve statistics. the
individual samples also have to be _statistically relevant_,
preferably completely independent. however, in MD they are not, at least
not when they are sampled in close succession. let's just consider the
two extreme cases:
1) you have long-range structural features in your system. that would
mean that you need to extend the g(r) to long r, but it would *also*
mean that you have strong correlation between time steps, and thus
more frequent collection of data won't improve your results much, if
at all.
2) you don't have long-range structural features. that would make it
more likely you can benefit from on-the-fly analysis, but in this
case, you can use the existing rdf compute. please note that you can
make long r rdf computations more affordable in LAMMPS by using the
"rerun" command to postprocess a previously recorded trajectory: set a
suitably large cutoff and otherwise turn the force computation off.
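
The point about statistical relevance can be illustrated with a small block-averaging sketch. This is not from the original message: the AR(1) series is a synthetic stand-in for an observable sampled at consecutive MD steps, and the function names, `phi`, and `block_size` are assumptions chosen for illustration.

```python
import math
import random


def ar1_series(n, phi, seed=0):
    """Synthetic correlated 'observable': x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def standard_error(values):
    """Standard error of the mean, valid only if the values are independent."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var / n)


def blocked_error(values, block_size):
    """Standard error estimated from the means of consecutive blocks.

    If the blocks are much longer than the correlation time, the block
    means are close to independent and this estimate is more realistic.
    """
    n_blocks = len(values) // block_size
    means = [
        sum(values[b * block_size:(b + 1) * block_size]) / block_size
        for b in range(n_blocks)
    ]
    return standard_error(means)


if __name__ == "__main__":
    samples = ar1_series(20000, phi=0.95)  # strongly correlated samples
    naive = standard_error(samples)
    blocked = blocked_error(samples, block_size=500)
    print(f"naive error estimate   : {naive:.4f}")
    print(f"blocked error estimate : {blocked:.4f}")
    # The squared ratio estimates how many successive samples
    # carry the information of one independent sample.
    print(f"samples per independent sample ~ {(blocked / naive) ** 2:.1f}")
```

For strongly correlated data the blocked estimate comes out well above the naive one, which is the sense in which sampling more often adds little: the number of effectively independent samples grows much more slowly than the number of frames.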
complication is (I think - correct me if I'm wrong) each processor would
need to be able to access the position of each atom in order to get low
wavenumbers. I'm not sure how easy this is to do within the current LAMMPS,
i.e. do any fixes/computes gather all atom positions together onto each
processor? Am I missing something?
no, you would have to do communication to collect position data onto a
single processor. this is required, for example, for some dump file
formats and for interfaces like USER-COLVARS (which runs in serial, but
benefits from the fact that usually only a very small number of atoms
participate in a collective variable), and it can affect parallel
scaling.
Does anyone have a good idea of where to start for writing a compute to
compute S(q) in parallel within LAMMPS? (directly, including all particle
pairs). I.e. would this need to be done "from scratch"?
probably. the challenge is to parallelize it and to make it scale with
system size. thus, to make it palatable, you would have to set up a
ring buffer scheme (for which code exists in LAMMPS as it is used by
shake and fix rigid/small, IIRC): basically have each
processor deal with all pairs involving its local atoms and then have
a ring of buffers with the local atoms of all other subdomains that
you circulate around, taking care not to double count pairs. certainly
doable, but most certainly also a project for a person that *does*
want to dig deeper into the intestines of LAMMPS.
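
The ring scheme described above can be sketched in plain Python. This is a serial simulation only, under stated assumptions: it uses no LAMMPS or MPI calls, the "ranks" are list entries, the buffer rotation stands in for send/receive between neighbors, and all function names are hypothetical. It accumulates S(\vec{q}) = 1 + (2/N) \sum_{j<k} \cos(\vec{q} \cdot (\vec{r}_j - \vec{r}_k)) and skips half of the ring shifts so that no pair is counted twice.

```python
import math
import random


def _cos_phase(q, rj, rk):
    """cos(q . (rj - rk)) for one pair of atoms."""
    return math.cos(sum(qi * (a - b) for qi, a, b in zip(q, rj, rk)))


def pair_sum_within(q, atoms):
    """Sum over all unordered pairs j < k inside one set of atoms."""
    return sum(
        _cos_phase(q, atoms[j], atoms[k])
        for j in range(len(atoms))
        for k in range(j + 1, len(atoms))
    )


def pair_sum_between(q, atoms_a, atoms_b):
    """Sum over all pairs with one atom from each of two disjoint sets."""
    return sum(_cos_phase(q, rj, rk) for rj in atoms_a for rk in atoms_b)


def ring_structure_factor(q, subdomains):
    """S(q) from per-rank partial sums, simulating a ring of buffers."""
    n_ranks = len(subdomains)
    n_atoms = sum(len(s) for s in subdomains)
    # Step 1: every rank handles the pairs among its own local atoms.
    partial = [pair_sum_within(q, local) for local in subdomains]
    # Step 2: circulate copies of the local atoms around the ring.
    buffers = list(subdomains)
    for shift in range(1, n_ranks // 2 + 1):
        # Each rank receives the buffer held by its predecessor
        # (stand-in for a neighbor send/receive).
        buffers = [buffers[(r - 1) % n_ranks] for r in range(n_ranks)]
        for r in range(n_ranks):
            origin = (r - shift) % n_ranks  # rank that owns this buffer
            # Only half the ring is traversed, so each rank pair meets once,
            # except at the halfway shift of an even ring, where both ranks
            # see each other: let only the lower rank do the work.
            if 2 * shift == n_ranks and r > origin:
                continue
            partial[r] += pair_sum_between(q, subdomains[r], buffers[r])
    total = sum(partial)  # stand-in for a global reduction
    return 1.0 + 2.0 * total / n_atoms


if __name__ == "__main__":
    rng = random.Random(1)
    atoms = [tuple(rng.uniform(0.0, 10.0) for _ in range(3)) for _ in range(60)]
    q = (2.0 * math.pi / 10.0, 0.0, 0.0)
    direct = 1.0 + 2.0 * pair_sum_within(q, atoms) / len(atoms)
    for n_ranks in (1, 2, 3, 4, 5):
        subdomains = [atoms[r::n_ranks] for r in range(n_ranks)]
        print(n_ranks, ring_structure_factor(q, subdomains), direct)
```

Each rank only ever holds its own atoms plus one circulating buffer, so no processor needs all positions at once. The total work is still N^2 in the number of atoms; the ring only distributes it.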
in short:
- it would be more convenient and a useful addition to LAMMPS' analysis features
- it would be quite a bit of programming/debugging/LAMMPS-learning work
- it may not have the effect you are looking for
axel.
p.s.: if you can get your hands on a copy of the paper below (i don't
think i have one left, and there is no online access beyond 1998), you
can see that i've spent some time on this kind of subject back in the
time when men were *real* men, computers were *real* computers and
little fluffy beings from alpha centauri were *real* little fluffy
beings from alpha centauri (and i had just started as a grad student).
Long-range structures in bulk water, A. Kohlmeyer, W. Witschel, E.
Spohr, Z. Naturforsch. 52a, 432-434 (1997).