[lammps-users] NaN on 8 processors, but not 4. MPICH-2 related?

We're running some fairly modest 25,000-atom Poiseuille flow simulations
on 8 processors and see them crash frequently with no LAMMPS-specific
error, just the MPI message "rank x in job y caused collective abort of
all ranks".
What's peculiar is that I can start a simulation and have it run fine on
4 processors, but the same simulation will give NaN for the pressure at
t=0 if I use 8 processors. It doesn't matter if the 8 processors are on
the same compute node or different compute nodes. In addition, this
happens on all of our clusters, which use MPICH-2, but not on a cluster
at another site that runs the same code and input files with HP-MPI.

Has anyone seen this before? Could it be related to the MPI library?
What else could be causing it?


