Dump atom/mpiio not working, also lost atom problem

Hi all,

I have built LAMMPS on an IBM Blue Gene/Q massively parallel processing supercomputer, but when I run it I get two problems.

First, the dump files produced by ‘dump all atom’ and ‘dump all atom/mpiio’ are incomplete and have lots of zeros in them, as if only one core were writing to the file (MPI-IO not working). However, the xyz/mpiio dump style did seem to work.

Second, I sometimes get a lost atoms error:

‘ERROR: Lost atoms: original 6231 current 6227’

The same script has been run on another cluster (Intel Xeon E5-2695 v2) with the same LAMMPS build without this problem, and the energy and pressure are about the same on the two clusters.

Both r13864 and 17Nov16 were installed and have the same error.

Thank you for your help,

Lijie

Hi all,

I have built LAMMPS on an IBM Blue Gene/Q massively parallel
processing supercomputer, but when I run it I get two problems.

First, the dump files produced by 'dump all atom' and 'dump all
atom/mpiio' are incomplete and have lots of zeros in them, as if only
one core were writing to the file (MPI-IO not working). However, the
xyz/mpiio dump style did seem to work.

if you get corrupted files with dump all atom, then this is more likely
an issue with the file system than with LAMMPS. are you certain that you
don't have multiple jobs running at the same time and writing to the
same files?

i would suggest you contact your HPC staff and investigate.
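
One way to rule out two jobs writing to the same dump file is to put a per-run tag in the file names. The lines below are only a sketch: the variable name `tag`, the dump IDs, the output interval of 1000 steps, and the file names are all assumptions, not taken from the original script.

```
# Hypothetical per-run tag; override it on the command line with "-var tag <name>"
variable tag index run1

# Assumed dump IDs, interval, and file names, for illustration only
dump d_atom  all atom       1000 dump.${tag}.lammpstrj
dump d_mpiio all atom/mpiio 1000 dump.${tag}.atom.mpiio
```

If each job is launched with a different `-var tag` value, no two jobs can share a dump file, so any remaining corruption would point more strongly at the file system or the MPI-IO layer.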

Second, I sometimes get a lost atoms error:

'ERROR: Lost atoms: original 6231 current 6227'

The same script has been run on another cluster (Intel Xeon E5-2695 v2) with
the same LAMMPS build without this problem, and the energy and pressure are
about the same on the two clusters.

that doesn't mean anything. with the BG/Q you are likely using more MPI
ranks and thus have a different domain decomposition. in one case, your
communication cutoff is sufficient, in the other it is not. you may still
have bad dynamics. whether or not this is the case is very difficult to
say remotely and without knowing any details.
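
A minimal sketch of input-script settings that could help test the decomposition hypothesis is shown here. The numeric values are placeholders and must be adapted to the units and pair cutoff of the actual system, which are not given in this thread.

```
# Rebuild neighbor lists as soon as any atom has moved far enough
neigh_modify every 1 delay 0 check yes

# Assumed ghost-atom communication cutoff in distance units (placeholder value);
# it only takes effect if it is larger than the default (pair cutoff + skin)
comm_modify cutoff 14.0

# For diagnosis only: warn about lost atoms instead of stopping the run
thermo_modify lost warn flush yes
thermo 100
```

If the lost atoms go away with a larger communication cutoff, the cutoff was likely too short for the finer decomposition. If atoms are still lost, bad dynamics (for example, a too-large timestep or overlapping atoms) become the more likely cause. Note that `thermo_modify` settings are reset by a later `thermo_style` command, so it should come after any such command.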

axel.