PRD - lost atoms

Hi,

I am having trouble running PRD on multiple nodes. After the first event is detected, the simulation crashes due to lost atoms. Each replica (except the one where the event was detected) reports a different number of lost atoms.

On a single node, the system runs well. On 2 or more nodes, this happens.

The dumped structure looks fine, apart from the missing atoms. Also, I ran many regular MD simulations under the same conditions (system, potential, temperature, timestep) and never lost any atoms.

I have attached my input file. One thing I noticed: the lost atoms are from the species I ignore when checking for an event.

I run with the following command:

mpirun -np 32 ~/lammps-10Feb15/src/lmp_mpi -partition 2x16 -in input.in > out.Zr

Would you have any suggestion on how to solve this issue?

Regards,

Romain

input.in (1.57 KB)

Was each of your “regular” non-PRD runs
also on 16 procs?

Steve

Hi Steve,

24 CPUs, on another cluster.
On this particular cluster, 16 and 32.

Romain

Sorry, not clear on the answer to my Q.

Understood. I’ll get that going.

Romain

Hi Steve,

I confirm. I ran 10^6 MD steps, same binary/machine/structure file, NVE+Langevin 3000K, on 1 (16 CPUs) and 2 (32 CPUs) nodes.
No lost atoms in MD; PRD loses atoms on more than 1 node.

Romain

Does input.in contain everything needed to run this?
Can you check the log files produced by a failing
2-replica run - are the lost atoms at the beginning
of a short run, or in the middle/end of a short run?

Can you post a script that fails as quickly as possible
with the Lost Atoms error (if not input.in)?

Steve
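
For checking where in a run the atoms go missing, a small script along these lines might help. It is only a sketch, and it rests on assumptions not confirmed in this thread: that each replica writes its own log file (names such as log.lammps.0 and log.lammps.1 are assumed here), that the message has the form "Lost atoms: original N current M", and that thermo output rows start with the integer step number. The helper names thermo_step and find_lost_atoms are made up for illustration.

```python
import glob
import re

# Assumed message format: "... Lost atoms: original 4000 current 3998 ..."
LOST = re.compile(r"Lost atoms: original (\d+) current (\d+)")


def thermo_step(line):
    """Return the step number if the line looks like a thermo output row
    (an integer followed only by numeric columns), else None."""
    tokens = line.split()
    if len(tokens) < 2 or not tokens[0].isdigit():
        return None
    try:
        for tok in tokens[1:]:
            float(tok)
    except ValueError:
        return None
    return int(tokens[0])


def find_lost_atoms(log_text):
    """Return (last thermo step seen, original count, current count) for the
    first lost-atoms message in log_text, or None if there is no such message."""
    last_step = None
    for line in log_text.splitlines():
        match = LOST.search(line)
        if match:
            return last_step, int(match.group(1)), int(match.group(2))
        step = thermo_step(line)
        if step is not None:
            last_step = step
    return None


if __name__ == "__main__":
    # Assumed per-replica log names; adjust the pattern to the actual files.
    for path in sorted(glob.glob("log.lammps.*")):
        with open(path) as fh:
            print(path, find_lost_atoms(fh.read()))
```

The step reported is the last thermo row printed before the message, so a smaller thermo interval narrows down when the loss happens. If the message only appears in the screen output rather than in the per-replica logs, the same function can be pointed at that file instead.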