lammps_scatter_atoms: lost atoms with increasing MPI processes

Stephen_Cox · May 16, 2017, 4:25pm

Hi,

I have written a code that uses LAMMPS as a library. I am having issues with lost atoms when I use more than 12 cores. If I take the initial configuration and run LAMMPS normally on 40 cores, I don’t find any issues. The relevant part of my code is:

for (int step = 0; step < K+1; step++)
{
    array_to_vector(step, pos_current_array, pos_vector, 3*N, 1.0);
    array_to_vector(step, vel_current_array, vel_vector, 3*N, 1.0);

    boxlo[0] = volume_current_array[step][0];
    boxhi[0] = volume_current_array[step][1];
    boxlo[1] = volume_current_array[step][2];
    boxhi[1] = volume_current_array[step][3];
    boxlo[2] = volume_current_array[step][4];
    boxhi[2] = volume_current_array[step][5];

    lammps_reset_box(lmp, boxlo, boxhi, 0.0, 0.0, 0.0);

    lammps_scatter_atoms(lmp, const_cast<char *>("x"), 1, 3, pos_vector);
    lammps_scatter_atoms(lmp, const_cast<char *>("v"), 1, 3, vel_vector);

    lmp->input->one(const_cast<char *>("run 0 post no"));

    …
}

And the error I’m getting from LAMMPS is:

thermo_style custom step temp pe ke enthalpy press xlo xhi ylo yhi zlo zhi
2 by 4 by 4 MPI processor grid
run 0 post no
Neighbor list info …
update every 1 steps, delay 1 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 5.3065
ghost atom cutoff = 5.3065
binsize = 2.65325, bins = 19 19 19
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair sw, perpetual
attributes: full, newton on
pair build: full/bin/atomonly
stencil: full/bin/3d
bin: standard
Per MPI rank memory allocation (min/avg/max) = 2.724 | 2.737 | 2.856 Mbytes
Step Temp PotEng KinEng Enthalpy Press Xlo Xhi Ylo Yhi Zlo Zhi
ERROR: Lost atoms: original 4000 current 2253 (…/thermo.cpp:433)
Last command: run 0 post no

I should mention that I’m scattering/gathering atoms from an NPT simulation. I initially read in a data file with the atom IDs ordered consecutively, and I set atom_modify map array before reading in this data file.

I’m grateful for any help.

Steve

sjplimp · May 17, 2017, 2:19pm

I don’t really understand what you are trying to do. It looks like you are scattering coords back into LAMMPS. Does that mean you have changed the atom coords? If so, then obviously you could cause atoms to be lost if you do that incorrectly.

On the LAMMPS side, atoms are detected as lost only on steps when thermo info is output. So to debug, I suggest that you print thermo output every step, and also that you try reneighboring with varying frequency. E.g. if you still lose atoms when you reneighbor every step and print thermo every step, that gives you a clue as to what to investigate.
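
A minimal sketch of that debugging setup, written as a helper that builds the two LAMMPS input commands so they can be passed to the library one at a time. The function name debug_commands and its parameter are hypothetical and appear nowhere in this thread; thermo and neigh_modify are standard LAMMPS commands, and "check no" is chosen here so that the neighbor list is rebuilt unconditionally at the requested interval.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of LAMMPS): returns input commands that
// print thermo output every step and force a neighbor-list rebuild every
// `reneigh_every` steps.
std::vector<std::string> debug_commands(int reneigh_every)
{
    std::vector<std::string> cmds;
    // Lost atoms are only detected when thermo output is produced.
    cmds.push_back("thermo 1");
    // delay 0 + check no: rebuild every `reneigh_every` steps, unconditionally.
    cmds.push_back("neigh_modify every " + std::to_string(reneigh_every) +
                   " delay 0 check no");
    return cmds;
}

// Intended use with an existing LAMMPS instance `lmp` (assumed, not shown):
//   for (const std::string &c : debug_commands(1))
//       lmp->input->one(const_cast<char *>(c.c_str()));
```

Calling it with 1, then with larger values such as 5 or 10, gives the "varying frequency" comparison suggested above.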

Steve

Stephen_Cox · May 17, 2017, 3:13pm

Hi Steve,

Thanks for your reply.

I’m trying to implement a transition path sampling algorithm. I have a trajectory generated by LAMMPS, with the configuration at every 200th time step stored in pos_current_array (positions) and vel_current_array (velocities). Later in the program, I will select one of these saved configurations, propagate the dynamics, and accept or reject the new trajectory based on some criteria. The loop in my original post is part of the initialization, where I’m looping over the saved configurations of the original trajectory. The next lines of the code are:

double ke_k, pe_k, enth_k, press_k, temperature_k;

lmp->output->thermo->evaluate_keyword(const_cast<char *>("ke"), &ke_k);
lmp->output->thermo->evaluate_keyword(const_cast<char *>("pe"), &pe_k);
lmp->output->thermo->evaluate_keyword(const_cast<char *>("enthalpy"), &enth_k);
lmp->output->thermo->evaluate_keyword(const_cast<char *>("press"), &press_k);
lmp->output->thermo->evaluate_keyword(const_cast<char *>("temp"), &temperature_k);

This must be where the atoms are being detected as lost.

sjplimp · May 17, 2017, 3:38pm

ok - that makes more sense.

I am guessing the issue is the following. When you invoke scatter_atoms() you are asking a new set of coords (in your case) to be assigned to atoms that currently have different (old) coords, and are thus owned by procs that correspond to the old coords.

So if your new coords are dramatically different than the old coords (because they were from some alternate trajectory that you are now swapping to), then some procs may now own atoms that are dramatically outside their domain. When LAMMPS starts a new run (new neigh list), it will first try to migrate those atoms to the correct procs. But that can fail if the new proc is far away, resulting in lost atoms.

The lib-interface function lammps_create_atoms() was designed for a use case more like yours. It will actually figure out which proc the new coords belong to and assign them to that proc initially, instead of to the proc that currently owns it (based on the old coords).

Note that if the interface to lammps_create_atoms() doesn’t include all the atom properties you want to “restore”, you would need to use lammps_scatter_atoms() on those extra properties after using lammps_create_atoms().
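
To make the ownership argument concrete, here is a small standalone sketch that maps a coordinate to the sub-domain containing it and counts how many sub-domains apart the old and new positions of one atom are. It assumes an orthogonal periodic box split into equal bricks; the function names owning_cell and max_hops, the 50 x 50 x 50 box, and the sample coordinates are all hypothetical. Only the 2 by 4 by 4 processor grid comes from the log above. It is an illustration of the idea, not how LAMMPS implements its decomposition.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdlib>

// Sub-domain index (per axis) containing point x, assuming a uniform brick
// decomposition of an orthogonal periodic box. Hypothetical helper.
std::array<int, 3> owning_cell(const std::array<double, 3> &x,
                               const std::array<double, 3> &boxlo,
                               const std::array<double, 3> &boxhi,
                               const std::array<int, 3> &procgrid)
{
    std::array<int, 3> cell{};
    for (int d = 0; d < 3; ++d) {
        const double len = boxhi[d] - boxlo[d];
        assert(len > 0.0 && procgrid[d] > 0);
        double frac = (x[d] - boxlo[d]) / len;
        frac -= std::floor(frac);                       // wrap into [0, 1)
        int idx = static_cast<int>(frac * procgrid[d]);
        cell[d] = std::min(idx, procgrid[d] - 1);       // guard against rounding
    }
    return cell;
}

// Largest per-axis distance between two sub-domains, counted in sub-domains
// and allowing for the periodic wrap. Hypothetical helper.
int max_hops(const std::array<int, 3> &a, const std::array<int, 3> &b,
             const std::array<int, 3> &procgrid)
{
    int worst = 0;
    for (int d = 0; d < 3; ++d) {
        int diff = std::abs(a[d] - b[d]);
        diff = std::min(diff, procgrid[d] - diff);
        worst = std::max(worst, diff);
    }
    return worst;
}
```

For example, with a 2 by 4 by 4 grid on a box from 0 to 50 in each direction, an atom whose old position is (1, 1, 1) and whose scattered position is (30, 30, 30) ends up two sub-domains away in y and z from the proc that still owns it. That is the "far away" situation described above. With fewer procs the sub-domains are larger, so the same displacement crosses fewer of them, which may be why the problem only showed up at higher core counts.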

Steve

Stephen_Cox · May 19, 2017, 3:27pm

Hi Steve,

Thanks, this seems to work with the caveat that I have to delete all the atoms prior to creating them:

lmp->input->one(const_cast<char *>("delete_atoms group all"));
lammps_create_atoms(lmp, N, NULL, &atom_type_vector[0],
                    pos_vector, vel_vector, NULL, 1);

Thanks again,
Steve

sjplimp · May 19, 2017, 3:49pm

yes, that’s correct (the assumption with lammps_create_atoms() is you might be giving LAMMPS a completely new set of atoms)

delete_atoms group all is a very fast operation - it just zeroes the count of atoms on each proc

Steve