Hi!
I’m developing a custom compute command that calculates atom displacements using a neighbor list read from an external file. The file lists the global IDs of central atoms and of their neighbors. The displacement measures how far each central atom is from the center of its neighboring atoms.
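To make that explicit: for a central atom i with N_i listed neighbors j, the compute evaluates

$$ \mathbf{d}_i = \mathbf{x}_i - \frac{1}{N_i} \sum_{j=1}^{N_i} \mathbf{x}_j $$

where each difference x_i - x_j is taken with the minimum-image convention, as in the code below.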
In my custom MD simulations, I run multiple loops until a certain condition is met. Here is the key part of my input script:
variable j loop ${nloop}
label loop_mark
... # some variable definitions
thermo_style custom step temp etotal pe ke vol lx ly lz
run 20
... # some variable definitions
if "some condition" then "jump SELF break"
next j
jump SELF loop_mark
label break
I’ve noticed that each loop iteration takes longer and longer to complete as the simulation progresses.
In my implementation, I call atom->map_init() in ComputeCustomDisp::init(), and when looping over the central atoms in ComputeCustomDisp::compute_peratom() I call atom->map(global_id) to get the local index on the fly:
void ComputeCustomDisp::init()
{
  // read the neighbor list file, initialize atom->map()
  read_file();
  atom->map_init();
}
void ComputeCustomDisp::compute_peratom()
{
  // grow the per-atom output array if the number of atoms has grown
  if (atom->nmax > nmax) {
    memory->destroy(array_atom);
    nmax = atom->nmax;
    memory->create(array_atom, nmax, size_peratom_cols, "custom_disp:array_atom");
  }

  // reset array_atom to 0 for owned and ghost atoms
  for (int i = 0; i < atom->nlocal + atom->nghost; i++) {
    array_atom[i][0] = 0.0;
    array_atom[i][1] = 0.0;
    array_atom[i][2] = 0.0;
  }

  double **x = atom->x;

  // loop over all central atoms
  for (size_t i = 0; i < central_id.size(); i++) {
    tagint central_global_id = central_id[i];
    int central_local_id = atom->map(central_global_id);
    // skip if this processor does not own the central atom
    if (central_local_id < 0 || central_local_id >= atom->nlocal) continue;

    // accumulate minimum-image vectors from each neighbor to the central atom
    double dx = 0.0, dy = 0.0, dz = 0.0;
    double tmpx, tmpy, tmpz;
    int neighbor_count = 0;
    for (size_t j = 0; j < neighbor_id[i].size(); j++) {
      tagint neighbor_global_id = neighbor_id[i][j];
      int neighbor_local_id = atom->map(neighbor_global_id);
      if (neighbor_local_id < 0) {
        error->warning(FLERR, "Cannot find the local index of the neighbor atom");
        continue;
      }
      tmpx = x[central_local_id][0] - x[neighbor_local_id][0];
      tmpy = x[central_local_id][1] - x[neighbor_local_id][1];
      tmpz = x[central_local_id][2] - x[neighbor_local_id][2];
      domain->minimum_image(tmpx, tmpy, tmpz);
      dx += tmpx;
      dy += tmpy;
      dz += tmpz;
      neighbor_count++;
    }
    // avoid division by zero if no neighbor was found on this processor
    if (neighbor_count == 0) continue;

    array_atom[central_local_id][0] = dx / neighbor_count;
    array_atom[central_local_id][1] = dy / neighbor_count;
    array_atom[central_local_id][2] = dz / neighbor_count;
  }
}
To try to fix this, I now resolve the local indices once in ComputeCustomDisp::init(), store them in a vector, and use them directly in ComputeCustomDisp::compute_peratom(), which seemed to solve the problem.
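For reference, here is a minimal sketch of that change (central_local and neighbor_local are placeholder names I use here, assuming std::vector<int> and std::vector<std::vector<int>> members respectively):

void ComputeCustomDisp::init()
{
  read_file();
  atom->map_init();

  // resolve every global ID to a local index once, up front
  central_local.clear();
  neighbor_local.clear();
  for (size_t i = 0; i < central_id.size(); i++) {
    central_local.push_back(atom->map(central_id[i]));
    std::vector<int> locals;
    for (size_t j = 0; j < neighbor_id[i].size(); j++)
      locals.push_back(atom->map(neighbor_id[i][j]));
    neighbor_local.push_back(locals);
  }
}

compute_peratom() then reads central_local[i] and neighbor_local[i][j] directly instead of calling atom->map() every step.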
As I understand LAMMPS’s domain decomposition scheme, each processor owns the atoms in a certain region of the box. Are atoms near a domain boundary reassigned to a different processor as they move across it, or do they stay attached to a fixed processor throughout the simulation?
Could you offer any insight into why the original implementation slows down the simulation over time? Also, if the set of atoms held by a processor changes dynamically, my new solution will fail; is there a more effective way to handle this?
Best regards,
Denan