Do periodic boundaries influence the extraction of per-atom variables in a pair potential?

Hi all,

I wrote a custom granular pair potential a few months back that works well for a wall/particle bounded system. However, when I test it on a granular specimen with periodic boundaries on all sides, I get a segmentation fault at the first timestep because one of my per-atom vector variables does not return the correct value. On closer inspection, I noticed that the failure happens for granules at the boundary.

For example, the pairwise interaction of granule 9153 (as obtained from tag[i]) and granule 9599 (as obtained from tag[j]) fails because these two granules sit at a periodic boundary. In the input file, I have periodic boundaries at z=0 and z=0.8 m. Here are the x,y,z coordinates of granules 9153 and 9599 in the input data file:

9153: 4.0941922423730681e-02, 1.9419058391353688e-02, 3.2053353230316953e-07
9599: 3.8314403595514357e-02, 1.5487448856834229e-02, 8.0007039234515032e-01

When I extract these coordinates during the pair potential evaluation, they are roughly the same but adjusted for periodicity (note the z coordinate of granule 9599):

9153: 4.0941925114364e-02, 1.9419072048767e-02, 3.11622e-7
9599: 3.8314403595514e-02, 1.5487448856834e-02, -8.0690400520e-5

Now, specifically, my pair potential has a dynamic radius quantity called the truncated radius (trunc_rad); this radius expands depending on the overlap. When I extract trunc_rad[i], I get the correct value, but I get 0 for trunc_rad[j]. This breaks the subsequent calculations, producing NaNs and eventually the segfault.

Here is the relevant section of code where it fails; this is where I calculate the quantities needed to update the truncated radius:


  double *trunc_rad = atom->trunc_rad;

  for (ii = 0; ii < inum; ii++) {
    i = ilist[ii];
    xtmp = x[i][0];
    ytmp = x[i][1];
    ztmp = x[i][2];
    itype = type[i];
    jlist = firstneigh[i];

    cur_trunc_radi = trunc_rad[i];
    radi = radius[i];
    cur_radi = std::max(radi,cur_trunc_radi);
    jnum = numneigh[i];

    int count = 0;

    for (jj = 0; jj < jnum; jj++) {
      j = jlist[jj];
      j &= NEIGHMASK;

      delx = xtmp - x[j][0];
      dely = ytmp - x[j][1];
      delz = ztmp - x[j][2];
      xtmpj = x[j][0];
      ytmpj = x[j][1];
      ztmpj = x[j][2];

      rsq = delx*delx + dely*dely + delz*delz;
      cur_trunc_radj = trunc_rad[j];
      radj = radius[j];
      cur_radj = std::max(radj,cur_trunc_radj);  // CODE FAILS HERE

      radsum = cur_radi + cur_radj;

      if (rsq < radsum*radsum) {
        count += 1;

        b = cur_radj + cur_radi;
        b = sqrt(rsq);

        s = (b + cur_radj + cur_radi)*0.5;
        A = sqrt(s*(s-cur_radi)*(s-cur_radj)*(s-b));
        h = 2.0*A/b;
        temp_h = (cur_radi*cur_radi - h*h)/(cur_radj*cur_radj - h*h);

        if (b < radsum) {
          if (isnan(h) == 1) {
            std::cout<<std::fixed<<std::setprecision(15)<<" "<<tag[i]<<" "<<tag[j]<<" xtmpi "<<xtmp<<" ytmpi "<<ytmp<<" ztmpi "<<ztmp<<" xtmpj "<<xtmpj<<" ytmpj "<<ytmpj<<" ztmpj "<<ztmpj<<" h in radloop is complex "<<h<<" b "<<b<<" s "<<s<<" A "<<A<<" s-cur_radi "<<s-cur_radi<<" s-cur_radj "<<s-cur_radj<<" s-b "<<s-b<<" curradi "<<cur_radi<<" curradj "<<cur_radj<<" radsum "<<radsum<<" rsq "<<rsq<<" radsum*radsum "<<radsum*radsum<<std::endl;
          }

          temp_r2 = b/(1.0 + sqrt(temp_h));
          temp_r1 = b - temp_r2;

          temp_s1[i] += temp_r1;
          temp_s2[i] += temp_r1*temp_r1;
          temp_n[i]  += 1.0;
          temp_s1[j] += temp_r2;
          temp_s2[j] += temp_r2*temp_r2;
          temp_n[j]  += 1.0;
        }
      }
    }
  }

The truncated radii of these granules in my input script are:

9153: 0.004867644251833
9599: 0.004569340782181

However, during the loop in the above code, I get:

9153: 0.004867644251833
9599: 0

Not sure why this happens…

Any help is much appreciated! I can provide more code, but I don't know whether it would help.

Best,

My guess is that it will happen when "j" is a ghost atom, i.e. its index satisfies j >= atom->nlocal.

Do you have a forward communication that updates the custom per-atom properties to ghost atoms?

Thank you for your prompt reply, Axel. No, I don't have that. I wrote this on top of pair/gran/hertz/history, and the only forward communication I have is:

  if (fix_rigid && neighbor->ago == 0) {
    int tmp;
    int *body = (int *) fix_rigid->extract("body",tmp);
    double *mass_body = (double *) fix_rigid->extract("masstotal",tmp);
    if (atom->nmax > nmax) {
      memory->destroy(mass_rigid);
      nmax = atom->nmax;
      memory->create(mass_rigid,nmax,"pair:mass_rigid");
    }
    int nlocal = atom->nlocal;
    for (i = 0; i < nlocal; i++)
      if (body[i] >= 0) mass_rigid[i] = mass_body[body[i]];
      else mass_rigid[i] = 0.0;
    comm->forward_comm_pair(this);
  }

I am not sure how to add this capability, as I thought it would work the same as hertz… Any idea on how I can achieve this? I also checked atom.cpp and atom_vec_sphere.cpp, where I defined the per-atom vectors… Do these files have any bearing on this?

Please study the documentation and other source code. There is a whole bunch of helpful information in the “Information for Developers” section, e.g. 4.5. Communication patterns — LAMMPS documentation

I know too little about your modifications and your model to give any more specific advice.

Thank you, Axel. Actually, my trunc_rad is just like rho in pair_eam… I'm not too familiar with EAM, since I don't know much about molecular dynamics, but I used it as inspiration for this DEM potential; you mentioned it in a thread we had many months ago. After I compute trunc_rad, I have my force calculation loop, just like pair_eam.

Can I naively use the methods found in pair_eam.cpp:

  • pack_forward_comm()
  • unpack_forward_comm()
  • pack_reverse_comm()
  • unpack_reverse_comm()

and include:
if (newton_pair) comm->reverse_comm_pair(this);

and

comm->forward_comm_pair(this);

between the calculation of trunc_rad and the force calculation?
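
For concreteness, what I have in mind is essentially the pair_eam pattern applied to trunc_rad. Here is only a rough sketch of what I would add (PairMyGran is a placeholder for my class name, and I would set comm_forward = 1 in the constructor):

  int PairMyGran::pack_forward_comm(int n, int *list, double *buf,
                                    int /*pbc_flag*/, int * /*pbc*/)
  {
    double *trunc_rad = atom->trunc_rad;
    int m = 0;
    // copy the per-atom value of every atom in the send list into the buffer
    for (int i = 0; i < n; i++) buf[m++] = trunc_rad[list[i]];
    return m;
  }

  void PairMyGran::unpack_forward_comm(int n, int first, double *buf)
  {
    double *trunc_rad = atom->trunc_rad;
    int m = 0;
    // received values are stored into the ghost atoms starting at index "first"
    for (int i = first; i < first + n; i++) trunc_rad[i] = buf[m++];
  }

and then, in compute(), call comm->forward_comm_pair(this); once trunc_rad has been updated for the local atoms and before the force loop reads trunc_rad[j].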

Another thing I noticed is that I allocate my work arrays, such as temp_trunc_rad, like this:

  memory->create(temp_trunc_rad,nlocal,"pair:temp_trunc_rad")

I am guessing this, too, won't be compatible with periodic boundary conditions, and I would have to swap nlocal for nmax now that there are ghost atoms?

Thank you!

The information you seek is in the manual and in the comments in the source code. As I already noted, I know too little about your project (and I have already explained recently how much it irritates me to be fed only tiny bits and pieces of information) to say anything else. Regardless, you are now making modifications at a level where you either should be capable of figuring this out by yourself or you need to collaborate with somebody who can. We have put as much effort as we can into pointing people in the right direction with all the text that has (often rather recently) been added to the manual, but I cannot give specific advice for code written from guesses and cut-and-paste reuse without properly studying it and reflecting on its meaning.

See 4.7. Writing new styles — LAMMPS documentation for an example of ghost comms in a fix. I’ve never done it for a pair style but I imagine it’s not too different.


@akohlmey

I have been debugging, trying to make my potential compatible with periodic boundaries, and I have a few questions. If I run a test with the simulation domain (simulation box + granules) shown below, using a serial compilation of LAMMPS:

[image: simulation box containing three granules]

I get nlocal = 3, inum = 3, nmax = 16384, and nghost = 42. I checked my neighbor list and it seems to be correct, in that it identifies the neighbors as:

 i  1 j 3 
 i  1 j 2 
 i  2 j 3 
 i  3 j 1 

Why are nmax and nghost so high? I read the documentation and don't understand… Won't there be only 2 ghost atoms (the images of 1 and 3) if I'm using a serial version?

Also, will nlocal and inum remain unchanged during a serial run for any type of granular packing?

I have also created arrays to store variables; these work when the granules don't leave the periodic domain but fail when they do. For example, I initialize the arrays like this:

  memory->create(temp_trunc_rad,nlocal,"pair:temp_trunc_rad"); 
  memory->create(temp_s1,nlocal,"pair:temp_s1");
  memory->create(temp_s2,nlocal,"pair:temp_s2");
  memory->create(temp_n,nlocal,"pair:temp_n");

Then I perform some calculations:

 
  
  for (ii = 0; ii < inum; ii++) {
    i = ilist[ii];
    xtmp = x[i][0];
    ytmp = x[i][1];
    ztmp = x[i][2];
    itype = type[i];
    jlist = firstneigh[i];

    cur_trunc_radi = trunc_rad[i];
    radi = radius[i];

    radi = original_rad[i];
    radi = radius[i];
    cur_radi = std::max(radi,cur_trunc_radi);
    jnum = numneigh[i];
    /* std::cout<<"i loop"<<std::endl;
       std::cout<< i <<' '<< jnum<<' '<<std::endl; */
    int count = 0;

    for (jj = 0; jj < jnum; jj++) {
      j = jlist[jj];
      j &= NEIGHMASK;

      delx = xtmp - x[j][0];
      dely = ytmp - x[j][1];
      delz = ztmp - x[j][2];
      xtmpj = x[j][0];
      ytmpj = x[j][1];
      ztmpj = x[j][2];

      rsq = delx*delx + dely*dely + delz*delz;
      cur_trunc_radj = trunc_rad[j];
      radj = radius[j];
      radj = original_rad[j];
      radj = radius[j];  // swapped
      cur_radj = std::max(radj,cur_trunc_radj);

      radsum = cur_radi + cur_radj;

      if (rsq < radsum*radsum) {
        count += 1;

        b = cur_radj + cur_radi;
        b = sqrt(rsq);

        s = (b + cur_radj + cur_radi)*0.5;
        A = sqrt(s*(s-cur_radi)*(s-cur_radj)*(s-b));
        h = 2.0*A/b;
        temp_h = (cur_radi*cur_radi - h*h)/(cur_radj*cur_radj - h*h);

        if (b < radsum) {
          if (isnan(h) == 1) {
            std::cout<<std::fixed<<std::setprecision(15)<<" "<<tag[i]<<" "<<tag[j]<<" xtmpi "<<xtmp<<" ytmpi "<<ytmp<<" ztmpi "<<ztmp<<" xtmpj "<<xtmpj<<" ytmpj "<<ytmpj<<" ztmpj "<<ztmpj<<" h in radloop is complex "<<h<<" b "<<b<<" s "<<s<<" A "<<A<<" s-cur_radi "<<s-cur_radi<<" s-cur_radj "<<s-cur_radj<<" s-b "<<s-b<<" curradi "<<cur_radi<<" curradj "<<cur_radj<<" radsum "<<radsum<<" rsq "<<rsq<<" radsum*radsum "<<radsum*radsum<<std::endl;
          }

          temp_r2 = b/(1.0 + sqrt(temp_h));
          temp_r1 = b - temp_r2;

          temp_s1[i] += temp_r1;
          temp_s2[i] += temp_r1*temp_r1;
          temp_n[i]  += 1.0;
          temp_s1[j] += temp_r2;
          temp_s2[j] += temp_r2*temp_r2;
          temp_n[j]  += 1.0;
        }

        std::cout<<std::fixed<<std::setprecision(15)<<' '<<tag[i]<<' '<<tag[j]<<' '<<xtmp<<' '<<ytmp<<' '<<ztmp<<' '<<xtmpj<<' '<<ytmpj<<' '<<ztmpj<<' '<<cur_radi<<' '<<cur_radj<<' '<<b<<' '<<s<<' '<<h<<' '<<A<<' '<<temp_h<<' '<<temp_r1<<' '<<temp_r2<<' '<<temp_n[i]<<' '<<temp_s1[i]<<' '<<temp_s2[i]<<' '<<temp_n[j]<<' '<<temp_s1[j]<<' '<<temp_s2[j]<<' '<<std::endl;
      }
    }
  }

And then I compute the pertinent quantities:

  for (ii = 0; ii < inum; ii++) {
    i = ilist[ii];
    itype = type[i];

    b_calc = (0.25)*(temp_s1[i] - temp_n[i]*radius[i]) - radius[i];
    b_calc = (0.25)*(temp_s1[i] - temp_n[i]*original_rad[i]) - original_rad[i];
    c_calc = (0.25)*(radius[i]*temp_s1[i] - temp_s2[i]);
    c_calc = (0.25)*(original_rad[i]*temp_s1[i] - temp_s2[i]);
    temp_trunc_rad[i] = (-b_calc + sqrt(b_calc*b_calc - 4.0*c_calc))*0.5;

    if (temp_trunc_rad[i] > trunc_rad[i]) {
      trunc_rad[i] = temp_trunc_rad[i];
    }
    if (itype == 2 || itype == 3) {
      trunc_rad[i] = original_rad[i];
    }

    radius[i] = trunc_rad[i];
    if (isnan(radius[i]) == 1) {
      std::cout<<"Radius is complex"<<std::endl;
    }

    std::cout<<tag[i]<<" finaloop "<<' '<<trunc_rad[i]<<' '<<b_calc<<' '<<c_calc<<std::endl;
  }

But when I try to destroy temp_trunc_rad, temp_s1, temp_s2, and temp_n using memory->destroy, like this:


  memory->destroy(temp_trunc_rad); 
  memory->destroy(temp_s1);
  memory->destroy(temp_s2);
  memory->destroy(temp_n);

I get a memory error… This, too, happens only when I use PBC and the granules go out of the domain. If the granules don't leave the domain, I get the results I want and am able to delete the memory without any issue. I don't understand why I am unable to destroy the created arrays, because they are unchanged and I never resize them. Would you happen to know why this might happen?

Thanks!

“nmax” is the largest allocation of per-atom data across all processors. To avoid excessive time loss from growing arrays in small increments, they are grown in increments of DELTA, which happens to be defined as 16384 (see file src/atom_vec.cpp).

The value of “nghost” is determined by the value of the communication cutoff and the number of atoms within that range. Periodic copies are added, if needed. LAMMPS is not subject to minimum image conventions.

You are allocating those arrays to be of size "nlocal", but inside your i,j loops you have to consider that "j" atoms may be ghost atoms, and thus you are accessing those arrays out of range and corrupting your memory. Without periodic boundaries, and when running on only 1 MPI process, there are no ghost atoms, so the corruption cannot happen.
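
For reference, pair_eam handles this at the top of compute() by keeping its per-atom work arrays at length atom->nmax and re-growing them whenever that value increases (a sketch, showing only rho):

  // per-atom work arrays must be of length atom->nmax so that
  // both local and ghost atoms can be indexed safely
  if (atom->nmax > nmax) {
    memory->destroy(rho);
    nmax = atom->nmax;
    memory->create(rho,nmax,"pair:rho");
  }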

nlocal and inum can change when atoms migrate between subdomains while running in parallel. They can also change if you allow LAMMPS to "lose atoms", which is not uncommon in DEM.

I believe you need to go back and study the EAM implementation some more.
It does a few things differently than the code excerpts you are showing, and those differences matter.

In general, you don't need to understand the "physics" of a model to understand the "mechanics" of its implementation. Most of my ability to maintain a software package as large, complex, and varied as LAMMPS is based on that fact. But for that it is crucial to study code carefully and not just copy or mimic it while hoping for the best.

Thank you, Axel. Yes, I studied pair_eam and thought my troubles were over when my implementation worked for the non-periodic BC case: total energy was conserved, and the results made sense against theoretical scaling… but the happiness was short-lived once I embarked on the periodic BC case. I have been running all my simulations with a serial build and without the ability to lose atoms, which is why I used "nlocal" at the time, even though I knew pair_eam uses nmax for rho… With your explanation I now see why it is not the same once there are ghost atoms.

A follow-up question, if you don't mind: will this problem be solved (and will I be able to destroy the memory) if I allocate those arrays to be of size "nmax"?

My concern is whether the accumulation will still hold. For example, will tag[i]=3 and tag[j]=3 still accumulate into the same temp_s1 index? I guess my confusion is: if my granule intersects the periodic boundary, is it seen as a ghost atom in the j loop and as a local atom in the i loop?

temp_s1[i] += temp_r1;
temp_s1[j] += temp_r2;

My understanding of pair_eam says that the accumulation should hold.

Thanks!

Your understanding is not correct.

@akohlmey

I now understand why the j < nlocal condition is used in the j loop during the computation of rho in pair_eam, and it makes sense to include it in my code so that it won't double-count the contribution of the ghost atoms or access memory it shouldn't. That way, I can also get away with having temp_s1, temp_s2, and temp_n be of size "nlocal", since I will never index beyond that range.
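
In other words, what I plan to do in my j loop is something like this (a sketch of my understanding, with newton off so only the locality check matters):

      // accumulate to j only if it is a local atom; ghost atoms are skipped,
      // so the nlocal-sized arrays are never indexed out of range
      if (j < atom->nlocal) {
        temp_s1[j] += temp_r2;
        temp_s2[j] += temp_r2*temp_r2;
        temp_n[j]  += 1.0;
      }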

I have a question regarding forward comm and its use in a serial run (I believe I don't need reverse communication since I use newton off).

Assuming I have calculated temp_s1, temp_s2, and temp_n, I then proceed to update my radius using these quantities with the following code:

  for (ii = 0; ii < inum; ii++) {
    i = ilist[ii];
    itype = type[i];

    b_calc = (0.25)*(temp_s1[i] - temp_n[i]*radius[i]) - radius[i];
    b_calc = (0.25)*(temp_s1[i] - temp_n[i]*original_rad[i]) - original_rad[i];
    c_calc = (0.25)*(radius[i]*temp_s1[i] - temp_s2[i]);
    c_calc = (0.25)*(original_rad[i]*temp_s1[i] - temp_s2[i]);
    temp_trunc_rad[i] = (-b_calc + sqrt(b_calc*b_calc - 4.0*c_calc))*0.5;

    if (temp_trunc_rad[i] > trunc_rad[i]) {
      trunc_rad[i] = temp_trunc_rad[i];
    }
    if (itype == 2 || itype == 3) {
      trunc_rad[i] = original_rad[i];
    }

    radius[i] = trunc_rad[i];
    if (isnan(radius[i]) == 1) {
      std::cout<<"Radius is complex"<<std::endl;
    }

    std::cout<<tag[i]<<" finaloop "<<' '<<trunc_rad[i]<<' '<<b_calc<<' '<<c_calc<<std::endl;
  }

This bit of code only updates the radius of the local granular atoms, so I would need a forward_comm right after it, prior to the force calculation, to update the radius of the ghost atoms?

Otherwise, during the force calculation loops, a granule straddling a boundary would have its updated radius in the i loop, while its corresponding ghost would still carry the un-updated radius, which would subsequently affect the force calculation. Is this correct?
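
So the ordering in compute() I have in mind would be roughly this (just the sequencing, not actual code):

  // 1) accumulate temp_s1/temp_s2/temp_n over the neighbor lists (local atoms only)
  // 2) update trunc_rad[i] and radius[i] for the local atoms (loop above)
  comm->forward_comm_pair(this);  // 3) push the updated values to the ghost atoms
  // 4) force calculation loop, which can now read radius[j]/trunc_rad[j] of ghosts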

Sorry for asking a lot of questions on here… I don’t have anyone else in the institution who has experience with this.

That is bad, but, as I have stated before, I don't have the time to fill in for this, and ultimately it is your problem if you start a project without the proper support that you require. The fact that you are lacking support in no way entitles you to a larger share of my time. I have already spent more time on the very specific details of your implementation (and still without a good idea of what is going on, because you only provide fragmented information) than I did on other projects where I ended up with a co-author credit. I am done with this.

You should find a proper collaborator that does have the time and interest in your research and that you are willing to share all required information with.

Understandable and thanks for all the help thus far!

It sounds like you are developing code without a debugger. A question like the one above can be straightforwardly answered if you run your code in a debugger, set an appropriate break point, and watch how your variables of interest change line by line. (Ghost comms run pretty much the same in serial as in parallel, so you could learn all this without even touching MPI debugging.)

I strongly advise you to drop everything and take as much time as it takes to master a debugger – gdb is widely available, free, and good enough. It would be much, much better if you could find a code development supervisor or collaborator, but if you can’t, you must at least be able to inspect what your code is actually doing (not what you think it’s doing or should do).
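
For example, a minimal session could look something like this (the binary name, source file, and line number are placeholders for your own build):

  gdb --args ./lmp_serial -in in.test
  (gdb) break pair_my_gran.cpp:155
  (gdb) run
  (gdb) print j
  (gdb) print trunc_rad[j]
  (gdb) next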
