Generate per-atom scalar/vector properties during PairStyle::compute and use them as LAMMPS variables

Dear LAMMPS developers,

I have the following situation: during a simulation with our custom pair_style pace, we generate per-atom scalar and/or vector properties as a by-product of the main energy/force calculation.

I would like to use these values as if they were computed with a compute, i.e., be able to show them in the log or save them with a dump.

What is the best way to do this in the LAMMPS codebase?

Best regards,
Yury

@yury-lysogorskiy

If those values are “global” you can make them available through the compute pair command (see the LAMMPS documentation). For that you need to allocate the Pair class member pvector in the constructor of your pair style and set the member nextra to the length of this vector. In the Pair::compute() function you then accumulate the per-MPI-rank sum; the compute pair command will sum over the MPI ranks, and you can access the numbers as a global vector. Several pair styles follow this scheme, for example airebo, reaxff, hbond/dreiding/lj, and hbond/dreiding/morse.
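A minimal sketch of that scheme, assuming a hypothetical PairPACE class (the two accumulated quantities are placeholders, not part of the real pair style):

```c++
// In the constructor: reserve storage for the global extra values.
// "compute pair" reads nextra and pvector from the Pair base class.
PairPACE::PairPACE(LAMMPS *lmp) : Pair(lmp)
{
  nextra = 2;                     // number of extra global values
  pvector = new double[nextra];   // remember to delete[] it in the destructor
}

// In compute(): reset and accumulate the per-rank partial sums.
void PairPACE::compute(int eflag, int vflag)
{
  pvector[0] = pvector[1] = 0.0;

  // ... main energy/force loop, accumulating e.g.:
  //   pvector[0] += contribution_a;
  //   pvector[1] += contribution_b;
}
```

With that in place, `compute 1 all pair pace` sums the per-rank values over MPI, and `thermo_style custom step c_1[1] c_1[2]` prints the global totals.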

If those are per-atom items, you allocate and manage the per-atom storage within the pace pair style and clear/fill it with data inside the compute() function. You could then create a compute pace/atom command that makes those values available as a per-atom vector or array. There are two ways for the compute to get hold of the data managed by the pair style:

  1. You make the compute a friend of the pair style, look up the pointer to the pair style via force->pair_match(), cast it to a PairPACE pointer, and then you can access those arrays from within the compute any way you like and export them through the compute’s compute_peratom() function.
  2. You add entries for each per-atom array to the PairPACE::extract() function, look up the pair style inside the compute, extract the individual pointers, cast/assign them to the necessary pointers within the compute, and make them available (see the sketch after this list). This does not require giving direct access to internal data, but it is a bit more laborious, since you have to add all the calls to Pair::extract() and the corresponding exports.
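A minimal sketch of the second approach, assuming a hypothetical per-atom member extrapolation_grade_gamma exported under the keyword "gamma" (both names are placeholders, not the actual pair style API):

```c++
// In the pair style: export a pointer to per-atom data by name.
void *PairPACE::extract(const char *str, int &dim)
{
  if (strcmp(str, "gamma") == 0) {
    dim = 1;                                   // per-atom vector
    return (void *) extrapolation_grade_gamma;
  }
  return nullptr;
}

// In the compute's init(): look up the pair style and fetch the pointer.
void ComputePACEAtom::init()
{
  Pair *pair = force->pair_match("pace", 1);
  if (!pair) error->all(FLERR, "Compute pace/atom requires pair style pace");

  int dim;
  gamma = (double *) pair->extract("gamma", dim);
  if (!gamma) error->all(FLERR, "Pair style pace does not export gamma");

  // The first approach would instead cast the looked-up pointer directly
  // (requires friend access if the member is not public):
  //   gamma = static_cast<PairPACE *>(pair)->extrapolation_grade_gamma;
}
```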

How you handle things inside the compute pace/atom command is up to you: you can have keywords that select individual properties and export only one property per compute instance, or combine them all into one big per-atom array. The former usually produces more readable inputs but requires multiple compute instances with different keywords; the latter is more compact but also a bit more opaque. Both approaches have been used for similar functionality in LAMMPS, so there is no specific preference.
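A sketch of the keyword-based variant; the keywords and the which/GAMMA/ALL/NPROPS bookkeeping are made up for illustration:

```c++
// Constructor: parse the selection keyword and declare what is exported.
ComputePACEAtom::ComputePACEAtom(LAMMPS *lmp, int narg, char **arg) :
  Compute(lmp, narg, arg)
{
  if (narg != 4) error->all(FLERR, "Illegal compute pace/atom command");

  peratom_flag = 1;
  if (strcmp(arg[3], "gamma") == 0) {
    which = GAMMA;
    size_peratom_cols = 0;          // a single per-atom vector
  } else if (strcmp(arg[3], "all") == 0) {
    which = ALL;
    size_peratom_cols = NPROPS;     // a per-atom array with NPROPS columns
  } else error->all(FLERR, "Unknown compute pace/atom keyword");
}
```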

Thank you for the response. I think I get the idea. However, I’d like to understand how a compute operates under MPI.

Here I collect the per-atom scalar quantities into an array extrapolation_grade_gamma for atom i, where i = list->ilist[ii]. In ComputePaceAtom::compute_peratom I just redirect the vector_atom pointer to it. That seems to work in the serial version (of course it is not the final version), but what should be done for MPI parallelization with domain decomposition? Will compute_peratom be called on each proc? Does extrapolation_grade_gamma correspond to the atoms on the given proc only? How do I aggregate it over all procs?

Per-atom compute data is distributed, and compute_peratom() is called on every MPI rank. Each rank holds the data for its “owned” atoms only, that is, nlocal items.
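A minimal compute_peratom() sketch under those assumptions (gamma is the hypothetical pointer obtained from the pair style above; gamma_local and nmax are members of the compute):

```c++
// Called on every MPI rank; only fills values for the rank's owned atoms.
void ComputePACEAtom::compute_peratom()
{
  invoked_peratom = update->ntimestep;

  // grow the compute's own per-atom buffer if atom storage has grown
  if (atom->nmax > nmax) {
    nmax = atom->nmax;
    memory->destroy(gamma_local);
    memory->create(gamma_local, nmax, "pace/atom:gamma");
    vector_atom = gamma_local;
  }

  // copy values for the nlocal owned atoms; no MPI aggregation is needed
  const int nlocal = atom->nlocal;
  for (int i = 0; i < nlocal; i++) gamma_local[i] = gamma[i];
}
```

Copying into the compute’s own buffer, rather than pointing vector_atom at the pair style’s array, avoids surprises if the pair style reallocates its storage between invocations.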

You don’t. If you dump it, the dump class takes care of the communication. A basic design choice in LAMMPS is to avoid collecting data for all atoms into a single global array. For large parallel calculations with many atoms this avoids index overflows: LAMMPS limits the number of atoms per MPI rank to what can be indexed with a signed 32-bit integer, but the global number of atoms may be much larger, since it is counted with signed 64-bit integers.