How to map Atom Number or Atom ID

Ninh_Hoang · March 22, 2024, 1:34am

Hi every,
Now I want to get neighbor list. By PyLammps, I can do this and extract Atom ID of Atom Number in neighbor list. However, I need to use the loop to extract each atom, it takes a long time. I want to know there is any way to map Atom Number to Atom ID?
Thank you very much.
The code to extract atom id in neighbor list is below.

from lammps import lammps
import numpy as np

lmp = lammps()
lmp.commands_string(“”"
region box block -2 2 -2 2 -2 2
lattice fcc 1.0
create_box 1 box
create_atoms 1 box
mass 1 1.0
pair_style lj/cut 2.5
pair_coeff 1 1 1.0 1.0
run 0 post no"“”)

look up the neighbor list

nlidx = lmp.find_pair_neighlist(‘lj/cut’)
nl = lmp.numpy.get_neighlist(nlidx)
tags = lmp.extract_atom(‘id’)
print(“half neighbor list with {} entries”.format(nl.size))

print neighbor list contents

for i in range(0,nl.size):
idx, nlist = nl.get(i)
print(“\natom {} with ID {} has {} neighbors:”.format(idx,tags[idx],nlist.size))
if nlist.size > 0:
for n in np.nditer(nlist):
print(" atom {} with ID {}".format(n,tags[n]))

akohlmey · March 22, 2024, 6:36am

What exactly do you mean by that and what is the relation to the code below.

P.S.: the code is not quoted correctly and thus not easily readable. You need to use triple backquotes (```) instead of the greater sign (>).

Ninh_Hoang · March 22, 2024, 9:14am

Hi Akohlmey,
In PyLammps, I can get neighbor list but the result is series of numbers which are Atom Number. By, command extract_atom(‘id’), I can get Atom ID according to Atom Number. But by that way, I need to use loops to extract each atom number in neighbor list. So it takes a long time. I want to know how to convert Atom Number to Atom ID directly? For instance, Lammps’s algorithm is used to assign Atom Number and Atom ID.
Simple example: Atom Number= [0,1,2,3] and corresponding Atom ID =[4,5,1,4].
The code I used to extract Atom ID like example code from LAMMPS 2.3.10. Neighbor list access — LAMMPS documentation

akohlmey · March 22, 2024, 9:32am

You are still not making any sense here. The array you get from extract_atom(‘id’) is that map you are talking about. The code you quoted (and still haven’t fixed its formatting!) is using that information directly.

But all per-atom data is indexed by the index used in the neighbor list, and this is necessary to use because some of the ghost atoms may have the same atoms may have the same atom ID as one of the local atoms. I cannot think of any way how you would need this information presented differently.

So what is it exactly that you want to achieve?
Can you give an example and show a piece of code that uses the data in the “slow” way that you are concerned about?
I suspect that there are some problems on your side not understanding the fundamentals of the data model and parallelization of LAMMPS with the domain decomposition. Have you studied the recent paper describing LAMMPS? or at least read through this section of the LAMMPS manual: 4.4. Parallel algorithms — LAMMPS documentation?

Ninh_Hoang · March 22, 2024, 10:55am

For example: I have an array consisting 10 elements whose values are Atom Number (D=[0,1,2,3…,9]. Now I want to know corresponding Atom ID of each Atom Number? I used command ‘extract_atom(‘id’)’ to extract Atom ID of each Atom Number in the array. So if the number of element in the array increase, the time also increases.
Below is a piece of my code to get neighbor list:

from lammps import lammps
import numpy as np 
lmp = lammps()

##############################
# Get neighbor list at timestep 0
lmp.command('units real')
lmp.command('dimension 3')
lmp.command('boundary p p p')
lmp.command('atom_style atomic')
lmp.command('read_data input_data_9.txt')                                                   ###########################
lmp.command('velocity all create 100 12345 ')         
lmp.command('pair_style lj/cut 10.')
lmp.command('pair_coeff * * 0.239 3.4 10.')          
lmp.command('neighbor 0.0 bin')
lmp.command('neigh_modify every 1 check yes ')
lmp.command('timestep 1.0 ')
lmp.command('fix 1 all nvt temp 100.0 100.0 100.0 ')
lmp.command('run 0')

######################################
# NEIGHBOR LIST
nlidx = lmp.find_pair_neighlist('lj/cut')      
nl = lmp.numpy.get_neighlist(nlidx)             
id = lmp.extract_atom('id')     

pos = lmp.extract_atom('x') 
force = lmp.extract_atom('f') 

pos_lst = np.zeros((nl.size,3))           
force_lst = np.zeros((nl.size,3))         Z 

src_node_lst = []
dst_node_lst = []

for i in range(nl.size):
    idx, nlist = nl[i]

    atom_id = id[idx]
    [pos[idx][0], pos[idx][1], pos[idx][2]]
    # insert position
    pos_lst[atom_id-1] = [pos[idx][0], pos[idx][1], pos[idx][2]]

    # insert force
    force_lst[atom_id-1] = [force[idx][0], force[idx][1], force[idx][2]]

    # calculate neighbor list
    if nlist.size >0:
        src_atom = np.full((nlist.size,), atom_id).tolist()
        dst_atom = []
        for atom in np.nditer(nlist):
            dst_atom.append(id[atom])    
        src_node_lst += src_atom
        dst_node_lst += dst_atom

np.savez('./data/data_9_0.npz', pos = pos_lst, force = force_lst)                                           ###########################
np.savez('./neighbor/neigh_9_0.npz', src_node = src_node_lst, dst_node = dst_node_lst)                      ###########################

akohlmey · March 22, 2024, 1:41pm

Now why is this a surprise? This is how things work in computing: when you have a bigger problem, you have more work and thus it takes longer.
The way how you extract properties is the way you do it. The only thing that might help would be to extract “id”, “pos”, and “force” as numpy arrays instead of regular python lists via ctypes.

Please also keep in mind that the major time consumer is the part where you copy the neighbor list contents. For larger systems, this becomes large and since Python is about 100 times slower than C++ code of the same kind, things are bound to be slow. Writing out and compressing the data also is going to slow down things a lot.

Ninh_Hoang · March 22, 2024, 2:01pm

Yes. I know. Thank you for your comments.
I just want to ask that there is any other ways to get Atom ID from Atom Number instead of using command “extract_atom(‘id’)”?
Thank you very much.

akohlmey · March 22, 2024, 3:06pm

No.

Let me emphasize, it is not the access to the Atom->tag array that slows you down.
It is the process of collecting, reordering, and converting a massive amount of data with python that is slow.

Ninh_Hoang · March 25, 2024, 12:39pm

Yes, Thank you for your answers.

Vasilii_Maksimov · May 24, 2024, 8:49pm

Dear Axel,

Would you please explain how to get a neighbor list of an atom with a particular id without having to loop over the lammps.extract_atom(“id”) list? From what I understand there is an Atom::map() object that maps atom id to atom number, but I was unable to find a way to obtain it in python.

Sincerely,
Vasilii Maksimov

akohlmey · May 24, 2024, 9:44pm

Off the top of my memory, there isn’t.

Thus, at the moment you would have to construct the reverse mapping yourself in python.
You can get the output for lammps.extract_atom("id") and then build a “reverse” list, e.g. as a dictionary.
Be careful with ghost atoms, they may already exist as local atoms and thus should not be added in that case and thus the have the ghost atom index override the local atom index.

Of course, the proper solution would be to add a direct interface to that functionality to the c-library interface and from that also to the python interface.
To identify multiple atoms with the same atom ID (one local, multiple ghosts when the system is small and the cutoff large) one would need to interface the “sametag” list as well.

I can look into implementing it.

akohlmey · May 25, 2024, 3:57am

This pull request: Add support for extracting a few more properties and the Atom::map() function to the library interface by akohlmey · Pull Request #4174 · lammps/lammps · GitHub
has an implementation of a wrapper for the Atom::map() function added to the C-library interface and that was exported to the python module.

This can be used as follows with the peptide example input:

from lammps import lammps
lmp = lammps(cmdargs=['-log', 'none'])

sometags = [1, 10, 25, 100]
lmp.file('in.peptide')

tags = lmp.extract_atom('id')
for mytag in sometags:
    myidx = lmp.map_atom(mytag)
    print("tag = %d  idx = %d  check = %d" % (mytag, myidx, tags[myidx]))

lmp.close()

Vasilii_Maksimov · May 25, 2024, 6:37pm

Dear Axel,

Thank you so much for such a prompt response and implementation of the feature! I can’t believe I have not thought of just building an inverse map myself. I have successfully ran your example with in.peptide file, however, for some reason when I try this on my system, the map does not seem to get build. I assume that is the case as I am getting a “-1” returned as a result of any lmp.map_atom(tag) call. I just can’t figure out what triggers the creation of a map (if that is even the issue). Here are all the files that I am trying this with.

Also, here is the python code just to have it in the question:

from lammps import lammps

lmp = lammps(cmdargs = ["-log", "none", "-screen", "none"])
lmp.command("units real")
lmp.command("atom_style charge")
lmp.command("read_data glass_SiO.structure")
lmp.command("include pot_SiO.FF")
lmp.command("neighbor 2.0 bin")
lmp.command("run 0")
tags = lmp.extract_atom("id")

sometags = [1, 10, 25, 100]
for mytag in sometags:
    myidx = lmp.map_atom(mytag)
    print("tag = %d  idx = %d  check = %d" % (mytag, myidx, tags[myidx]))

lmp.close()

Output:

tag = 1  idx = -1  check = 0
tag = 10  idx = -1  check = 0
tag = 25  idx = -1  check = 0
tag = 100  idx = -1  check = 0

EDIT:
I have also noticed that looping over the lammps.extract_atom(“id”) object throws the following segmentation fault error at the end:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x5e068f766000

The same error (I understand it’s very generic) is present when trying to convert this “tags” object to a list via the list() method. A slightly different error appears when trying to convert it to a numpy array:

  File "working_venv/lib/python3.10/site-packages/numpy/core/_internal.py", line 654, in _dtype_from_pep3118
    dtype, align = __dtype_from_pep3118(stream, is_subdtype=False)
  File "working_venv/lib/python3.10/site-packages/numpy/core/_internal.py", line 727, in __dtype_from_pep3118
    raise NotImplementedError(
NotImplementedError: Unrepresentable PEP 3118 data type '&' (pointers)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "peptide/test.py", line 8, in <module>
    tags = np.array(lmp.extract_atom('id'))
ValueError: '&<i' is not a valid PEP 3118 buffer format string
(

I’m guessing it has something to do with an incorrect pointer assignment at the end of iteration, but this is only a guess.

Also, I am unsure whether it is appropriate to keep adding questions to this thread, so if you believe it is better for me to start the new one, please let me know.

Thank you again,
Vasilii

akohlmey · May 25, 2024, 7:11pm

By default, atom maps are only built when some component of the system requires it (e.g. you have bonds). Apparently, that is not the case for your simulation. But that can be easily remedied by inserting

lmp.command("atom_modify map yes")

before the “read_data” command (you cannot change it afterwards).

With that the test python script has the expected output:

[akohlmey@crunch LAMMPS_python_map_issue]$ python main.py
tag = 1  idx = 385  check = 1
tag = 10  idx = 3703  check = 10
tag = 25  idx = 818  check = 25
tag = 100  idx = 3415  check = 100

Vasilii_Maksimov · May 25, 2024, 7:21pm

Thank you! It finally works.

Best,
Vasilii