molindex and identifying atoms on the same molecule

Dear LAMMPS users and developers,

I am trying to add a new compute to LAMMPS, and I need to be able to identify which atoms are part of the same molecule.

Under atom_style “full”, the molecule number of each atom is defined in the data file, so I am thinking of comparing the molecule numbers of the atoms. I would like to ask:

Is the molecule number assigned to each atom held in atom->molindex?

The array is empty when called from a typical compute like
compute_pe_atom.cpp’s ComputePEAtom::compute_peratom()
or
compute X/tally

In each case I get

(gdb) print(atom->molindex)
$1 = (int *) 0x0

My guess is that either

(1) An array other than molindex holds the molecule-membership information.
(2) LAMMPS doesn’t hold molecule-membership information beyond the initial read_data (in which case, any suggestion about a different way to do this would be appreciated).
(3) molindex is correct, but I need to do something to make this information available within the computes above.

I would really appreciate any guidance on this issue.

Thank you,
Best,

Jeff

> Is the molecule number assigned to each atom held in atom->molindex?

no.

> The array is empty when called from a typical compute like
>   compute_pe_atom.cpp's ComputePEAtom::compute_peratom()
> or
>   compute X/tally

if you had looked closer at compute_pe_mol_tally.cpp, you would have seen
that what you are looking for is stored in atom->molecule and that this
information is available when atom->molecule_flag is non-zero.

axel.

p.s.: please note, that molecule IDs in LAMMPS are purely indicating
molecules by convention and thus may not be tied to individual molecules at
all.
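
To make the answer above concrete, here is a minimal, self-contained sketch of the comparison Jeff describes. It does not use the LAMMPS headers: `MockAtom` is a hypothetical stand-in that mirrors only the two members named in the reply (`molecule_flag` and `molecule`), and `same_molecule` is an illustrative helper, not a LAMMPS function. Treating ID 0 as "no molecule" is an assumption of this sketch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two Atom members mentioned in the reply.
// The real LAMMPS Atom class has many more fields and uses raw arrays.
struct MockAtom {
  int molecule_flag;                   // non-zero if per-atom molecule IDs exist
  std::vector<std::int64_t> molecule;  // molecule ID of each local atom
};

// Returns true if local atoms i and j carry the same non-zero molecule ID.
// Assumption of this sketch: ID 0 means "not part of any molecule".
bool same_molecule(const MockAtom &atom, int i, int j) {
  if (!atom.molecule_flag) return false;  // no molecule IDs stored at all
  const std::int64_t mi = atom.molecule[i];
  return mi != 0 && mi == atom.molecule[j];
}
```

Inside a real compute, the same test would read the corresponding members through the `atom` pointer, after first checking that `atom->molecule_flag` is set. Note that equal IDs only mean "same molecule" if the data file assigned them that way.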

Dear Axel,

Thank you so much for your quick reply. That’s really helpful.

I don’t fully understand what you mean by “molecule IDs in LAMMPS are purely indicating molecules by convention and thus may not be tied to individual molecules at all”. Would you mind elaborating further? Based on my reading of compute_pe_mol_tally, I assume that all the atoms in the same molecule will always have the same molecule ID as each other.

Thanks again.

Best,

Jeff

They will if the data file that defined the mol IDs assigned
them that way. But there is no requirement for a user to create
a data file with that meaning for the listed mol IDs.

Steve

> I don't fully understand what you mean by "molecule IDs in LAMMPS are
> purely indicating molecules by convention and thus may not be tied to
> individual molecules at all". Would you mind elaborating further? Based on
> my reading of compute_pe_mol_tally, I assume that all the atoms in the same
> molecule will always have the same molecule ID as each other.

as mentioned, that is only by convention. the molecule id has no physical
meaning during the simulation and thus can be used to group atoms any which
way you like. the conventional use is to group atoms by molecule, but
other groupings (e.g. by residue) are possible, too. some LAMMPS commands
also follow the convention that isolated atoms all have the molecule
id 0. the grouping by molecule id pre-dates the "chunk" infrastructure in
LAMMPS, which extends and simplifies the grouping of atoms.
for simple classical force fields, the determination of molecules (as atoms
connected by explicit bonds) can also be done by LAMMPS itself through the
fragment/atom compute.

axel.
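
The idea of determining molecules as "atoms connected by explicit bonds" can be illustrated with a small union-find sketch. This is not the implementation of the fragment/atom compute; it is a stand-alone example that assumes atoms are numbered 0..n-1 and that bonds are given as index pairs. The names `DisjointSet` and `fragment_ids` are made up for this illustration.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Union-find over atom indices 0..n-1.
struct DisjointSet {
  std::vector<int> parent;
  explicit DisjointSet(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);  // every atom starts alone
  }
  int find(int x) {
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];  // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(int a, int b) {
    a = find(a);
    b = find(b);
    if (a == b) return;
    if (a < b) parent[b] = a; else parent[a] = b;  // smaller index stays root
  }
};

// Labels every atom with the smallest atom index in its bonded fragment.
std::vector<int> fragment_ids(int natoms,
                              const std::vector<std::pair<int, int>> &bonds) {
  DisjointSet ds(natoms);
  for (const auto &b : bonds) ds.unite(b.first, b.second);
  std::vector<int> id(natoms);
  for (int i = 0; i < natoms; ++i) id[i] = ds.find(i);
  return id;
}
```

Comparing such bond-derived labels with the molecule IDs from a data file is one way to check whether the file follows the "one ID per bonded molecule" convention.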

Dear Axel and Steve,

Thank you both very much for elaborating further. That’s very good to know.

Best,

Jeff

Dear Axel and Steve,

As a follow-up to my earlier question, I am trying to learn how LAMMPS works by following the code execution with gdb, but I am running into three problems:

  1. backtrace cannot give me all the functions before pair.cpp - instead of file:line-number I get a memory address and “??”:
    (This example is using RESPA)

Breakpoint 1, LAMMPS_NS::ComputePEMolTally::pair_tally_callback (this=0x100a0b5a0, i=0, j=1, nlocal=1105, newton=1, evdwl=0,
ecoul=-2.9017068011391349) at ../compute_pe_mol_tally.cpp:96
96 if ( ((mask[i] & groupbit) && (mask[j] & groupbit2))
(gdb) bt
#0 LAMMPS_NS::ComputePEMolTally::pair_tally_callback (this=0x100a0b5a0, i=0, j=1, nlocal=1105, newton=1, evdwl=0,
ecoul=-2.9017068011391349) at ../compute_pe_mol_tally.cpp:96
#1 0x00000001003b6709 in LAMMPS_NS::Pair::ev_tally (this=0x100a098b0, i=0, j=1, nlocal=1105, newton_pair=1, evdwl=0,
ecoul=-2.9017068011391349, fpair=-0.14064396945146065, delx=-1.3739999999999988, dely=-0.66300000000000026,
delz=-0.16599999999999948) at ../pair.cpp:908
#2 0x3ff217eb8ee294fc in ?? ()
#3 0x00000001000e6460 in ?? () at ../compute_pe_mol_tally.cpp:86
#4 0x00000001055b99d6 in ?? ()
#5 0x0000000100a098b0 in ?? ()
#6 0x0000000100a0b5a0 in ?? ()
#7 0x00000000ade2fd05 in ?? ()
#8 0x3ff10f6f9c6b86ea in ?? ()
#9 0xbfd2bf929b1e6594 in ?? ()
#10 0x3fe6f789b8a10c27 in ?? ()
#11 0xbfc53f7ced916860 in ?? ()
#12 0xbfe5374bc6a7efa0 in ?? ()
#13 0xbff5fbe76c8b4390 in ?? ()
#14 0xbfc2009f20963a0d in ?? ()
#15 0xc00736b20e2bc971 in ?? ()
#16 0x0000000000000000 in ?? ()

  2. the first line that gdb hits is not a main function, but a line in library.cpp:
    (gdb) start -in in.lammps -e both
    Temporary breakpoint 1 at 0x100324110: main. (2 locations)
    Starting program: /Users/jeff/lammps/src/lmp_serial -in in.lammps -e both
    Temporary breakpoint 1, 0x0000000100324130 in lammps_create_atoms (ptr=0x1, n=0, id=0x7fff5fbff580, type=0xa900adb097f62c82,
    x=0x0, v=0x0, image=0x0, shrinkexceed=0) at ../library.cpp:1041
    1041 if (image) atom->image[nlocal] = image[i];

  3. certain lines and files (like read_data.cpp) cannot have breakpoints placed:

(gdb) break read_data.cpp:126
Cannot access memory at address 0x100000160

I understand I need to compile with gdb flags and no optimization, so I modified Makefile.serial to be:
CC = g++
CCFLAGS = -ggdb -O0
SHFLAGS = -fPIC
DEPFLAGS = -M

LINK = g++
LINKFLAGS = -ggdb -O0
LIB =
SIZE = size

ARCHIVE = ar
ARFLAGS = -rc
SHLIBFLAGS = #-shared (I tried with and without shared)

(I do have FFTW on this mac, but I added “FFT_INC = -DFFT_NONE” to use KISS)

I also made sure to compile STUBS/Makefile with gdb flags

CC = g++
CCFLAGS = -ggdb -O0 -fPIC -I.
ARCHIVE = ar
ARCHFLAG = rs

Do I need to make further changes to the makefiles?
Any guidance would be appreciated.
(I ask here because I figure this is a problem with my understanding of how to compile lammps, rather than a problem with how to use gdb, but please accept my apologies if this guess is wrong.)

Thank you.

Best,

Jeff

> Do I need to make further changes to the makefiles?

the symptoms you describe all suggest that your compiled objects are
inconsistent with the source code.
do a "make clean-all", and then "make serial".

beyond that, there is little else to say. i don't use a mac and thus cannot
give any advice on development tools on a mac. it is known that those tend
to be "quirky" compared to linux.

> Any guidance would be appreciated.
> (I ask here because I figure this is a problem with my understanding of
> how to compile lammps, rather than a problem with how to use gdb, but
> please accept my apologies if this guess is wrong.)

it is difficult to say what is wrong. watch carefully whether there are any
warnings when you load lammps into gdb.

axel.

Dear Axel,

Thank you so much for your reply.

I tested it on my linux box, and it works fine. Thank you for the advice!

There are no warnings about inconsistent code when I start gdb. I also tried clean-all and deleting the files built in STUBS, but this does not change the situation on my mac. The only difference in compilation between the mac and the linux box is a warning on the mac that does not occur on linux:

rm -f *.o libmpi_stubs.a
g++ -ggdb -O0 -fPIC -I. -c mpi.c
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated

and several warnings during compilation of the form:

warning: format specifies type 'long' but the argument has type 'long long' [-Wformat]

So as you say, it seems to be a quirk on the mac.

I will do my development on the linux box from now on.

Thank you again.

Best,

Jeff

> rm -f *.o libmpi_stubs.a
> g++ -ggdb -O0 -fPIC -I. -c mpi.c
> clang: warning: treating 'c' input as 'c++' when in C++ mode, this
> behavior is deprecated

the warning is correct and should give you pause.
the file in STUBS is C code, not C++, and thus should be compiled with gcc,
not g++.

> and several warnings during compilation of the form:
>
> warning: format specifies type 'long' but the argument has type 'long
> long' [-Wformat]

those are also valid and need to be addressed eventually, but they are not
in any critical parts of the code and thus low priority.
if you'd like to help out with development, you can file an issue on github
where you report all of those "long vs. long long" format warnings.
LAMMPS development has slowed down quite a bit over the last few
months, since all developers are kept busy by other projects (that pay
our salaries).
you can easily see that from the fact that responses more often appear
outside business hours.

> So as you say, it seems to be a quirk on the mac.

well, the warning from above shows that your "g++" is actually "clang++".
i am not sure whether the debugger you use is actually the real gdb and
fully compatible with clang.

axel.
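
As background to the "long vs. long long" warnings discussed above: a likely reason they show up only on the Mac is that a fixed-width 64-bit integer type such as `int64_t` is `long long` on macOS but `long` on 64-bit Linux, so a hard-coded `%ld` matches on one platform and not on the other. The sketch below shows one portable way to print such a value using the standard `PRId64` macro. The function name `format_count` is made up for this example, and the thread does not say how LAMMPS itself resolves these warnings.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit counter without a -Wformat mismatch. PRId64 expands to
// the correct conversion for int64_t on the current platform.
std::string format_count(std::int64_t n) {
  char buf[32];  // 20 characters for the number, plus " atoms" and the terminator
  std::snprintf(buf, sizeof(buf), "%" PRId64 " atoms", n);
  return std::string(buf);
}
```

The same pattern applies to `fprintf` and `sprintf` calls: the format string is assembled from the macro instead of assuming a particular width for `long`.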