PyLammps hanging in parallel execution

Dear all,

I am trying to run the attached Python script, which imports lammps as a module and makes use of the OpenKIM interface.
When I run the script in serial mode using the command:

python test.py Ag.data EAM_Dynamo_AcklandTichyVitek_1987_Ag__MO_212700056563_005

the script runs fine. When I run the script in parallel using mpirun:

mpirun -np 2 python test.py Ag.data EAM_Dynamo_AcklandTichyVitek_1987_Ag__MO_212700056563_005

the script hangs due to the command at line 45:

indx = L.atoms[iat].type - 1

which is part of a loop over all particles of the system. Following the instructions in the manual, this command is executed only if MPI.COMM_WORLD.rank == 0.
I have tested the script with the latest version available on GitHub.
How can I overcome this issue?

Thanks in advance for your help
Evangelos Voyiatzis

test.py (1.92 KB)

Ag.data (2.26 KB)


I don't think there is a simple way to resolve this within PyLammps.
Evaluating L.atoms[iat].type requires an MPI_Alltoall() internally in the LAMMPS library to make distributed per-atom data globally available, but PyLammps also relies on internal functionality that only provides data on MPI rank 0 (since it makes LAMMPS generate output and then parses that output).

For complex and advanced operations like the one you are attempting, I would fall back to the lammps module. It provides gather/scatter operations that you can use. The fact that PyLammps depends on triggering output for much of its functionality (sometimes even by sending additional commands to the LAMMPS instance) makes it very problematic for parallel runs, since output is only generated on rank 0 and may suffer from block buffering, depending on the MPI library implementation.

axel.

Thanks a lot for your time and your response, Axel!

I will then look into the lammps module in more detail.

Evangelos

On Thu, 23 Jan 2020 at 21:57, Axel Kohlmeyer <[email protected]> wrote: