Implement fix pimd of LAMMPS by python

Hi,

I am confused about how to run the fix pimd functionality of LAMMPS from Python in parallel. I have a LAMMPS input script named in.lammps which uses fix pimd to perform path-integral molecular dynamics. The command to run it directly with the lmp executable is

mpirun -np 4 $LMP -partition 4x1 -in in.lammps -log log_file/log -screen screen_file/screen

where $LMP is the path to the lmp executable, and the output files are stored in the log_file and screen_file folders. I have confirmed that this workflow is valid and produces the desired results.

Next, I would like to use the LAMMPS Python module to do the same thing. The required mpi4py module is installed, LAMMPS has been built in shared mode, and the LAMMPS Python module has been installed successfully. When the input script runs a classical MD simulation instead of PIMD, I have verified that everything works, i.e., mpi4py runs LAMMPS from Python in parallel with the command

mpirun -np 4 python test.py

But now I am confused about how to incorporate the -partition 4x1 part of the command for a PIMD simulation, which assigns each replica (or bead) to one processor. I have tested that if I do not include this partition setting explicitly but still run the code with mpirun -np 4 python test.py, it gives me only 1 output file, whereas in principle it should produce 4 (1 for each replica).

Briefly speaking, I wonder how to use LAMMPS python module to perform PIMD simulation in a parallel way. I will appreciate it if someone can help me.

FYI, the python script is

from mpi4py import MPI
from lammps import lammps

lmp = lammps()
lmp.file("in.lammps")
lmp.close()

I just posted an example for that in a different discussion: MSEVB using Python API

What the -partition flag does in LAMMPS is that it takes the global communicator (MPI_COMM_WORLD) and splits it into 4. Then each of the MPI ranks creates a LAMMPS instance with the subcommunicator passed as argument (by default LAMMPS uses the global one).

For demonstration purposes, here is an example that runs two LAMMPS instances on two processors each, i.e. the equivalent of -partition 2x2, if you do mpirun -np 4:

import sys
from mpi4py import MPI
from lammps import lammps

npartition = 2

me_global = MPI.COMM_WORLD.Get_rank()
np_global = MPI.COMM_WORLD.Get_size()

# integer division, so that the color passed to Split() below is an int
np_per_comm = np_global // npartition
if np_global != npartition*np_per_comm:
    sys.exit("inconsistent number of partitions")

if me_global == 0:
    print("Running on %d partitions of %d processors" % (npartition, np_per_comm))

color = me_global // np_per_comm
comm = MPI.COMM_WORLD.Split(color=color)
lmp = lammps(cmdargs=['-nocite', '-log', 'log.lammps-'+str(me_global), '-screen', 'none'], comm=comm)
lmp.file('in.lammps')
lmp.close()

Of course, this is only a small part of what would need to be done. You also need to create communicators to exchange information between the different replicas. This is where it gets complicated and goes beyond the scope of what can be explained here.
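As a rough illustration of the communicator bookkeeping only (not of the PIMD algorithm itself), the sketch below computes the two "colors" one rank would need: one to group the ranks of the same replica, and one to link ranks with the same local position across replicas. The helper name replica_colors is made up for this example, and the Split() calls in the comments are just one possible layout, not what fix pimd does internally.

```python
def replica_colors(rank, nranks, npartition):
    """Return (intra_color, inter_color) for one MPI rank.

    intra_color: index of the replica (partition) this rank belongs to.
    inter_color: position of this rank inside its replica.
    """
    if nranks % npartition != 0:
        raise ValueError("number of ranks must be a multiple of npartition")
    per_comm = nranks // npartition
    intra_color = rank // per_comm
    inter_color = rank % per_comm
    return intra_color, inter_color


# With mpi4py the colors could be used like this (not executed here):
#   world = MPI.COMM_WORLD
#   intra, inter = replica_colors(world.Get_rank(), world.Get_size(), 2)
#   comm_replica = world.Split(color=intra, key=world.Get_rank())  # for lammps(comm=...)
#   comm_between = world.Split(color=inter, key=world.Get_rank())  # across replicas
```

With 4 ranks and 2 partitions, ranks 0 and 1 form replica 0, ranks 2 and 3 form replica 1, and the second communicator pairs rank 0 with rank 2 and rank 1 with rank 3.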

Hi Akohlmey,

Thanks for your reply! Nice to meet you again.

In your suggestion, you said I have to “create communicators to exchange information between different replicas”, i.e., between different processors. A PIMD simulation does indeed require exchanging information among processors, such as atomic positions, forces and so on. But I had assumed that “how to perform a PIMD simulation” is already written in the C++ source code of LAMMPS, which specifies how MPI is used and what information is communicated in order to perform a PIMD run successfully. According to what you said, however, it seems that I have to write this myself at the Python level, e.g., specify how positions, images and forces are communicated, and that the underlying LAMMPS source code plays no role in driving the communication. Is my understanding correct?

How to perform a PIMD calculation has been programmed into fix pimd.

If you don’t use the functionality of fix pimd, but want to replace it with python code, you have to do everything that fix pimd does in your python code.

LAMMPS internally does not know anything about this. When you look at the example code I provided, it just runs the LAMMPS calculations side-by-side and they know nothing of each other. All the steps that would turn this into a PIMD calculation are in fix pimd.

There are some convenience functions that can be used to exchange data between processes, but when programming this in Python, you only have access to a subset of the functionality that exists internally in C++, since the LAMMPS Python interface is based on the C library interface, which exports only a subset of the internal functionality.

That said, if you want to look at PIMD functionality written in Python with an interface to LAMMPS, I suggest you look at the i-PI project led by Michele Ceriotti at EPFL in Lausanne.

It has an interface to LAMMPS through the fix ipi command (see the LAMMPS documentation).

Thanks for the detailed explanation. It is my fault that I did not state my purpose clearly above. In fact, I neither want to do anything as ambitious as coding a “fix pimd” in Python myself, nor am I looking for a Python version of PIMD. Instead, what I would like to do is simply use LAMMPS’s fix pimd to perform a safe and correct PIMD simulation, driven from Python.

To be specific, as long as I write down fix pimd xxxxx in the input script in.lammps of LAMMPS, and run it with

mpirun -np 4 $LMP -partition 4x1 -in in.lammps -log log_file/log -screen screen_file/screen

everything works well, and I get good results thanks to LAMMPS’s fix pimd. This is what I wanted, and it is why I want to use LAMMPS fix pimd for my project. So I now have a correct input script in.lammps and a reliable LAMMPS fix pimd, which I do not intend to modify.

My question is merely about how to write a Python script that uses the LAMMPS Python module lammps to reproduce what I have done above. In other words, I just want to use Python to read and execute the input script in.lammps, and run LAMMPS in such a way that I get an equivalent PIMD simulation. But I do not know how to do this properly. The main obstacle is that I do not know how to reproduce the -partition 4x1 part.

Although … why? What do you need your python wrapper script to do that can’t be done either by a Bash script or in post-processing?

Hi Srtee,

Actually I am working on an inverse problem, which requires repeating the PIMD and post-processing steps iteratively. For example, the program can look like

while judgement:
    do PIMD (with updated things)
    post-process
    update something

Since LAMMPS is very efficient and widely used, I would like to use LAMMPS for the PIMD part. And because the judgement, post-processing and updating parts are rather complex and lengthy, I prefer to implement them in a more integrated programming language like Python, instead of just coding the while loop in a shell script.

In Bash: (script from Emulating a do-while loop in Bash - Stack Overflow)

# do LAMMPS run while (postprocess_and_decide.py returns 1)
while : ; do
  mpirun lmp -i in.pimd
  python3 postprocess_and_decide.py 
  # which ends either with exit(1) to run again
  # or exit(0) to halt
  # $? in next line returns exit code of last run process
  [[ $? == 0 ]] && break
done

This is not the only way or even the best way to do what you want to do. But it will be clear to many, many more people what you are trying to do, including (1) your future self / PhD students and (2) your sysadmins, who should be able to understand this script without having to learn the LAMMPS-Python interface.
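If the loop logic outgrows Bash, the same pattern can also be driven from a single Python process that launches LAMMPS as an external command, so the Python interpreter and its imports are started only once. This is only a sketch: run_until_done and the decide callback are names invented for this example, and the command in the trailing comment is the one from the original post with lmp assumed to be on the PATH.

```python
import subprocess


def run_until_done(command, decide, max_iter=100):
    """Run `command` repeatedly until `decide()` returns False.

    command : list of strings passed to subprocess.run
    decide  : callable doing the post-processing; returns True to run again
    Returns the number of runs that were performed.
    """
    for iteration in range(1, max_iter + 1):
        # check=True raises CalledProcessError if the run exits with an error
        subprocess.run(command, check=True)
        if not decide():
            return iteration
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical usage (not executed here):
#   pimd_cmd = ["mpirun", "-np", "4", "lmp", "-partition", "4x1", "-in", "in.lammps"]
#   run_until_done(pimd_cmd, decide=my_post_process_and_update)
```

The post-processing and update steps then live in ordinary Python functions, while the LAMMPS run itself stays exactly the command that is already known to work.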

Yep, this is definitely one way to achieve what I want. But I think I am only one step away from my goal of doing everything in Python. If it turns out that I cannot launch LAMMPS fix pimd via Python, I will look for other ways, including what you suggested. In fact, there are several reasons why I did not use a Bash script: (1) the structure of my while loop is more complicated than in the example above, which makes Python more comfortable for me; (2) if I split the whole program into the pattern shown in the Bash script, I have to launch the Python script many times, which means frequent I/O operations, repeated module imports and repeated initialization, and for various reasons these may cost about as much time as running the chunk of code I actually care about.

Have you tried reading the documentation of the LAMMPS python module?
…and looked more closely at my example?
What do you think the purpose of the “cmdargs” argument is?

This sounds like a case of a disease also known as “premature optimization™”. In other words, you are making your life more complicated before you even know that the problem you are trying to avoid is going to be a significant problem.
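To make the hint about “cmdargs” concrete, here is a minimal, untested sketch. It assumes that the list given to cmdargs is handled like the switches on the lmp command line, so that -partition 4x1 can be passed there while LAMMPS keeps its default global communicator. The helper names build_cmdargs and run_pimd are made up for this example; the log and screen paths are the ones from the original command.

```python
def build_cmdargs(nreplica, procs_per_replica=1,
                  log="log_file/log", screen="screen_file/screen"):
    """Build the command-line switches for a multi-partition LAMMPS run."""
    return [
        "-partition", f"{nreplica}x{procs_per_replica}",
        "-log", log,
        "-screen", screen,
    ]


def run_pimd(input_file="in.lammps", nreplica=4):
    """Run an input script that uses fix pimd, one replica per partition."""
    # Imported here so the helper above can be used without MPI or LAMMPS.
    from mpi4py import MPI  # noqa: F401  (initializes MPI)
    from lammps import lammps

    # No comm argument: LAMMPS uses the global communicator and splits it
    # itself according to -partition (assumption: same as the lmp executable).
    lmp = lammps(cmdargs=build_cmdargs(nreplica))
    lmp.file(input_file)
    lmp.close()


# Intended launch, matching the original command line:
#   mpirun -np 4 python test.py      (with run_pimd() called in test.py)
```

The number of MPI ranks given to mpirun has to match the partition layout, here 4 ranks for 4x1.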