Running multiple LAMMPS jobs with Python (multiprocessing)

Dear developers,
I wish to run multiple LAMMPS jobs (Windows 10, version Aug 2023) in parallel using Python.
I use the Python multiprocessing package. All LAMMPS jobs run successfully, though not in parallel but one after the other.
The Python code is:

import os
from lammps import lammps
import multiprocessing

# import random
# create multiple lammps jobs
# a=[]
# for i in range(30):
#     a.append(random.randint(10000,50000))
# print(a)
   
args0 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "100001", "-var", "seed2", "100002", "-var", "seed3", "100003"]
args1 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "200001", "-var", "seed2", "200002", "-var", "seed3", "200003"]
args2 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "300001", "-var", "seed2", "300002", "-var", "seed3", "300003"]    
args3 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "400001", "-var", "seed2", "400002", "-var", "seed3", "400003"]
args4 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "500001", "-var", "seed2", "500002", "-var", "seed3", "500003"]
args5 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "600001", "-var", "seed2", "600002", "-var", "seed3", "600003"]
args6 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "700001", "-var", "seed2", "700002", "-var", "seed3", "700003"]    
args7 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "800001", "-var", "seed2", "800002", "-var", "seed3", "800003"]
args8 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "900001", "-var", "seed2", "900002", "-var", "seed3", "900003"]    
args9 = ["-var", "myfile", "pubchem2256_atrazine", "-var", "seed1", "910001", "-var", "seed2", "910002", "-var", "seed3", "910003"]

lmp0 = lammps(cmdargs=args0)
lmp1 = lammps(cmdargs=args1)
lmp2 = lammps(cmdargs=args2)
lmp3 = lammps(cmdargs=args3)
lmp4 = lammps(cmdargs=args4)
lmp5 = lammps(cmdargs=args5)
lmp6 = lammps(cmdargs=args6)
lmp7 = lammps(cmdargs=args7)
lmp8 = lammps(cmdargs=args8)
lmp9 = lammps(cmdargs=args9)

# creating multiple processes
proc0 = multiprocessing.Process(target=lmp0.file("Organochloride_HO-H2O.lammps"))
proc1 = multiprocessing.Process(target=lmp1.file("Organochloride_HO-H2O.lammps"))
proc2 = multiprocessing.Process(target=lmp2.file("Organochloride_HO-H2O.lammps"))
proc3 = multiprocessing.Process(target=lmp3.file("Organochloride_HO-H2O.lammps"))
proc4 = multiprocessing.Process(target=lmp4.file("Organochloride_HO-H2O.lammps"))
proc5 = multiprocessing.Process(target=lmp5.file("Organochloride_HO-H2O.lammps"))
proc6 = multiprocessing.Process(target=lmp6.file("Organochloride_HO-H2O.lammps"))
proc7 = multiprocessing.Process(target=lmp7.file("Organochloride_HO-H2O.lammps"))
proc8 = multiprocessing.Process(target=lmp8.file("Organochloride_HO-H2O.lammps"))
proc9 = multiprocessing.Process(target=lmp9.file("Organochloride_HO-H2O.lammps"))

# Initiating process 1 to n
proc0.start()
proc1.start()
proc2.start()
proc3.start()
proc4.start()
proc5.start()
proc6.start()
proc7.start()
proc8.start()
proc9.start()

# Waiting until proc 0 to n finishes
# proc0.join()
# proc1.join()
# proc2.join()
# proc3.join()
# proc4.join()
# proc5.join()
# proc6.join()
# proc7.join()
# proc8.join()
# proc9.join()

# Processes finished
print("Both Processes Completed!")

This is a rather simple procedure, but I do not understand why the processes run one after the other instead of simultaneously. I have also activated proc0.join(), … but it makes no difference, because there are no more instructions in the Python code after calling LAMMPS.
Is there something wrong?
Thanks a lot for your help.
Kindest regards
Pascal

Why so complicated? Why not simply use LAMMPS in multi-partition mode?
Like with this demo input?

variable seed1 world 100001 200001 300001 400001 500001 600001 700001 800001 900001 910001
variable seed2 world 100002 200002 300002 400002 500002 600002 700002 800002 900002 910002
variable seed3 world 100003 200003 300003 400003 500003 600003 700003 800003 900003 910003

print "${seed1} ${seed2} ${seed3}"

When run with this command line:

mpiexec -n 10 lmp -in in.partition -p 10x1

It should run 10 single processor runs where each has the expected set of seed variables.

If you replace the “print” statement with “include pubchem2256_atrazine” it will run your simulations instead. If you want 5 partitions of 2 processors each, then you have to switch to a “universe” style variable and make a loop. For more details see the LAMMPS manual about multi-partition runs and the variable command.

As for your Python problem: it seems wrong to me that you create each LAMMPS instance before engaging the multiprocessing module. Rather, you would put each run, which would create and delete a LAMMPS instance, into a function and then distribute those functions.
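A minimal sketch of that suggestion, assuming the `lammps` Python module used in the original post. Note that in the original script `target=lmp0.file(...)` calls `file()` right away in the parent process and hands its return value to `Process`, which is why the runs execute one after another. The names `build_args`, `run_one` and `run_all` are hypothetical helpers, the `-log` argument is an added assumption to keep the runs from sharing one log file, and the sketch has not been tested on Windows.

```python
import multiprocessing

INPUT_FILE = "Organochloride_HO-H2O.lammps"  # input script from the original post
SEED_BASES = [100000, 200000, 300000, 400000, 500000,
              600000, 700000, 800000, 900000, 910000]


def build_args(seed_base, run_id):
    """Return the LAMMPS command-line arguments for one run."""
    return [
        "-var", "myfile", "pubchem2256_atrazine",
        "-var", "seed1", str(seed_base + 1),
        "-var", "seed2", str(seed_base + 2),
        "-var", "seed3", str(seed_base + 3),
        # Assumption: one log file per run so the runs do not overwrite each other.
        "-log", f"log.run{run_id}",
    ]


def run_one(args):
    """Create a LAMMPS instance inside the worker, run the input, then close it."""
    from lammps import lammps  # imported in the worker process, not in the parent

    lmp = lammps(cmdargs=args)
    try:
        lmp.file(INPUT_FILE)
    finally:
        lmp.close()


def run_all(seed_bases=SEED_BASES, workers=None):
    """Distribute the runs over a pool of worker processes."""
    jobs = [build_args(base, i) for i, base in enumerate(seed_bases)]
    with multiprocessing.Pool(processes=workers) as pool:
        # The function object itself is passed, not the result of calling it.
        pool.map(run_one, jobs)
```

On Windows the call has to sit under a main guard (`if __name__ == "__main__": run_all()`), because child processes re-import the script. Using `multiprocessing.Process(target=run_one, args=(job,))` would follow the same pattern.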

Great
Thanks a lot Axel. I was not aware of this possibility.
I will try it, after first reading the corresponding section in the manual. I agree, it is more appropriate for my needs.
Best
Pascal

Please also have a look at the LAMMPS documentation sections 8.1.3, “Run multiple simulations from one input script”, and 8.1.4, “Multi-replica simulations”.

Dear Axel,
I have adapted this to my code. It works very well.
This way I am running the same simulation of a single molecule with 10 different initial positions and orientations, using the create_atoms command.
I run the job as:
mpiexec -np 10 lmp -p 10x1 -in file.lammps -var myfile molecule1
The input script contains (as you suggested):

variable seed1 world  N1 N2 N3 N4 N5 .... N10
variable seed2 world  N1 N2 N3 N4 N5 .... N10
variable seed3 world  N1 N2 N3 N4 N5 .... N10
variable file string myfile  # myfile is set with -var myfile: the name of the chosen molecule

If I also wish to run the script for molecule1, molecule2 and molecule3, with 10 runs for each, should I write this part of the code as:

variable myfile world molecule1 molecule2 molecule3
variable seed1 world  N1 N2 N3 N4 N5 .... N10
variable seed2 world  N1 N2 N3 N4 N5 .... N10
variable seed3 world  N1 N2 N3 N4 N5 .... N10

and run with
mpiexec -np 30 lmp -p 30x1 -in file.lammps
?
Thanks a lot again

Now I wish to run 10 such simulations for, say, 3 different molecules.

No, it doesn’t work like this.

You must have as many entries for a “world” style variable as you have partitions.
Thus your “myfile” variable would have to have 10 times the molecule1 value, then 10 times the molecule2 value and 10 times the molecule3 value. Similarly the seed variables would have to have their list of arguments repeated 3 times. This way each “world” will have a different set of “seed1/2/3” values.
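To illustrate the expansion described above, here is a small sketch that writes out the repeated “world” variable lines. The helper names `world_variable` and `build_world_lines` are hypothetical, and the script only generates text to paste into a LAMMPS input; it does not call LAMMPS.

```python
def world_variable(name, values):
    """Format one 'variable NAME world v1 v2 ...' line."""
    return f"variable {name} world " + " ".join(str(v) for v in values)


def build_world_lines(molecules, seeds):
    """Build world-style variables with one entry per partition.

    molecules: list of molecule names
    seeds: dict mapping a seed variable name to its list of per-run values
    """
    runs = len(next(iter(seeds.values())))
    # Each molecule name is repeated once per run: m1 m1 ... m2 m2 ...
    lines = [world_variable("myfile", [m for m in molecules for _ in range(runs)])]
    for name, values in seeds.items():
        if len(values) != runs:
            raise ValueError(f"{name} has {len(values)} values, expected {runs}")
        # The whole seed list is repeated once per molecule.
        lines.append(world_variable(name, list(values) * len(molecules)))
    return lines


if __name__ == "__main__":
    seeds = {
        "seed1": [100001, 200001, 300001],
        "seed2": [100002, 200002, 300002],
    }
    for line in build_world_lines(["molecule1", "molecule2"], seeds):
        print(line)
```

With 3 molecules and 10 seeds per variable, every generated line carries 30 entries, matching a run with 30 partitions.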

The alternative would be to do a loop. My previous example would then be modified to:

variable myfile index molecule1 molecule2 molecule3

label repeat
variable seed1 delete
variable seed2 delete
variable seed3 delete

variable seed1 world 100001 200001 300001 400001 500001 600001 700001 800001 900001 910001
variable seed2 world 100002 200002 300002 400002 500002 600002 700002 800002 900002 910002
variable seed3 world 100003 200003 300003 400003 500003 600003 700003 800003 900003 910003

print "${myfile} ${seed1} ${seed2} ${seed3}"
next myfile
jump SELF repeat

This could be run with:

mpirun -np 30 ./lmp -in in.partition -p 10x3

This would run one molecule after the other on 10 partitions, and would use 3 processors per partition.
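As a quick sanity check of the partition arithmetic, the sketch below expands a `-p` specification into per-partition processor counts and verifies that they add up to the number of MPI processes. It assumes the documented `MxN` form (M partitions of N processors each) plus plain integers for single partitions; `partition_sizes` and `check_layout` are hypothetical helper names, not part of LAMMPS.

```python
def partition_sizes(specs):
    """Expand specs such as ['10x3'] or ['8x2', '4'] into a list of partition sizes."""
    sizes = []
    for spec in specs:
        if "x" in spec:
            count, procs = spec.split("x")
            sizes.extend([int(procs)] * int(count))
        else:
            sizes.append(int(spec))
    return sizes


def check_layout(nprocs, specs):
    """Return the partition sizes, or raise if they do not use exactly nprocs processes."""
    sizes = partition_sizes(specs)
    if sum(sizes) != nprocs:
        raise ValueError(
            f"partitions use {sum(sizes)} processes, but {nprocs} were requested"
        )
    return sizes


if __name__ == "__main__":
    # -np 30 with -p 10x3: 10 partitions of 3 processors each
    print(check_layout(30, ["10x3"]))
    # -np 10 with -p 10x1: 10 single-processor partitions
    print(check_layout(10, ["10x1"]))
```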

Dear Axel
Thank you very much for your explanations regarding the two alternatives.
I will test both.
I will give feedback when completed.
Kind regards
Pascal

Dear Axel,
When checking my posts I saw that I did not answer this one as promised.
Since receiving your advice I have been running such partition jobs, which lets me collect enough statistics.
Thank you again.
Best
Pascal