Dear LAMMPS developers,
I am using LAMMPS to evaluate atomic forces with the ReaxFF force field for 20 predetermined sets of atomic positions. The main input file (crca_20201212_114224_000_0000.in) contains a loop that repeatedly includes other files (crca_20201212_114224_000_0000_{1…20}.in) containing the atomic positions. On a single processor, this takes 8.609 s on my laptop running Debian stretch. However, when I instead use 2 partitions of 1 processor each, with a uloop variable so that the 20 atomic configurations are split between the 2 partitions, the runtime increases to 11.924 s rather than decreasing as I expected. Do you have any idea why this happens, and how I might improve the performance as I use more processors? I have attached the input files for the 1-process run in mwefast.tar.gz and for the 2-process run in mweslow.tar.gz. I timed each run by calling ‘time ./run.sh’ in the corresponding directory after decompressing the archive. I would be very grateful for your advice.
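For readers who do not want to unpack the archives, a loop of the kind described above can be sketched as follows. This is an illustrative sketch only, not the contents of the attached files: the variable name i, the label name loop, the run 0 command, and the lmp executable name in the comment are assumptions, and the force-field setup is omitted.

```
# Hypothetical sketch of the main input file (not the attached one).
# Assumed 2-partition launch; the executable name "lmp" is a guess:
#   mpirun -np 2 lmp -partition 2x1 -in crca_20201212_114224_000_0000.in

# A uloop variable hands each partition the next unused value from
# 1..20, so the configurations are shared between the partitions.
# For the 1-process run, "loop" would replace "uloop".
variable i uloop 20

label loop
# Read the atomic positions for configuration ${i}.
include crca_20201212_114224_000_0000_${i}.in
# Evaluate forces for this configuration without advancing time.
run 0
# Fetch the next unused value; once the values are exhausted,
# the following jump is skipped and the loop ends.
next i
jump SELF loop
```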
Kind regards,
Matthew Okenyi
mwefast.tar.gz (127 KB)
mweslow.tar.gz (128 KB)