Multi-partition mode and file variables

Dear All,

I am trying to run several short calculations using the multi-partition mode in LAMMPS; I attach my input file. My issue is that my input script reads the ID of the atom to remove at each run from a file called id.dat. Unfortunately, when I switch to multi-partition mode, the file variable I have defined is not updated across the different partitions (though it is updated when a partition moves on to its next run). I have tried both ways of reading the next string from the file (next and next()), but neither works. The manual says that all the variables should be next'd with one single command (e.g. next index number), but if I try that LAMMPS complains that the variables are of different styles. The issue seems to be that I would need a universe-style file variable, which does not exist!

Is there any way around this?

Thanks,

Dario

STO.in (1.37 KB)

split your file and let each partition read a separate file based on
the partition id. also, you need to have a different output file per
partition, or it will get corrupted sooner or later. writing to the same
file concurrently from multiple MPI tasks is screaming for a race
condition to happen. if your storage is on NFS or similar, it will
break.

axel.
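A minimal sketch of this scheme, assuming 4 partitions, a toy LJ system standing in for the real one, and hypothetical per-partition files id.1 ... id.4 (each holding a few of the toy system's atom IDs, one per line); all file names here are placeholders:

# run with e.g.:  mpirun -np 4 lmp -partition 4x1 -in split.in   (names are placeholders)

# toy system as a stand-in for the real setup
units           lj
lattice         fcc 0.8442
region          box block 0 4 0 4 0 4
create_box      1 box
create_atoms    1 box
mass            1 1.0
pair_style      lj/cut 2.5
pair_coeff      1 1 1.0 1.0

# one value per partition: partition 1 gets 1, partition 2 gets 2, ...
variable p world 1 2 3 4

label loop
variable number file id.${p}       # each partition reads only its own file
variable energy equal pe

group vac id ${number}
delete_atoms group vac
run 0
# per-partition output file, so no two partitions ever write to the same file
shell echo ${number} ${energy} >> energy.dat_${p}

group vac delete
next number                        # advances only this partition's file variable
jump SELF loop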

What you appear to want is to have a file with N lines in it (each with an atom ID), then to run N simulations on M partitions, and to have the Ith simulation, when it starts on some random partition, read the Ith line from the file.

The file-style variables don’t work that way.

First, they read the file incrementally, one
line at a time. They don’t skip to some arbitrary
line. Second, if multiple partitions
open the same file, they can all read it, but
they don’t know anything about each other.

You can’t share a file pointer across multiple partitions (processors).
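For reference, a minimal sketch of that behavior; the file name ids.txt and the variable name are just examples, assuming the file has at least two lines:

variable id file ids.txt      # opens ids.txt and immediately reads its 1st line;
                              # every partition that executes this keeps its own,
                              # independent file pointer
print "current value: ${id}"
next id                       # advances only this partition's pointer to line 2
print "next value: ${id}"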

Maybe someone has an alternate idea of how to do what you want.

Steve

Dear Axel, Steve and all,

Splitting the file into M files, where M is the number of partitions, works well! The only thing is that I need to define a world-style variable to use as a pointer to the M files. I attach the script I have written, in case somebody else needs to do something similar.

Thanks,

Dario

label loopb
variable index uloop 220
variable ID world 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
variable number file fort.${ID}

read_restart restart_dislocation.*

kspace_style pppm 1e-4

thermo 10 # Output thermodynamic quantities every 10 steps
thermo_style custom step temp vol press pe lx fmax atoms cpu

set type 1 charge 1.84
set type 3 charge -1.4

variable energy equal 'pe'
variable energy2 equal 'pe + 2599968.6'

group oxygen_vacancy id ${number}
variable oxygen delete

dump 1 oxygen_vacancy custom 1 test_${ID} x y z

run 0

undump 1

delete_atoms group oxygen_vacancy
set type 2 charge 2.3599600457
minimize 0.0 0.5 1000 100000

shell echo ${index} ${energy} >> energy.dat

shell tail -1 test_${ID} >> positions
shell echo ${energy2} >> energy

clear

next index
jump STO.in loopb
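For reference, a script like this is launched in multi-partition mode with LAMMPS's -partition command-line switch; the processor counts below are only an example (32 partitions of 2 MPI tasks each, matching the 32 values of the world variable), and the executable name depends on the build:

mpirun -np 64 lmp_mpi -partition 32x2 -in STO.in

Each partition then writes its own log.lammps.N and screen.N file, in addition to the universe-level log.lammps.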

Hi Axel and all,

It seems I was too optimistic last time; it turns out that splitting the file into M files does not work either. For those who have not read the previous emails, here is what I want to do: I have a file with N lines in it (each with an atom ID) and I want to run N simulations on M partitions, and have the Ith simulation, when it starts on some random partition, read the Ith line from the file.

The problem is that each partition runs at a slightly different speed. So if I want to run a total of 5504 calculations on 32 partitions, I am not guaranteed that each partition will run 5504/32 = 172 calculations; in practice, the number of calculations per partition varies from 150 to 180. That is an issue because the M files are all the same length and there is no way to predict which partitions will be quicker... So I am back to square one.

I have decided to try a different approach: I have only one file, which I manipulate with a combination of head/tail commands (see input file below). This should work in principle, though I then get a "Substitution for illegal variable" error message... I used this approach previously with a loop variable and it worked. Do uloop variables behave differently than loop variables? Any other suggestion as to how I can do this?

Thanks,

D

label loopb
variable index uloop 5500
variable ID world 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42

      read_restart restart_dislocation.*

      kspace_style pppm 1e-4

      thermo 10 # Output thermodynamic quantities every 10 steps
      thermo_style custom step temp vol press pe lx fmax atoms cpu

      set type 1 charge 1.84
      set type 3 charge -1.4

      variable energy equal 'pe'
      variable energy2 equal 'pe + 2599831.30'

# This should ensure that I read the loop-th line

      shell head -${uloop} id.dat > head.dat
      shell tail -1 head.dat > tail.dat
      variable number file tail.dat

      group oxygen_vacancy id ${number}
      variable oxygen delete

      dump 1 oxygen_vacancy custom 1 test_${ID} x y z

      run 0

      undump 1

      delete_atoms group oxygen_vacancy
      set type 2 charge 2.3599600457
      minimize 0.0 0.5 1000 100000

      shell echo ${index} ${energy} >> energy.dat

      shell tail -1 test_${ID} >> positions
      shell echo ${energy2} >> energy

      clear

next index

jump STO.in loopb

> It seems I was too optimistic last time; it turns out that splitting the file into M files does not work
> either. For those who have not read the previous emails, here is what I want to do: I have a file with N
> lines in it (each with an atom ID) and I want to run N simulations on M partitions, and have the Ith
> simulation, when it starts on some random partition, read the Ith line from the file.

If you re-read my earlier message, that’s what I told you will not work. That’s
not how file-style variables operate. They are for one processor to read successive
lines from a single file.

Why don’t you do something simple? Pre-process your file with N lines into
N files, each with a single line, and each with a name like foo.I, where I is an integer
from 1 to N.

Then each time some partition is assigned the Ith simulation, it defines a file-style
variable for file foo.I and reads the single value.
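A minimal sketch of this scheme; the awk one-liner, the foo.* names, and the loop skeleton are illustrative only, and the rest of the loop body would stay as in the attached scripts:

# pre-processing, outside LAMMPS: write line I of id.dat to its own single-line file foo.I
awk '{ f = "foo." NR; print > f; close(f) }' id.dat

# LAMMPS side: the Ith pass of the uloop is the Ith simulation
variable index uloop 8207
label loop
variable number file foo.${index}   # opens the Ith single-line file and reads its one value
# ... read_restart, group/delete_atoms, minimize, output, as in the attached scripts ...
variable number delete              # must be deleted so a different foo.* can be opened next pass
clear
next index
jump STO.in loop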

Steve

Steve,

Thanks for the suggestions. Here is an update:

If I split my file into single-line files, then it works. The problem is that I need to generate >10,000 files, which is a potential issue for the cluster where I am running the calculations.

I have come up with a variation of this. I have one single file that I manipulate, with a series of head/tail shell commands, to give me the Nth atom id (see attached script). This is then read by a file variable (that I delete at the end of each run). This works fine!

However, I am encountering a different problem. I am using 32 partitions and looping with a uloop variable. It is my understanding that each value of the uloop variable should be used by ONLY one partition, right? However, in a very few cases (5 times out of roughly 8000 calculations), two partitions get the same uloop value and therefore run the same calculation. This should not happen, right? FYI, I am using the Sep 30, 2013 version of LAMMPS.

Thanks,

D

INPUT SCRIPT

variable index uloop 8207
label loopb
variable ID world 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42

read_restart restart_dislocation.*

kspace_style pppm 1e-4

thermo 10 # Output thermodynamic quantities every 10 steps
thermo_style custom step temp vol press pe lx fmax atoms cpu

set type 1 charge 1.84
set type 3 charge -1.4

variable energy equal 'pe'
variable energy2 equal 'pe + 1934998.63'

shell head -${index} id.dat > ./files/head.dat_${ID}
shell tail -1 ./files/head.dat_${ID} > ./files/tail.dat_${ID}
variable number file ./files/tail.dat_${ID}
group oxygen_vacancy id ${number}
variable oxygen delete
variable number delete

dump 1 oxygen_vacancy custom 1 ./files/test_${ID} x y z

run 0

undump 1

delete_atoms group oxygen_vacancy
set type 2 charge 2.3599463190
minimize 0.0 0.5 10000 100000

shell echo ${index} ${ID} ${energy} >> ./files/energy.dat_${ID}
shell tail -1 ./files/test_${ID} >> ./files/positions_${ID}
shell echo ${energy2} >> ./files/energy_${ID}

clear

next index

jump STO.in loopb

Re: your first two points, I think that is the only way to
do it with file-style variables as currently implemented in LAMMPS.

Re: your third point, you should try the most current version
of the code. We added some randomness to the reading/writing
of the "lock" file that stores the current value (1 to N) of the uloop
variable when running on multiple partitions. Using a lock file is
no guarantee that you will never see the kind of problem you
describe (two partitions getting access to the lock file simultaneously),
but the new methodology hopefully reduces the chances of
it happening.

Steve