Doubt using read_data command multiple times in the same code

Hello LAMMPS users,

I am using the LAMMPS version 23Jun2022.
I am using PyLAMMPS to calculate vacancy formation energy at different regions in my model. I use the read_data command to load the initial structure of my model, and calculate the initial energy of the structure. In order to calculate the vacancy formation energy, I use a loop to create one vacancy at a time at different locations in my model. I need to use read_data again to load the initial configuration of the model after each vacancy is created, so as to go back to the original model configuration and number of atoms before I delete one atom again in every iteration instead of accumulating deleted atoms. I read the LAMMPS documentation, and I used the following command:
L.read_data(“Inc800H-Cu.data add append”).

This led to the energy becoming ‘nan’ when I computed the energy. Is there a way to overcome this?

Regards,
Rajesh

By doing so you are just re-importing the atoms a second time, therefore you create a lot of overlapping between atoms that are sharing the same location, LAMMPS does not like that and returns nan value.

Why not simply restarting a new simulation every time?

Simon

Thank you for the reply, Simon. I tried read_restart inside the loop to go back to the original configuration after an atom was deleted each time, but the error then read “Cannot use read_restart after the box has been defined”. I need to have an initial read_data or read_restart command because I would need that to define the region from which I am choosing the atom to be deleted.

Regards,
Rajesh

Really? (Regions can be defined before creating the simulation box.)

I define a box using the read_data command:
L.read_data(“Inc800H-Cu.data”)

And then I choose a block of atoms from this defined box in the following way:
L.region(“R block”, 0.0, 6.0, “INF INF INF INF”)
L.group(“selreg region R”)

So, I believe I would need an existing box to do that. Correct me if I am wrong, please. Thank you.

Regards,
Rajesh

I’d imagine you know what the y and z-dimensions are of your data file, so you can hard-code those in instead of triggering an error with INFs.

Yeah, I tried that too.
L.region(“R block”, 0.0, 24.0, 0.0, 144.16, 0.0, 144.16)
L.group(“selreg region R”)
L.group(“selatom type”, 1)
L.group(“select intersect selreg selatom”)
L.variable(“nselect equal count(select)”)

It led to this error: “Group command before simulation box is defined”.

Regards,
Rajesh

You are not making much sense here:

  • you say you need a box to define a region (which is incorrect)
  • you say you don’t want the atoms that are in that box
  • but then you want to define a group with those very atoms you don’t want
  • and - of course - you cannot read a restart at that point, because you already have a box
    That is full of contradictions.

What keeps you from reversing the order and thus use the “clear” command, then “read_restart” and then define the region and the group from the region?

Or even define the region, read the data file, and then use the region to define the group?

Hello Dr. Kohlmeyer,

I think I should explain it better. I intend to calculate the vacancy formation energy in different regions of my model. For that, I divide my model into 8 regions, randomly pick one atom at a time and delete it from a region, and calculate its vacancy formation energy. I do this 300 times, where I pick 300 atoms of a particular type and calculate all their vacancy formation energies (VFE), and then take the average VFE per region. So, I have two main loops to do this. The first loop iterates through the 8 regions, and the second loop iterates through them until there are 300 atoms (the number of atoms picked from each region). I define the region in the first loop, and the region changes to the adjacent region once I am done picking 300 atoms from one region.
nregions = 8
count = 300
L.read_restart(“Initial800H-Cu.restart”)
L.pair_style(“eam/alloy”)
L.pair_coeff(“* * FeNiCrCoCu-heafixed.setfl Ni Cr Fe Cu”)
for i in range(nregions):
L.region(“R block”, begin, end, 0.0, 144.16, 0.0, 144.16)
L.group(“selreg region R”)
L.group(“selatom type”, 1)
L.group(“select intersect selreg selatom”)

for j in range(count):
     ............. # Skipping a few lines here
     L.delete_atoms("group select_atom") # deleting the selected atom
     ............  # Lines to calculate the VFE of the structure with the deleted atom absent in it

Once I pick an atom to delete and calculate the VFE of the structure after deleting that atom, I need to “undelete” that atom to bring the model to its initial configuration, before choosing another atom to delete and calculate the VFE of the structure. I tried doing that in the following way:
L.clear
L.read_restart(“Initial800H-Cu.restart”) # Did not work
And then the loop goes on to choose another atom from the initial structure to delete.

Regards,
Rajesh

First off, please see Please Read This First: Guidelines and Suggestions for posting LAMMPS questions for guidelines on how to correctly format quoted inputs and also about the uselessness of statement like “it does not work”.

Second, this is a long-winded explanation about something that is very simple. You are just not thinking like a programmer. The situation is:

  • there is no “undelete” for atoms, so you must restore the previous state
  • you must use “clear” to read a previous state from a restart with “read_restart”
  • the same is true for “read_data” only that less information is contained in a data file (it is far more portable to make up for it).
  • you lose all previous settings like groups, regions, etc. in the LAMMPS instance with “clear” since that literally destroys the LAMMPS simulations state and creates a fresh one. Only information that is contained in the restart is recreated

So the logical conclusion is that within your loops you must issue a clear command before resetting the system with read_restart or read_data. At that point you will also have to recreate all relevant settings that were in the LAMMPS instance before. Since you created them from your python script, you can easily recreate them since you do not lose the information of the python instance.

This is all very straightforward and just requires reviewing the documentation. You may need to adjust the logic and workflow of your script accordingly, but that is a common thing when programming: you cannot force your way of thinking on a computer, but you have to work with its limitations and its own logic.

P.S.: I would recommend against using PyLammps and instead use the regular lammps python module. PyLammps is by construction more fragile and the nature of its implementation has a significant negative impact on performance. For a larger programming project, it is straightforward to use wrapper functions that utilize the available introspection features to extract the information that PyLammps conveniently collects.

Dr. Kohlmeyer,
Thank you for that detailed explanation. I put the comment “# Did not work” there because using the read_restart command even after using the ‘L.clear’ command led to the error “Cannot read_restart after simulation box is defined” again. The ‘clear’ command should have cleared all the previous settings, and the read_restart command should have worked. But for some reason, I get that error.

Regards,
Rajesh

Have you tried running your vacancy energy measurement routine as a plain LAMMPS script instead of in PyLAMMPS?

Yes, I have. But since I need to put all the atoms in a region of a certain type in a text file (say ‘atom.txt’), from which I choose a random atom to be deleted in every iteration, I deemed PyLAMMPS to be more convenient. Also, I have three loops, one for the regions, one for the total number of atoms chosen to average the VFE over, and a third loop to go through the atom.txt file to pick an atom to be deleted randomly. In LAMMPS, it seemed difficult to go through the text file and pick atoms randomly.

Regards,
Rajesh

My advice – take it or leave it – is to “detox” from doing any programming with LAMMPS (including PyLAMMPS). Write up your analysis as a plain LAMMPS script using only “if-then” statements (non-nested!!) and index variables with default values. Then control the script completely externally. Personally I love Bash loops (I am allergic to Python because I always my friends always mess it up on HPC) but you could use Python (with multiprocessing or Dask or whatever fancy snake fangs are out there these days).

This will simplify your process tremendously, because you then only have to write a LAMMPS script that works once, and then you know that you will be able to loop it properly. Right now the leap from 1 iteration to 2 iterations is killing you. Why let it?

Here is a script for a void ray remote control:

variable i index 1 # input file
variable d index 1 # delete style
variable r index 1 # region style
variable s index 1 # RNG seed

read_data input_$i.data
if "$r == 1" then "region void block 2 6 2 6 2 6" &
elif "$r == 2" "region void sphere 3 3 3 2" # note, only first if gets then
group type2 type 2
if "$d == 1" then "delete_atoms region void" &
elif "$d == 2" "delete_atoms random count 1 no all void $s" &
elif "$d == 3" "delete_atoms random count 1 no type2 void $s"

# force field here

run 0
print "$i $d $r $s $(pe)" file i$i-d$d-r$r-s$s.out

Run with default values:

lmp -i void_ray.in

Run with custom values:

lmp -i void_ray.in -v r 2 -v s 1092 -v i 4

Loop-de-loop:

for i in {1..9..2}; do for s in 3 10 489; do lmp -i void_ray.in -v s $s -v i $i; done; done

And all possible once the script works once.