Simulation freezes during water insertion GCMC-MD simulation

Dear LAMMPS community,

I encountered an issue when performing molecule insertion with fix gcmc. The force field I’m using is a Moment Tensor Potential (MTP) from the MLIP package.

As the documentation explains, GCMC on molecules can only run on a single core, which is what I did. The simulation runs 4000 timesteps of NPT without any trouble, but once GCMC starts, it often freezes for no apparent reason: the log file and the trajectory file stop updating. I initially suspected a memory problem on my hardware, but when I checked the memory usage of the node there was still plenty of memory available.

I’ll put my input script below; I hope you can give me some suggestions.

LAMMPS input:

# 1) Initialization
units metal
boundary p p p
atom_style molecular 
variable T equal 445.0

# 2) System definition
read_data  h2o_16_300K_Fe_vac.data
replicate 1 2 1
mass 1 12.0107  # C
mass 2 54.938049  # Mn
mass 3 55.845  # Fe
mass 4 14.0067  # N
mass 5 22.98977  # Na
mass 6 15.9994  # O
mass 7 1.00794  # H

variable seed index 11263

# 3) Simulation settings
group O type 6
group PB type 1 2 3 4
group H2O type 6 7
group Na type 5

pair_style mlip load_from=curr.almtp extrapolation_control=true extrapolation_control:threshold_break=10 extrapolation_control:threshold_save=2 extrapolation_control:save_extrapolative_to=preselected_59.cfg
pair_coeff  * *

molecule h2omol H2O.mol
molecule Namol Na.mol
variable oxygen atom "type==6"
group oxygen dynamic all var oxygen
variable nO equal count(oxygen)

variable sodium atom "type==5"
group sodium dynamic all var sodium
variable nNa equal count(sodium)

# 4) Visualization
dump dump all atom 200 dump.gcmc_59.trj

# 5) Run
#reset_atoms id sort yes
neighbor        2 bin
neigh_modify    every 5 delay 5 check yes

velocity all create ${T} ${seed} mom yes rot yes dist gaussian
compute_modify thermo_temp dynamic yes
compute ctH2O H2O temp
compute ctPB PB temp
compute_modify ctH2O dynamic/dof yes
compute_modify ctPB dynamic/dof yes

variable dt equal 0.0005 

# 6) NPT
fix mynpt1 all npt temp ${T} ${T} $(200.0*dt) tri 1 1 $(2000.0*dt)
timestep 0.001
thermo 500
thermo_style custom step temp etotal press vol v_nO
run 4000

variable xlo equal xlo+0.1
variable xhi equal xhi-0.1
variable ylo equal ylo+0.1
variable yhi equal yhi-0.1
variable zlo equal zlo+0.1
variable zhi equal zhi-0.1
region system block ${xlo} ${xhi} ${ylo} ${yhi} ${zlo} ${zhi}


fix halt_upper O halt 2000 v_nO >= 50 error continue
#fix halt_lower O halt 2000 v_nO <= 0 error continue
fix fgcmc H2O gcmc 2000 10 50 0 11263 445.0 0 0.2 mol h2omol group H2O region system

fix 1 all recenter INIT INIT INIT
timestep 0.0004
thermo 200
run 200000

unfix halt_upper
unfix fgcmc
unfix mynpt1

fix mynpt2 all npt temp ${T} ${T} $(200.0*dt) tri 1 1 $(2000.0*dt)
thermo 200
run 100000

write_data MnFePBA_gcmc_59.data
write_dump all atom MnFePBA_gcmc_dump_59.trj

The last update of the log file:

WARNING: Fix gcmc using full_energy option (../fix_gcmc.cpp:526)
0 atoms in group FixGCMC:gcmc_exclusion_group:fgcmc
0 atoms in group FixGCMC:rotation_gas_atoms:fgcmc
Per MPI rank memory allocation (min/avg/max) = 7.002 | 7.002 | 7.002 Mbytes
   Step          Temp          TotEng         Press          Volume          v_nO     
      4000   312.20881     -2203.2233      2751.678       4253.5944      32     

It is nearly impossible to provide any assistance with this, for multiple reasons:

  • Unless there is some gross mistake in your input, nobody can tell whether there is a problem just by looking at it. Even more so since your input appears to be on the complex side, GCMC is not a trivial method to begin with, and few people have much experience with it. To reproduce and debug what you are seeing, a complete input deck is needed, ideally one that is small and very fast to run (it does not have to produce meaningful science, just trigger the issue quickly).
  • Which brings me to the next issue: you are using a pair style that is not part of LAMMPS. It has not been validated by the LAMMPS developers, and we would first have to learn how to install and use it. That is not likely to happen, so you will have to depend on assistance from somebody who is already using it, which means your best chance is to contact the developers of this pair style.
  • There is no information about the LAMMPS version you are using or the platform you are running on. You may be running into an issue that has already been corrected, if your LAMMPS version is old.
  • There is no real attempt to narrow down whether the issue originates with fix gcmc or with the pair style. When the simulation gets stuck, you could, for example, identify the process ID, attach a debugger to the running process (gdb -p <pid>), and try to identify where the simulation is stopping (this assumes you are running on Linux and have compiled LAMMPS with debug information included in the executable). That could give a valuable hint as to what is happening.
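To illustrate the last point, such a debugger session on Linux might look like the sketch below (the PID and executable name are placeholders, not from your run):

```
pidof lmp              # find the PID of the stuck LAMMPS process
gdb -p 12345           # attach to it (12345 is hypothetical)
(gdb) bt               # print the call stack to see where it is stuck
(gdb) detach           # detach without killing the simulation
(gdb) quit
```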

My (wildly speculative) guess is that one of the MC steps results in an energy or force that is not representable in floating-point math (NaN), and LAMMPS then gets stuck in a loop checking for convergence. This can be due to the pair style or the system geometry. Being able to run 4000 timesteps is not a lot, so that does not tell us much.

Many potentials behave badly when atoms are placed very close together, including very simple ones like the venerable Born and Buckingham models that go to negative infinity as r→0. This usually does not cause any trouble in MD simulations, but is very problematic in MC simulations. For this reason, we have provided the `overlap_cutoff` keyword in fix gcmc. You should try it out and see if it helps.
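As a sketch, appending the keyword to the fix gcmc line from your input could look like the line below; the 1.0 Å cutoff is only a placeholder and needs to be chosen to suit your potential:

```
# reject MC insertions/moves that place any pair of atoms closer than 1.0 Angstrom
fix fgcmc H2O gcmc 2000 10 50 0 11263 445.0 0 0.2 mol h2omol group H2O region system overlap_cutoff 1.0
```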


Many thanks for the advice. I’ll try that. The MLIP force field I’m using was trained by me. It performs almost perfectly in Na insertion GCMC, where the system has 20,000 atoms and the simulation runs on 200 cores; it can run for 2 ns without a problem.

Molecule insertion is probably much more difficult for LAMMPS to handle.

Thank you for the fast and detailed response.

The LAMMPS version I’m using is:

LAMMPS (2 Aug 2023 - Update 1)
using 1 OpenMP thread(s) per MPI task

Is there a major update after this version?

I’ll check with the debugger and hope I’ll find something useful.

You have to look at the git log or git diff to see exactly what has changed since then.
There were some changes; most were cosmetic, but a few addressed parallelization issues where a simulation could stall when an error condition occurred. I am not sure whether those apply to your case.
After about 2.5 years, it may be worth upgrading.

I tried to use gdb to see where the process got stuck, and I found this:

(gdb) bt
#0  0x000000000061037a in LAMMPS_NS::Region::match(double, double, double) ()
#1  0x0000000000aeecd0 in LAMMPS_NS::FixGCMC::attempt_molecule_translation_full() ()
#2  0x0000000000ae84b5 in LAMMPS_NS::FixGCMC::pre_exchange() ()
#3  0x000000000052f54b in LAMMPS_NS::Modify::pre_exchange() ()
#4  0x00000000006de6e5 in LAMMPS_NS::Verlet::run(int) ()
#5  0x0000000000627b1d in LAMMPS_NS::Run::command(int, char**) ()
#6  0x0000000000443947 in LAMMPS_NS::Input::execute_command() ()
#7  0x00000000004414a8 in LAMMPS_NS::Input::file() ()
#8  0x000000000040f675 in main ()


It seems that the MC moves are the real problem. So I set the number of MC moves in my fix gcmc to zero, and that fixed the issue. Thanks again for the advice!
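A sketch of the corresponding change to the fix gcmc line from the input above, assuming the third numeric argument (M, the average number of MC moves attempted per cycle) is what is zeroed:

```
# M = 0: attempt only GCMC insertions/deletions, no MC translation/rotation moves
fix fgcmc H2O gcmc 2000 10 0 0 11263 445.0 0 0.2 mol h2omol group H2O region system
```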

Thanks for letting us know. It would have been even more helpful if you had compiled LAMMPS with debug info included, so that we could see exactly which line is the problem.

After looking at the relevant source code, I see a possible problem where the code could get stuck in a while loop that has no check for a maximum number of iterations.

Since I am not able to run your input because of your pair style: if I attach a patch with some debug code to check whether my speculation is correct, would you be willing to apply it and let me know what it prints out?

Yes, I’m willing to help.

I’d need step-by-step instructions for this.

Because I’m very occupied in March, I may not be able to update you frequently.

The changes you need are here: Collected small fixes and changes by akohlmey · Pull Request #4898 · lammps/lammps · GitHub

But you need only part of it which is in the attached file.
gcmc.diff.gz (600 Bytes)

You need to download and uncompress the file and then apply the changes, either manually (they are simple enough for that) or with the patch program, run from the main LAMMPS folder: patch -p1 < gcmc.diff (search around online if you need to understand how this works).
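As an illustration of how patch -p1 works, here is a toy example; the file names and diff contents below are made up for the demonstration, not the real LAMMPS sources:

```shell
# create a stand-in source tree with one file
mkdir -p demo/src
printf 'old line\n' > demo/src/example.cpp
# a minimal unified diff, written the way git produces them (a/ and b/ prefixes)
cat > demo/gcmc_demo.diff <<'EOF'
--- a/src/example.cpp
+++ b/src/example.cpp
@@ -1 +1 @@
-old line
+new line
EOF
# -p1 strips the leading 'a/' and 'b/' path components,
# which is why patch is run from the top-level folder
(cd demo && patch -p1 < gcmc_demo.diff)
cat demo/src/example.cpp   # now contains the patched line
```

After this, recompiling picks up the changed source, which is the same workflow as for the gcmc.diff attached above.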

Then you can recompile and test. Rather than getting stuck, LAMMPS should now abort with an error message.
