ReaxFF simulation of a metal nanoparticle is too slow

Hi there,

A nickel nanoparticle 4 nm in diameter is fixed at the center of a 10x10x10 nm box, and water molecules are distributed around the nanoparticle. Why does the computation become very slow once the nanoparticle is added? I have used fix balance to rebalance the system every 500 timesteps, and the %varavg value in the output file is around 168.

Is there any other way to speed up the calculation besides fix balance?

There is not enough information here for any meaningful advice.

Load balancing is at best a secondary concern. There are many other options to look at first.

The very first thing you need to show is what you are comparing to what, i.e. what your system looks like before you add the nanoparticle and what it looks like after. Ideally, provide some logfile output with the timing summary for both cases. The input files for both cases are also crucial information.
It is also best to do this without load balancing.
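
A minimal sketch of such a comparison run, to be used at the end of each of the two inputs in place of the production run. The run length of 1000 steps is an arbitrary assumption; anything long enough to give stable timings will do.

```lammps
# Timing test: use the same settings for both systems and do NOT
# include "comm_style tiled" or "fix balance" in this run.
thermo 100
run 1000        # short run; 1000 steps is an illustrative choice

# At the end of the run the log file prints the timing summary:
#   "Loop time of ..." with the number of MPI processes, steps, and atoms
#   "MPI task timing breakdown" with Pair, Neigh, Comm, Output, Modify, Other
#   neighbor statistics such as "Ave neighs/atom"
```

Posting that part of the log for both systems shows where the extra time is spent.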

Second, you need to let us know how many CPUs of what kind you are using, what platform you are on, and how you are running LAMMPS.
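
Much of this can be printed by LAMMPS itself. As a sketch, assuming your LAMMPS version supports these categories of the info command, a line like the following can be placed after the read_data command:

```lammps
# Print LAMMPS version and OS details (configuration), an overview of the
# simulation system (system), and the MPI/OpenMP layout (communication).
info configuration system communication
```

The exact command line used to launch LAMMPS (for example the mpirun call) still has to be reported separately.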

Hi,

Thank you very much for your reply.
The details of the simulation systems are shown in the following figure.

The input file for system 1:
units real
boundary p p p
atom_style charge

read_data C10W600.data
pair_style reax/c lmp_control safezone 16 mincap 10000
pair_coeff * * CHONSFPtCl.ff C H O N S F Pt Cl X
neighbor 2 bin
neigh_modify every 10 delay 0 check yes
velocity all create 300 4928459 rot yes dist gaussian
timestep 0.25

fix 1 all qeq/reax 1 0.0 10.0 1e-6 reax/c
fix 2 all nvt temp 300.0 300.0 25.0

thermo_style custom step temp press density
thermo 100
run 19400

unfix 2
reset_timestep 0
fix 4 all nvt temp 300.0 2000.0 25.0
fix 6 all reax/c/species 1 1 1000 C10W600-2000K.out element C H O N S F Pt Cl X
fix 7 all reax/c/bonds 10000 C10W600-2000K-bonds.txt

dump TRAJ all custom 100 C10W600-2000K.xyz id type element x y z
dump_modify TRAJ element C H O N S F Pt Cl X

restart 100000 Restart.restart
run 500000

The input file for system 2:
units real
boundary p p p
atom_style charge

read_data C10W600-4.data
pair_style reax/c lmp_control safezone 16 mincap 10000
pair_coeff * * CHONSFPtCl.ff C H O N S F Pt Cl X
neighbor 2 bin
neigh_modify every 10 delay 0 check yes
velocity all create 300 4928459 rot yes dist gaussian
timestep 0.25
group NPs type 9
group ben type 1 2 3

fix 1 all qeq/reax 1 0.0 10.0 1e-6 reax/c
fix 2 all nvt temp 300.0 300.0 25.0
fix 3 NPs momentum 100 linear 1 1 1 angular

thermo_style custom step temp press density
thermo 100

comm_style tiled
fix fixbalance all balance 500 1 rcb weight group 2 NPs 30 ben 1.0

run 19400

unfix 2
reset_timestep 0
fix 4 all nvt temp 300.0 2000.0 25.0
fix 6 all reax/c/species 1 1 1000 C10W600-4-2000K.out element C H O N S F Pt Cl X
fix 7 all reax/c/bonds 10000 C10W600-4-2000K-bonds.txt

dump TRAJ all custom 100 C10W600-4-2000K.xyz id type element x y z
dump_modify TRAJ element C H O N S F Pt Cl X

restart 100000 Restart.restart
run 500000

**lmp_control file**
simulation_name Blends_LowT

tabulate_long_range 10000
energy_update_freq 1

nbrhood_cutoff 4.5
hbond_cutoff 6.0
thb_cutoff 0.001
bond_graph_cutoff 0.3
write_freq 10000
traj_title Blends_LowT
atom_info 1
atom_forces 0
atom_velocities 0
bond_info 0
angle_info 0

Many thanks for your help.

You are probably using too many MPI processes for your rather small system size. At some point, adding more processors only adds overhead, not performance.

The difference in performance is quite obviously explained by the different total number of neighbors and neighbors per atom. The additional computational effort for the nanoparticle is substantial compared to the previous, dilute system, since the cost grows significantly with the number of neighbors. For each neighbor of an atom you need to compute pairwise interactions and then bond orders, and for those neighbors above the bond order cutoffs you also need to compute bond, angle, and dihedral interactions. So the computational cost of the added nanoparticle could easily be 20x higher than for the rest of the system.
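
The neighbor statistics at the end of each run in the log file (e.g. "Ave neighs/atom") already show this difference. To see it separately for the nanoparticle and for everything else, one option is a coordination-number compute. This is only a rough proxy and not the actual pair neighbor list; it assumes the NPs group defined in the system 2 input, and the 4.5 Angstrom cutoff and the names notNP, nn, nnNP, and nnRest are illustrative choices.

```lammps
# Everything that is not part of the nanoparticle
group notNP subtract all NPs

# Per-atom count of atoms within 4.5 Angstrom (illustrative cutoff)
compute nn all coord/atom cutoff 4.5

# Average of that count over the nanoparticle and over the rest
compute nnNP   NPs   reduce ave c_nn
compute nnRest notNP reduce ave c_nn

# Print both averages next to the usual thermo output
thermo_style custom step temp press density c_nnNP c_nnRest
```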

As for load balancing: it could certainly be helpful, but the default metric, the atom density, is not suitable, for the same reason that your computational effort increased so much. You should try the number of neighbors as the metric instead, or perhaps a combination, or assign per-type weights, etc.
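
What this can look like in an input script, as a sketch rather than a tuned setup. The weight keyword and its neigh, time, group, and var styles are described in the fix balance documentation. The threshold of 1.1, the factor of 0.8, and the weight of 30 are illustrative assumptions (the 30 is taken from the input posted above), and the fix ID lb and variable name w are hypothetical. Only one of the fix balance lines should be active at a time.

```lammps
comm_style tiled          # needed for the rcb balancing style

# Option 1: weight by the number of neighbors owned by each processor
fix lb all balance 500 1.1 rcb weight neigh 0.8

# Option 2: combine neighbor count and measured time per processor
# (weights from multiple weight keywords are multiplied)
# fix lb all balance 500 1.1 rcb weight neigh 0.8 weight time 0.8

# Option 3: static per-group weights, with groups defined by atom type
# fix lb all balance 500 1.1 rcb weight group 2 NPs 30 ben 1.0

# Option 4: per-type weights through an atom-style variable
# (30 for type 9, 1 for every other type)
# variable w atom 1.0+29.0*(type==9)
# fix lb all balance 500 1.1 rcb weight var w
```

Whether the neigh style finds a suitable neighbor list with a given pair style is something to check in the log, where LAMMPS prints a warning if it does not. Comparing the timing summary of short runs for each option is the practical way to choose between them.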

Thank you very much for your help. I am not sure what you mean by 'try the number of neighbors as metric instead, or perhaps a combination, or assign per-type weights etc.'. How should I improve my input? Does assigning per-type weights mean using the weight keyword of the fix balance command?

Please read the documentation, use common sense, and show some initiative. I very much dislike having to repeat what I already wrote.

Sorry, I will read the documentation. Thank you again for your help.