Accelerate simulation speed_LAMMPS

Rakesh_Barik · June 2, 2019, 10:18am

Hello everyone,

I am running a program with approx. 10 lakhs atoms. It's running very slow. What are the possible ways to increase the computation speed without reducing the number of atoms?

Thanks

akohlmey · June 2, 2019, 11:41am

> I am running a program with approx. 10 lakhs atoms.

please note that few people outside of India know what a lakh is. why not just write 100k or 100,000 and avoid confusion?

> It's running very slow.

that is a very unscientific description. can you quantify “very slow”? what is the performance you get versus the performance you expect?

> What are the possible ways to increase the computation speed without reducing the number of atoms?

to give meaningful advice, one first has to know what kind of input you are running (what styles, fixes, computes, etc.), how you are running it (what command line), how effectively you are making use of parallelization and how well balanced it is, and finally what kind of hardware/OS you are running on and what alternate resources you have available for running your simulation.
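
one way to quantify performance is to read the "Loop time" summary line that LAMMPS prints to the screen and log file at the end of each run. the following is only a minimal sketch in Python: the helper name `loop_stats` and the sample line in the usage note are made up for illustration and are not output from this simulation.

```python
import re

# Pattern for the run summary line printed by LAMMPS, which looks like:
#   Loop time of <seconds> on <N> procs for <M> steps with <K> atoms
LOOP_RE = re.compile(
    r"Loop time of\s+(?P<secs>[0-9.eE+-]+)\s+on\s+(?P<procs>\d+)\s+procs\s+"
    r"for\s+(?P<steps>\d+)\s+steps\s+with\s+(?P<atoms>\d+)\s+atoms"
)


def loop_stats(line):
    """Return simple throughput numbers from one 'Loop time' line, or None."""
    match = LOOP_RE.search(line)
    if match is None:
        return None
    secs = float(match.group("secs"))
    steps = int(match.group("steps"))
    atoms = int(match.group("atoms"))
    return {
        "procs": int(match.group("procs")),
        "steps_per_s": steps / secs,
        # atom-steps per second is easier to compare across system sizes
        "atom_steps_per_s": atoms * steps / secs,
    }


if __name__ == "__main__":
    # Hypothetical sample line, invented for this example.
    sample = "Loop time of 3600 on 1 procs for 500 steps with 100000 atoms"
    print(loop_stats(sample))
```

comparing `steps_per_s` between two short test runs (for example with different process counts) gives a number that can be discussed, rather than "very slow".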

axel.

Rakesh_Barik · June 3, 2019, 3:38am

Thank you sir for your suggestion. Actually I am running “simulated annealing” and the script is provided below.

I am using a workstation with Windows 10 OS.
Processor details: Intel Xeon (1 socket, 8 cores, 16 logical processors)

command line: lmp_serial -sf omp -pk omp 16 -in annealing.in.txt
The CPU utilization is showing only 9%, and it is taking approx. 1 hour for every 500 timesteps.
Kindly suggest what command line I should use and how to run the simulation in parallel to accelerate it.

INPUT SCRIPT:

# Simulated annealing from 800K to 10K followed by minimization at 0K

# ------------------------------INITIAL STRUCTURE--------------------------------

clear
log annealing.log
units metal
dimension 3
boundary p p p
atom_style atomic
read_data Pearlite_relaxed.dat

pair_style meam
pair_coeff * * Fe3C_library_Liyanage_2014.meam Fe C Fe3C_Liyanage_2014.meam Fe C
compute PE_atom all pe/atom
compute PE all reduce sum c_PE_atom

# ------------------------EQUILIBRATION AT 800K FOR 100ps------------------------

# set timestep

reset_timestep 0
timestep 0.002

# set temperature at 800K

velocity all create 800 12345

# Equilibrate at 800K for 100ps

fix 1 all npt temp 800 800 0.2 iso 0 0 2 drag 1

# NOTE: Tdamp value = 100 timesteps & Pdamp = 1000 timesteps & drag = 0.2-2 for damping oscillations

thermo 500
thermo_style custom step vol press pxx pyy pzz pe temp
dump 1 all custom 1000 ./datafiles/equil_800K_*.dump id type mass x y z c_PE_atom fx fy fz
run 50000
unfix 1
undump 1

# ------------------------SLOW COOLING TO 10K WITHIN 100ps-----------------------

# set timestep

reset_timestep 0
timestep 0.002

# cooling to 10K at 7.9K/ps rate

fix 2 all npt temp 800 10 0.2 iso 0 0 2

# NOTE: cooling rate = 790/100 = 7.9 K/ps

thermo 500
thermo_style custom step vol press pxx pyy pzz pe temp
dump 2 all custom 1000 ./datafiles/cool_10K_*.dump id type mass x y z c_PE_atom fx fy fz
run 50000
unfix 2
undump 2
write_data ./datafiles/Pearlite_afterannealing_10K.dat

# COMPLETED
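
For reference, the reported rate can be turned into a projected wall time for the whole script. This is only a back-of-the-envelope sketch in Python, assuming the figures quoted in this post (about 1 hour per 500 timesteps, two `run 50000` commands, 16 logical processors); the function name is made up for illustration.

```python
def projected_hours(total_steps, steps_per_hour):
    """Wall time in hours if the observed rate stays constant."""
    return total_steps / steps_per_hour


steps_per_hour = 500          # reported: approx. 1 hour for every 500 timesteps
total_steps = 50000 + 50000   # equilibration run + cooling run in the script
timestep_ps = 0.002           # "timestep 0.002" in metal units (picoseconds)

hours = projected_hours(total_steps, steps_per_hour)
days = hours / 24
simulated_ps = total_steps * timestep_ps

# Share of total CPU capacity used by a single busy thread
# on a machine with 16 logical processors.
one_thread_share_percent = 100 / 16

if __name__ == "__main__":
    print(f"projected wall time: {hours:.0f} hours ({days:.1f} days)")
    print(f"simulated time: {simulated_ps:.0f} ps")
    print(f"one busy thread out of 16: {one_thread_share_percent:.2f}% CPU")
```

At this rate the two runs would need about 200 hours. One busy thread on 16 logical processors corresponds to 6.25% utilization, which is at least roughly in line with the 9% shown, so the job may effectively be running on a single core.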

akohlmey · June 3, 2019, 10:07am

none of the features you use, especially not the pair style, support multi-threading with OpenMP through USER-OMP (there is no meam/omp pair style), so using suffix omp and 16 threads is not doing anything. also, with hyper-threading enabled, the performance gain from the additional hyper-threads is minimal, so going beyond the 8 physical cores would mostly add parallel overhead from threading in LAMMPS.

what you have to do is use an MPI-enabled binary and then run 8 (and not 16) processes with mpiexec.
if that is not fast enough, you have to run on a high-performance cluster with multiple nodes. at 100k atoms, you should be able to see some speedup with MPI until about 100-200 processes, assuming a regular geometry and thus no load balancing issues.
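
the reasoning behind these process counts can be sketched as simple arithmetic. this is a rough illustration in Python, not a tuning rule: it assumes 2 hardware threads per core (hyper-threading) and the 100k atom count used above. the helper names are made up, and the `-localonly` flag mirrors the MPICH2 command line that appears later in this thread (other MPI launchers typically use `-np` instead).

```python
def physical_cores(logical_processors, threads_per_core=2):
    """Estimate physical cores, assuming hyper-threading with 2 threads/core."""
    return logical_processors // threads_per_core


def atoms_per_process(n_atoms, n_procs):
    """Average number of atoms each MPI process owns for an even decomposition."""
    return n_atoms / n_procs


def mpi_command(n_procs, input_file, exe="lmp_mpi"):
    """Build the launch command as an argument list (not executed here)."""
    return ["mpiexec", "-localonly", str(n_procs), exe, "-in", input_file]


if __name__ == "__main__":
    n_procs = physical_cores(16)  # 16 logical processors -> 8 MPI processes
    print(" ".join(mpi_command(n_procs, "annealing.in.txt")))

    n_atoms = 100_000             # atom count assumed in the reply above
    for procs in (8, 100, 200):
        print(procs, "processes:", atoms_per_process(n_atoms, procs), "atoms each")
```

with 100-200 processes each one owns only about 500-1000 atoms, which is where communication overhead usually starts to outweigh the gain from adding more processes.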

you can find some more discussion on the performance of different packages and settings in the “Accelerate Performance” chapter of the LAMMPS manual.

axel.

Rakesh_Barik · June 3, 2019, 4:47pm

After installing MPICH2, I followed the steps for integrating it with the system (executing smpd.exe -install).
Then, in a new regular cmd window, I ran the command line: mpiexec -localonly 8 lmp_mpi -in in.code

But I got the following error:
CreateProcess(‘lmp_mpi -in in.code’) failed, error 2
launch_process failed.
launch failed: CreateProcess(lmp_mpi -in in.code) on ‘DESKTOP-3DVD925’ failed, error 2 - The system cannot find the file specified.

akohlmey · June 3, 2019, 4:52pm

the error message is quite clear. do you have an executable called lmp_mpi.exe installed somewhere in your path?

last time you reported using the serial (i.e. non-MPI) version of LAMMPS. just installing MPICH2 by itself doesn’t automatically give you the lmp_mpi.exe executable. you have to uninstall the serial package and install the parallel one in its place.
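
a quick way to check whether the executable can be found is to search the path programmatically. this is a small sketch using Python's standard `shutil.which`; the wrapper name `find_executable` is made up for illustration. on Windows, `shutil.which` also tries the usual extensions such as `.exe`.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    for exe in ("mpiexec", "lmp_mpi"):
        path = find_executable(exe)
        if path is None:
            print(f"{exe}: not found on PATH")
        else:
            print(f"{exe}: {path}")
```

if `lmp_mpi` is reported as not found, mpiexec will fail with the same "cannot find the file specified" error as above.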

axel.

Rakesh_Barik · June 3, 2019, 5:25pm

Yes, finally it worked. Thank you so much sir.