Different minimisation convergence using CPUs only and with GPU

Hi Lammps users and developers,

I am simulating the interface between two materials (both parametrised with the OPLS-AA force field). My system has 100,000 atoms, and the two materials start out in two separate phases. The LAMMPS version used is 22Aug2018.

Because the initial arrangement is quite artificial, I wanted to perform a minimisation before starting the dynamics. I ran the same input file on the same machine with and without the GPU package enabled, and the minimisation converged differently.
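The minimisation part of the input looks roughly like this (a sketch only: the data file name, cutoffs, tolerances, and command-line switches shown here are illustrative placeholders, not the exact script from these runs):

```
# illustrative only -- not the exact input script from these runs
units           real
atom_style      full
read_data       interface.data      # hypothetical data file name

# OPLS-AA style pair setup (cutoff is a placeholder)
pair_style      lj/cut/coul/long 10.0
kspace_style    pppm 1.0e-4

# conjugate-gradient minimisation
min_style       cg
minimize        0.0 1.0e-8 10000 100000

# for the GPU runs, the same script was run with, e.g.:
#   lmp -sf gpu -pk gpu 1 -in in.interface
```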

CPUs only:

Step E_pair TotEng Pxx Pyy Pzz Lx Ly Lz Temp Density

0 404985.05 445785.15 65014.417 58430.156 18272.363 200 172 69 0 0.50901997

3200 -8257.1182 17539.693 -126.79774 -143.96579 -66.22431 200 172 69 0 0.50901997
3300 -8557.4651 17231.663 -143.08402 -133.75736 -84.326089 200 172 69 0 0.50901997
3400 -8911.197 16908.058 -126.1902 -115.47025 -35.829904 200 172 69 0 0.50901997
3500 -9194.0254 16618.456 -114.95276 -73.932621 -13.719999 200 172 69 0 0.50901997
3600 -9466.6124 16204.191 -148.81724 -104.63668 -43.779435 200 172 69 0 0.50901997
3652 -9492.1536 16162.772 -149.36996 -103.51686 -46.26485 200 172 69 0 0.50901997
Loop time of 774.636 on 8 procs for 3652 steps with 78014 atoms

83.5% CPU use with 8 MPI tasks x no OpenMP threads

Minimization stats:
Stopping criterion = linesearch alpha is zero
Energy initial, next-to-last, final =
445785.154316 16162.7721486 16162.7721486
Force two-norm initial, final = 133349 51.7101
Force max component initial, final = 3337.07 35.3162
Final line search alpha, max atom move = 5.61355e-12 1.98249e-10
Iterations, force evaluations = 3652 7388

GPU mixed precision:

Step E_pair TotEng Pxx Pyy Pzz Lx Ly Lz Temp Density
0 404985.82 445785.92 65014.422 58430.404 18272.413 200 172 69 0 0.50901997
100 12008.022 39216.281 2422.9404 241.61927 -36.912503 200 172 69 0 0.50901997
193 11406.731 38127.281 1722.5556 12.021991 -109.0406 200 172 69 0 0.50901997
Loop time of 6.2557 on 8 procs for 193 steps with 78014 atoms

89.7% CPU use with 8 MPI tasks x no OpenMP threads

Minimization stats:
Stopping criterion = linesearch alpha is zero
Energy initial, next-to-last, final =
445785.924177 38127.2809885 38127.2809885
Force two-norm initial, final = 133349 38.6507
Force max component initial, final = 3337.18 0.540374
Final line search alpha, max atom move = 5.38587e-12 2.91038e-12
Iterations, force evaluations = 193 547

GPU double precision:

Step E_pair TotEng Pxx Pyy Pzz Lx Ly Lz Temp Density
0 404985.82 445785.92 65014.422 58430.404 18272.413 200 172 69 0 0.50901997
100 12008.022 39216.281 2422.9404 241.61928 -36.912514 200 172 69 0 0.50901997
181 11421.53 38195.409 1672.0603 26.671128 -143.32993 200 172 69 0 0.50901997
Loop time of 6.82913 on 8 procs for 181 steps with 78014 atoms

90.2% CPU use with 8 MPI tasks x no OpenMP threads

Minimization stats:
Stopping criterion = linesearch alpha is zero
Energy initial, next-to-last, final =
445785.924177 38195.4087411 38195.4087411
Force two-norm initial, final = 133349 88.5156
Force max component initial, final = 3337.18 1.01164
Final line search alpha, max atom move = 1.43844e-12 1.45519e-12
Iterations, force evaluations = 181 606

In every case the stopping criterion is "linesearch alpha is zero", and the next-to-last and final energies are the same. The convergence is similar for GPU-enabled LAMMPS with mixed and double precision, but different without the GPU. I have searched the user list and found comments suggesting this can happen when a system has a bad initial geometry and is not dynamically stable. I have visualised all of the minimisation runs and saw no significantly unphysical behaviour (which at least suggests my force-field settings work reasonably well). Can anyone suggest reasons for the difference, and what I should do to fix it?

what is there to fix? you have a system with 300,000 degrees of
freedom, so you are searching for a minimum on a 300,000-dimensional
potential hypersurface. there are a vast number of local minima, and
because you start from a point of very high potential energy, it is
very likely that the tiniest difference sends the minimization down a
completely different path to a different (local) minimum.
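to illustrate with a toy python sketch (not lammps, and a deliberately simple one-dimensional analogy): even on a double-well potential, two starting points that differ by ~1e-12 near the high-energy stationary point fall into different minima under plain gradient descent:

```python
# gradient descent on the double-well potential f(x) = (x^2 - 1)^2.
# the stationary point x = 0 is a maximum; two starts differing by
# 2e-12 end up in different minima (x = +1 vs x = -1), showing how a
# high-energy start amplifies the tiniest numerical difference.

def grad(x):
    # f'(x) = 4 x (x^2 - 1)
    return 4.0 * x * (x * x - 1.0)

def minimize(x, lr=0.01, steps=4000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

a = minimize(+1e-12)   # converges to the minimum near +1
b = minimize(-1e-12)   # converges to the minimum near -1
print(a, b)
```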

if you believe there is a significant difference between the CPU and
GPU implementations of the pair styles, you have to provide better
proof with a smaller system.
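one concrete source of such tiny differences (my own aside, not something established in this thread): cpu and gpu builds typically accumulate per-atom force contributions in a different order, and floating-point addition is not associative, so bit-identical results are not expected even in double precision:

```python
# floating-point addition is not associative: the same three numbers
# summed in a different order give results that differ in the last bit.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left, right, left == right)   # 0.6000000000000001 0.6 False
```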

axel.

Hi Axel,

Thanks for your reply. I will run a test on a smaller system to see if there is any difference.