Energy Minimization (SD) and MD (NVT) in different machines and LAMMPS versions

jfernando · March 11, 2025, 8:48pm

MATSCI_enquiry.zip (62.4 KB)
Note: I will state the main questions at the end. I mention this now only to keep in mind what I am trying to understand, so that the purpose is not lost among the details that follow.

Following up on the initialization of the simulation box from my previous question: thanks to akohlmey I managed to create the simulation box with randomly positioned molecules. But during Energy Minimization (SD) and MD (NVT) I found some inconsistencies in the outputs between the 3 different machines I am testing:

1. A personal Ubuntu laptop, with a pre-built LAMMPS (7 Feb 2024 - Update 1) executable.
2. A workstation, with LAMMPS (29 Aug 2024 - Update 1) built with CMake. The “cmake …/cmake” build options are:

cmake -D PKG_GPU=YES -D GPU_API=CUDA -D GPU_PREC=mixed -D GPU_ARCH=sm_89 -D GPU_DEBUG=NO -D CUDPP_OPT=NO -D CUDA_MPS_SUPPORT=YES -D CUDA_BUILD_MULTIARCH=YES -D USE_STATIC_OPENCL_LOADER=NO -D BUILD_OMP=on -D PKG_OPENMP=YES -D PKG_MOLECULE=yes -D PKG_KSPACE=yes -D USE_INTERNAL_LINALG=yes -D PKG_ELECTRODE=yes -D INTEL_ARCH=cpu -D INTEL_LRT_MODE=threads -D LAMMPS_ASYNC_IMD=yes -D MOLFILE_INCLUDE_DIR=/usr/local/lib/vmd/plugins/include -D PKG_MOLFILE=yes -D PKG_RHEO=yes -D PKG_BPM=yes -D DOWNLOAD_VORO=yes -D Python_EXECUTABLE=/usr/lib/python3.12 -D ZLIB_INCLUDE_DIR=/usr/include/zlib.h -D ZLIB_LIBRARY=/usr/lib/x86_64-linux-gnu -D BIN2C=/usr/local/cuda-12.8/bin/bin2c -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.8 …/cmake

3. A node of a supercomputer, with LAMMPS (29 Aug 2024 - Update 1). Unfortunately I do not know how the operators installed it, but since it serves university users in general, I assume they compiled most of the available packages. It may also be worth mentioning that the nodes are CPU-only.

First of all, on my personal laptop (1) the random seed used for positioning the molecules affects the initial energy and forces. Using a sequence of integers as random seeds, it works better to give each randomly positioned molecule type its own seed than to reuse the same one. That is, with the same seed=123456789 for all the molecule types the placement worked worse than with different seeds, as follows:

molecule PGAC PGAC.txt
create_atoms 0 random 36 123450 box mol PGAC 36 overlap 0.6 maxtry 450 #0.2-0.6 same pe
molecule tOMCA trans-OMCA.txt
create_atoms 0 random 54 123456 box mol tOMCA 54 overlap 1.9 maxtry 40
molecule H2O H2O.txt
create_atoms 0 random 6104 1234567 box mol H2O 6104 overlap 1.6 maxtry 1400
molecule Br Br.txt
create_atoms 0 random 72 12345678 box mol Br 72 overlap 2.0 maxtry 90
molecule Na Na.txt
create_atoms 0 random 54 123456789 box mol Na 54 overlap 2.0 maxtry 80 #1.9-2.0

Specifically, with the same random seed=123456789 for all molecule types, I had to either increase the maximum number of tries (maxtry) or decrease the minimum overlap distance, which resulted in higher initial energy and forces. This only affected the number of SD steps needed to relax the system until the tolerance criteria were met; as expected, it did not affect the average final energy and forces. However, these initial energy and force values did vary between machines, and thus between LAMMPS versions, as follows:

Personal laptop: initial E_pair= 66458.605 (real units: kcal/mol)
Workstation: initial E_pair= 2233597.8 (real units: kcal/mol)
Node: initial E_pair= 2233597.8 (real units: kcal/mol)

Repeating the same task on the same machine with fixed random seeds gives identical results, but the 1st case differs from the last two. However, since the energy and force results at the end of the stage (stopped by the energy tolerance criterion) are very similar on the 3 computers, I infer that the difference lies not in how the machines compute energies and forces, but in how the simulation box is initially built.

Personal laptop: final E_pair= -81540.743 (real units: kcal/mol) after 24975 SD steps. etol=1e-7
Workstation: final E_pair= -81272.469 (real units: kcal/mol) after 27825 SD steps. etol=1e-8
Node: final E_pair= -81312.26 (real units: kcal/mol) after 28040 SD steps. etol=1e-7
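To put a number on “very similar”, the spread of the three final E_pair values relative to the magnitude of their mean can be computed directly from the figures listed here. A small sketch using awk (the variable name spread is only illustrative):

```shell
# Final E_pair values (kcal/mol) copied from the three runs listed in the post.
spread=$(LC_ALL=C awk 'BEGIN {
    e[1] = -81540.743; e[2] = -81272.469; e[3] = -81312.26
    min = e[1]; max = e[1]; sum = 0
    for (i = 1; i <= 3; i++) {
        if (e[i] < min) min = e[i]
        if (e[i] > max) max = e[i]
        sum += e[i]
    }
    mean = sum / 3
    if (mean < 0) mean = -mean          # use the magnitude of the mean
    printf "%.2f", 100 * (max - min) / mean
}')
echo "spread of final E_pair: ${spread} % of the mean magnitude"
```

This gives a spread of 0.33 %, compared with a factor of more than 30 between the initial E_pair values.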

The fact that the 1st case took about 3000 fewer steps than the node to reach the same etol=1e-7 must be related to the lower initial energy of the initial box. This suggests that the algorithm that positions the molecules randomly works differently between the 7 Feb 2024 (personal laptop) and 29 Aug 2024 (workstation and node) versions. The same input seems to distribute the molecules more evenly with the 7 Feb 2024 version, reducing the initial pair potential energy.

Finally, although the initial E_pair values are identical between the 2nd and 3rd cases (29 Aug 2024 version), the workstation needed a tighter etol=1e-8 to reach comparable average energy and force values. Besides, in the subsequent MD-NVT stage the workstation required a smaller timestep to avoid the following error: “ERROR on proc xx: Bond atoms x x missing on proc xx at step xx”

Personal laptop: timestep=1.6 fs
Workstation: timestep=1.4 fs
Node: timestep=1.6 fs

Thus, the main questions are:

1. Is there a difference between versions in the random positioning algorithm?
2. Which packages may be missing or superfluous in the workstation build, such that etol and timestep had to be decreased for it to work as desired?
3. Which other available packages may be worth adding so that the workstation’s LAMMPS installation works properly for soft matter & rheology purposes? I ask because in my first installation the MOLECULE pkg was missing, and with it the “full” atom style. But so far, with MOLECULE and the other pkgs, the ATB-extracted FFs seem to be working properly.

Best regards

akohlmey · March 11, 2025, 9:49pm

Most of the relevant information can be seen when you run lmp -h.
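For comparing the three machines, one option is to save that output on each of them and compare the files afterwards. A minimal sketch, assuming the executable is called lmp (on a cluster it may be lmp_mpi or similar) and using a made-up output file name:

```shell
# Assumption: the LAMMPS executable is named "lmp"; override with LMP=... if not.
LMP="${LMP:-lmp}"
# Hypothetical output file name, one file per machine.
OUT="lmp_help_$(uname -n).txt"

if command -v "$LMP" >/dev/null 2>&1; then
    # The help output includes the version, compile-time settings,
    # and the list of installed packages.
    "$LMP" -h > "$OUT"
    echo "wrote $OUT"
else
    echo "LAMMPS executable '$LMP' not found in PATH" >&2
fi
```

Copying the resulting files to one machine and running diff on a pair of them shows where the builds differ in packages and settings.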

A simple yet consistent way to produce a reproducible, but not repetitive, set of random number seeds is to define a variable like the following:

variable seeds equal ceil(random(10000,100000000,87346))

And then use ${seeds} to get randomized random seeds.
E.g.:

velocity        all create 3.0 ${seeds} loop geom

On differences in the random positioning algorithm: I vaguely remember some change in how “overlap” is interpreted, but the only reliable answer is the one you get by comparing the two versions with git and looking at the changelog. git blame can also sometimes help to see what was last changed when.
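The git comparison could look roughly like the sketch below, run inside a clone of the LAMMPS repository. The two tag names are assumptions based on the version strings in this thread and should be checked with git tag -l first; src/create_atoms.cpp is where the create_atoms command is implemented.

```shell
# Assumed tag names; verify with: git tag -l '*2024*'
OLD_TAG="patch_7Feb2024_update1"
NEW_TAG="stable_29Aug2024_update1"
FILE="src/create_atoms.cpp"

if git rev-parse --verify --quiet "$OLD_TAG" >/dev/null 2>&1 &&
   git rev-parse --verify --quiet "$NEW_TAG" >/dev/null 2>&1; then
    # Commits that touched create_atoms between the two releases.
    git --no-pager log --oneline "$OLD_TAG..$NEW_TAG" -- "$FILE"
    # The actual source changes in that file.
    git --no-pager diff "$OLD_TAG" "$NEW_TAG" -- "$FILE"
else
    echo "run this inside a LAMMPS git clone and check the tag names" >&2
fi
```

If the log shows a relevant commit, git blame on the same file at the newer tag points to the lines it changed.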

On missing or superfluous packages: none. When a package is missing, LAMMPS will stop with an error. What is more likely are the differences in the number of processors and thus different domain decompositions. It also suggests that your overall timestep is too large. Try 1.0fs instead, and if you have any hydrogen atoms with flexible bonds, you need to reduce it further to 0.5fs or even 0.25fs.

On which packages to add: that is usually best determined by yourself. We offer two “presets”: basic.cmake and most.cmake. The first installs just a few packages that are required to do a lot of common tasks for research that is not very unusual or demanding. The most.cmake preset contains many packages that can be compiled in an automated fashion and can work on typical platforms. It leaves out a few very exotic packages, but has otherwise almost everything.
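A preset is loaded with the -C option of cmake before the path to the cmake folder. A minimal sketch, assuming the source tree sits in a folder named by LAMMPS_DIR and using a hypothetical build folder name:

```shell
# Assumption: LAMMPS_DIR points at the top of a LAMMPS source tree.
LAMMPS_DIR="${LAMMPS_DIR:-./lammps}"
PRESET="$LAMMPS_DIR/cmake/presets/most.cmake"

if [ -f "$PRESET" ]; then
    # Configure in a separate build folder, starting from the "most" preset.
    # Further -D options (e.g. -D PKG_GPU=yes) can follow the -C option.
    mkdir -p "$LAMMPS_DIR/build-most" && cd "$LAMMPS_DIR/build-most" &&
        cmake -C ../cmake/presets/most.cmake ../cmake &&
        cmake --build . --parallel
else
    echo "preset not found: $PRESET" >&2
fi
```

Replacing most.cmake with basic.cmake gives the smaller package set.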

jfernando · March 28, 2025, 5:38am

Thanks for your quick and useful response. Right after your answer I got some results that made me realize that I must further optimize my simulations due to the size of my systems. For example, I already used the GPU with the ADA89 architecture of the workstation I use, but I was expecting much more than what I am getting. I noticed, for example, that the number of processes and threads per process are important runtime inputs that perform differently depending on the hardware.

Should I open a new topic? My main question is whether it is possible to compile and install 4 acceleration pkgs together: OPENMP, GPU, KOKKOS and INTEL, so as to choose between them depending on the case and type of calculation. I managed to compile the first 3, but when I try to include the INTEL pkg I start getting warnings and then a crash while installing. I also managed to install INTEL without KOKKOS. I followed various past threads and made the corresponding changes to the presets, other files, and input according to the answers, most of which were given by you.

If I open a new topic I will provide the files and the CMake details, but I would like you to tell me whether this is necessary, or if there is a simple yes-or-no answer.

akohlmey · March 28, 2025, 9:48am

I don’t understand what you are asking here.
So, yes, create a new topic, best under “LAMMPS Installation” and provide a detailed explanation and any useful supporting information.