Running lammps on a cluster

gunjansauti · September 10, 2022, 8:39am

Hello all,
I have created a Linux cluster with ssh and NFS. I have installed LAMMPS in the NFS folder. I have verified that the cluster works properly with Open MPI and that the communication is good. But when I run LAMMPS, after reading the data file it gets stuck on the first time step and never gets past it. The log file does not show any error. I am not sure what is causing this, but if anyone else has encountered this problem, can you please suggest what I should check for?

akohlmey · September 10, 2022, 8:45am

How?

Which version of LAMMPS? How did you compile it? Which (example?) input did you run?
What is your command line? What is the output of lmp -h?

There is not enough information here to make any suggestions. If you did everything correctly and your cluster is set up correctly, then there should not be any problem. But there can be issues at all kinds of levels that have the symptoms you describe.

gunjansauti · September 10, 2022, 8:56am

I have created an mpi_hello program in C, which responds as it should from all the nodes in the cluster.

LAMMPS version 23 Jun 2022 update 1, compiled with CMake and using the OpenMP package. I used an example I created, which runs properly on a single node. I don’t understand what you mean by command line (I am using Ubuntu 20.04, if that’s what you mean). lmp -h behaves as expected (outputting the help text).

akohlmey · September 10, 2022, 9:00am

How do you know it contacts the different nodes and not just runs multiple copies on the same node?
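One way to check this is to have every rank print its host name and then count ranks per host. Below is a minimal sketch of the counting step, assuming each rank prints a line like "Hello from rank 0 of 4 on node01". That line format, the host names, and the helper name ranks_per_host are hypothetical, so adjust the pattern to whatever your own test program prints.

```python
import re
from collections import Counter

# Assumed (hypothetical) output format: "... rank <i> of <n> on <hostname>".
# Change this pattern to match the lines your own MPI test program prints.
LINE_RE = re.compile(r"rank\s+(\d+)\s+of\s+(\d+)\s+on\s+(\S+)")


def ranks_per_host(text):
    """Count how many MPI ranks reported from each host name."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[match.group(3)] += 1
    return dict(counts)


# Made-up sample output for illustration only.
sample = """\
Hello from rank 0 of 4 on node01
Hello from rank 1 of 4 on node01
Hello from rank 2 of 4 on node02
Hello from rank 3 of 4 on node02
"""

print(ranks_per_host(sample))
```

If only one host name shows up in the result, all ranks were started on the same machine.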

What is the exact command you type to run LAMMPS in parallel across multiple nodes?

It not only outputs the general help text, but also a lot of configuration info. I am interested in seeing that, not just in knowing that it acts as expected (I depend on that).

gunjansauti · September 10, 2022, 9:20am

Attaching the output file from the mpi_hello code.

mpi_hello.out (1.9 KB)

mpirun --hostfile /home/auti/.sharedfilesystem/machinefile --mca btl ^openib -np 56 /home/auti/.sharedfilesystem/.softwares/lammps/installpath/bin/lmp

Attaching the lmp -h output as well.
lmp-h.out (825 Bytes)

akohlmey · September 10, 2022, 9:33am

There are no LAMMPS specific arguments here. This way it will just sit and wait for input.
Please try running the in.lj input from the LAMMPS bench folder and report whatever output you get. And then try the same with the in.rhodo input in the same folder.

Thanks. This looks ok.

gunjansauti · September 10, 2022, 9:36am

Oh, I am sorry, I forgot to add the -in in.lammps here. I just copy-pasted the alias.
Anyway, I’ll try with the example you suggested and get back to you.

Thanks a lot!

gunjansauti · September 10, 2022, 9:46am

Both inputs run properly, but single-node performance is much faster than when I run on multiple nodes. Attaching the log files. Thank you.
lj.log.lammps (3.0 KB)
rhodo.log.lammps (5.1 KB)
rhodo.log.lammps_single_node (5.1 KB)

akohlmey · September 10, 2022, 11:30am

OK. That proves that LAMMPS is working correctly, and so is your cluster communication.

The slowdown is due to the kind of network you have. When using Gigabit Ethernet with TCP/IP for communication, you have to deal with a slower data transfer rate than, say, InfiniBand and, more importantly, a much higher latency (~1 ms vs. < 1 µs for InfiniBand). You can see this in the amount of time spent on communication. The LJ example only has the communication between parallel subdomains, so the “Comm” section of the LAMMPS timing breakdown uses most of the time. With the rhodo example, you also have 3d FFTs in the KSpace section, so that one is dominant.
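To put rough numbers on this, here is a minimal sketch of the usual "latency plus size over bandwidth" estimate for one message. The latencies are the ones quoted above; the bandwidths (1 Gbit/s for Ethernet, 100 Gbit/s for InfiniBand) and the 8 KiB message size are assumptions picked for illustration, not measurements from your cluster.

```python
def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple cost model for one message: startup latency plus transfer time."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


# Latencies as quoted in the post; bandwidths are assumed, illustrative values.
ETHERNET = {"latency_s": 1e-3, "bandwidth_bytes_per_s": 125e6}      # 1 Gbit/s
INFINIBAND = {"latency_s": 1e-6, "bandwidth_bytes_per_s": 12.5e9}   # 100 Gbit/s (assumed)

msg_bytes = 8 * 1024  # hypothetical size of one small exchange message

t_eth = message_time(msg_bytes, **ETHERNET)
t_ib = message_time(msg_bytes, **INFINIBAND)

print(f"Ethernet:   {t_eth * 1e6:10.1f} us per message")
print(f"InfiniBand: {t_ib * 1e6:10.1f} us per message")
print(f"Ratio:      {t_eth / t_ib:10.0f}x")
```

For small messages the latency term dominates, which is why many small exchanges per time step hurt so much on Ethernet.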

Both are rather small systems (about 30000 atoms), so you have about 500-600 atoms per MPI rank, which is about where you hit the limit of scaling even when running on a 64-core shared memory workstation.
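The atoms-per-rank estimate is simple arithmetic. Here is a small sketch, assuming 32000 atoms (my understanding of the default size of both bench inputs; the "about 30000" above is a rounded figure) and the 56 ranks from the mpirun command earlier in the thread.

```python
def atoms_per_rank(n_atoms, n_ranks):
    """Average number of atoms each MPI rank owns with an even decomposition."""
    return n_atoms / n_ranks


n_atoms = 32000  # assumed system size of the bench inputs
n_ranks = 56     # from "-np 56" in the mpirun command

print(f"{atoms_per_rank(n_atoms, n_ranks):.0f} atoms per rank")
```

With so few atoms per rank there is little computation to hide the communication cost behind.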

What you see is the reason why many people who build HPC clusters for running MPI parallel applications spend a lot of money on special HPC networking gear like InfiniBand.

gunjansauti · September 10, 2022, 11:37am

Thanks a lot for your input.
Seems like I have to work on a single node for now. I have the InfiniBand port for my new system, but I won’t be able to connect the older systems with it.