I would like to ask about the slow calculation speed of LAMMPS when using the meam potential.
I am using different potential files to calculate ion implantation: tersoff/zbl for Fe-Cr and meam for Fe-N. The command I used is fix deposit. The timings are shown in the attached txt document and the meam potential files are listed here: FeN.meam (773 Bytes), library.meam (423 Bytes), log of time.txt (4.8 KB)
All the calculations are performed on Linux with a GTX 3060 and LAMMPS (23 Jun 2022); the CUDA version is 12.0. Fe-Cr is calculated using the GPU package, while Fe-N is calculated using KOKKOS. I have tried the GPU package for the Fe-N calculation before, but the calculation time was longer.
There is a lot of information missing to be able to make any assessment:
What are the input files, and what exact command lines do you use in each case?
Your log file shows that you are using 12 MPI processes; do they all use the same GPU?
What are the specific compilation settings, especially for the GPU package?
The output of lmp -h can be very useful in that respect, especially the section that reports the installed packages and the GPU package configuration.
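For instance, a quick way to surface the relevant lines (a sketch, assuming the binary is called lmp and the GPU package was installed at build time):

```shell
# Hedged sketch: filter the build summary printed by `lmp -h` for the
# GPU package lines, which report the API (CUDA/OpenCL) and precision mode.
lmp -h | grep -i -E "gpu|precision"
```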
Please note that you have a consumer GPU with poor support for double-precision floating point; the GPU package will thus likely only give improved performance when it is compiled for mixed or single precision. The KOKKOS package only supports full double precision and can therefore only be fairly compared to the GPU package when the latter is configured for full double precision, too.
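As a sketch of what such a rebuild could look like (the source path, the sm_86 architecture of a 3060-class Ampere card, and the core count are assumptions, not taken from your setup):

```shell
# Hedged example: configure and build LAMMPS with the GPU package in
# mixed precision using CMake. Adjust GPU_ARCH to match your card.
cd lammps
mkdir -p build && cd build
cmake -D PKG_GPU=on -D GPU_API=cuda -D GPU_PREC=mixed -D GPU_ARCH=sm_86 ../cmake
cmake --build . -j 12
```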
–What are the input files, and what exact command lines do you use in each case?
–I'm sorry, I don't quite understand what you mean by input files. My calculations were carried out for a single-crystal iron target containing 300,000 atoms. The exact command lines are:
fix 1 all nve
fix 2 addatoms deposit 10 2 30000 12345 region sput near 1 vx -74 -74 vz -605 -605 units box
–Your log file shows that you are using 12 MPI processes; do they all use the same GPU?
–My command lines are:
Fe-Cr: mpirun -np 12 lmp_mpi -sf gpu -pk gpu 1 -in in.implant -package gpu 0 neigh no
Fe-N: mpirun -np 12 lmp_kokkos_cuda_mpi -k on g 1 -sf kk -in in.implant -pk kokkos newton on neigh half
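One way to check how the 12 MPI ranks map onto the device (a sketch, assuming a standard NVIDIA driver installation) is to inspect the GPU while a run is active:

```shell
# List the compute processes currently using the GPU; with a single
# device requested, all 12 MPI ranks should appear attached to it.
nvidia-smi
```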
–What are the specific compilation settings, especially for the GPU package?
–OS: Linux Ubuntu 22.04.2 5.19.0-50-generic #50-Ubuntu SMP PREEMPT_DYNAMIC UTC x86_64 x86_64 x86_64
Compiler: GNU ld (GNU Binutils for Ubuntu) 2.38, gcc 11.4.0
C++ standard: C++20
MPI v4.0: MPICH Version: 3.3.2
MPICH Release date: Tue Nov 12 21:23:16 CST 2019
This is useless. It is important to see the entire file.
Not in this case.
The problem is that you are trying to use a feature that does not exist in your version of LAMMPS; it is too old. You need the stable release of 2 August 2023 from the lammps/lammps repository on GitHub.
So when you are requesting to run MEAM on the GPU, it is not done, but the CPU version is used.
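A quick way to confirm which variant actually ran (assuming the default log file name) is to check the pair style echoed in the log: with a working suffix the log shows an accelerated style such as tersoff/zbl/gpu, while a plain style name indicates a silent CPU fallback.

```shell
# If the /gpu or /kk suffix was silently dropped, the log echoes the
# plain CPU style name instead of the accelerated variant.
grep -i "pair_style" log.lammps
```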
You cannot. As I already said, KOKKOS only supports double precision, and your GPU is not well suited to that due to its significantly reduced number of double-precision units compared with data-center GPUs.
The performance summary does not show any GPU usage statistics.
One basic piece of information you haven't considered (or told us you have considered) is that tersoff/zbl has a GPU variant as well as a KOKKOS variant (as the manual says), while meam only has a KOKKOS variant (as the manual again says). So you simply won't get good speed for meam through the GPU package.
From a scientific viewpoint, it seems very strange to compare results from two entirely different force fields, unless they are completely separate studies you’re running. Different force fields have different assumptions, approximations and underlying mathematical models, so it’s very difficult for you to know if any difference you see between both simulations is because of the materials’ difference, or simply the different force fields.
I made the comparison between the two potentials to demonstrate how slow the calculation is for me under the meam potential. I think the slow speed is caused by the type of potential function. I have also seen that, although the meam potential is slower, it is usually not this slow.
The meam potential has no GPU variant. Does that mean the GPU is not used in my calculations?