Hoping LAMMPS can run the first partition on multiple cores in TAD (temperature accelerated dynamics) calculations

It is known that during a TAD calculation the first partition, which is responsible for the non-NEB parts, can only run on a single CPU core, because the NEB stage restricts each partition to one core. This seems inefficient.
In my opinion, it might be hard to use multiple cores for one partition in the NEB phase, but would it be possible for TAD to use multiple cores for the non-NEB parts (normal dynamics and minimizations)? Say there are 32 cores: could cores 0-26 handle the non-NEB parts while the other 5 cores do the NEB calculations?
Furthermore, will LAMMPS support running each NEB partition on multiple cores in the future?
Thanks.

Some comments:

  • the TAD workflow alternates between the TAD phases and NEB phases, so there is little benefit to having separate partitions for those. At any rate, it would require some programming to make it work
  • a more efficient approach would be to have different partitioning for the NEB and TAD phases; however, LAMMPS is not set up to do this. Such a TAD implementation would have to be written as a “driver program” that would create two LAMMPS instances with different partitioning subdivisions and then hand the work alternately to the two LAMMPS instances. That kind of program could be written in C++ but also in Python using the LAMMPS python module (see the sketch after this list).
  • LAMMPS does support having multiple MPI ranks per partition for NEB, but I suspect that feature was added after TAD and the TAD code was never updated to use it (this kind of thing happens regularly in LAMMPS because of the vast code base; people making changes are not always aware of, or familiar with, other areas that could also use a feature).
  • You can always use multi-threading (via the OPENMP, KOKKOS, or INTEL package) on top of MPI for additional parallelization.
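
Below is a minimal sketch of what such a Python “driver program” could look like, assuming the LAMMPS python module and mpi4py are available. The input file names (`in.setup`, `in.neb`), the 8x4 partition layout, and the fixed number of outer cycles are placeholders; a real TAD driver would also have to implement event detection, the TAD time extrapolation, and proper transfer of replica coordinates, all of which are omitted here.

```python
# Sketch of a TAD-style driver: one LAMMPS instance for the non-NEB work
# (dynamics + minimization) on all ranks, and a second, partitioned
# instance for the NEB phase. Run under MPI, e.g. with 32 ranks.

from mpi4py import MPI
from lammps import lammps

comm = MPI.COMM_WORLD

# Instance 1: all ranks in a single partition for regular dynamics
# and minimization (the "non-NEB" parts).
md = lammps(comm=comm, cmdargs=["-log", "none", "-screen", "none"])
md.file("in.setup")            # read data, define potential, fixes, etc.

# Instance 2: the same ranks, split into 8 replicas x 4 ranks each
# for the NEB phase (adjust the layout to your core count).
neb = lammps(comm=comm,
             cmdargs=["-partition", "8x4", "-log", "none", "-screen", "none"])
neb.file("in.setup")

for cycle in range(10):        # outer loop of the (greatly simplified) workflow
    # 1) run dynamics and a quench on the full-communicator instance
    md.command("run 1000")
    md.command("minimize 1.0e-6 1.0e-8 1000 10000")

    # 2) hand the quenched coordinates over to the partitioned instance
    x = md.gather_atoms("x", 1, 3)
    neb.scatter_atoms("x", 1, 3, x)

    # 3) run a NEB calculation on the partitioned instance
    #    (in.neb would contain the fix neb / neb commands)
    neb.file("in.neb")

md.close()
neb.close()
```

Such a script would be launched with something like `mpirun -np 32 python3 tad_driver.py`, so that the 8x4 layout matches the 32 MPI ranks. The same `-partition MxN` command-line switch is also how a plain `lmp` NEB run gets more than one MPI rank per replica.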

Hello Axel,
Thanks for your information.
I didn’t know that each NEB partition can be run with multiple MPI ranks. That is big progress.
It is a good idea to write a custom Python driver controlling the TAD workflow, so that both phases can be done efficiently.

If you do such a thing, please contribute it back to the LAMMPS distribution so that others can benefit as well.

Hi. Did you happen to do this? I am facing the same issue now.