recently i was contacted about improving the parallel performance of
fix rigid and its siblings in the rigid package via multi-threading.
while working on this (the resulting code has just been added to
LAMMPS-ICMS, btw), it became evident that a significant part of the
scaling problem lies in the replicated data parallelization of the
rigid bodies as such.
i'm pondering some strategies and have a couple of ideas for how to make
it better, but before investing a lot of effort into this, it would be
helpful to know what the typical scenarios are where people use fix rigid
& co. and how much those scenarios are impacted by the communication
overhead.
please let me know, and also whether you would be able to provide some
test inputs and run some tests on my behalf.
thanks in advance,