Improved performance for setup on large numbers of MPI ranks

Chris,

I tried out your change using 16384 MPI ranks with my original triclinic system and it works great. I'm replicating a ~300 atom perfect crystal up to 0.5 billion atoms, so your enhancement to replicate is really helpful! With the original replicate command, every MPI rank had to loop over all half a billion atoms, which took a really long time.

Thanks,

Stan

Stan,

Awesome!

I know what you mean. In the multi-billion particle use cases that motivated all this, I couldn’t even run jobs long enough to start step 0 with replicate + molecule_setup + pppm_setup together...

Chris