Improved performance for setup on large numbers of MPI ranks

Chris, I just updated the PR on GitHub with an example input deck. The issue looks specific to triclinic boxes (orthogonal boxes, such as the rhodo benchmark, work great).


Hi Stan,

Yup, I saw the note. I think I have replicate working with triclinic systems; I'm waiting on verification from the large tests that are queued up. Input and source are attached if you want to try them before I commit.


