fix addforce causes parallel LAMMPS to hang

Hello LAMMPS users,

I am having a problem where a fix addforce command causes LAMMPS to
freeze at "Setting up run ...", but only when executed in parallel. I
tried the 17Dec13 and 20Mar14 versions, built with multiple compilers
and run under MPI; the same script always runs just fine in serial. I
have since circumvented the problem by using fix spring tether, but
still thought I would ask for my own edification. A minimal example
script is attached.

Thanks,
Cody K. Addington

in.nonworking_addforce (1.02 KB)

thanks. i can confirm the problem, but i don't think fix addforce
itself is the cause. the culprit looks to be the compute reduce that
is referenced from an atom style variable. we had a somewhat similar
issue recently, but this is a class of bugs where steve has far more
expertise than i do, so let's hope he can find the time to look into
it.

axel.

ok. i found the cause and it is a bit of a tricky thing.

contributing factors are:
- use of a fix that supports atom style variables
- a reference, from within that variable, to an operation requiring a
collective communication
- a system setup where a processor has no local atoms (this also
causes a significant load imbalance, which is a hint to use the
processors command to get a more balanced partitioning of the system
into domains)

eliminating any one of these factors will make it work, which is the
reason why it hasn't been seen so far.
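
the interaction of these factors can be illustrated with a small
stand-alone sketch. it is not LAMMPS code and does not use MPI: threads
stand in for MPI ranks, a threading.Barrier stands in for the
collective operation, and the guard "skip the variable evaluation when
a rank owns no atoms" is an assumption made for illustration, not a
quote from fix_addforce. the function name and its parameters are
hypothetical.

```python
import threading


def collective_completes(atoms_per_rank, guard_on_local_atoms=True, timeout=0.5):
    """Return True if all simulated ranks get through the collective,
    False if the ranks that entered it are left waiting (a "hang")."""
    nranks = len(atoms_per_rank)
    # Stand-in for an MPI collective: every rank must arrive for it to finish.
    barrier = threading.Barrier(nranks)
    hung = threading.Event()

    def rank_body(nlocal):
        # Assumed guard (illustration only): a rank without local atoms
        # skips the variable evaluation and never enters the collective.
        if guard_on_local_atoms and nlocal == 0:
            return
        try:
            barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # Real MPI has no timeout here; the run would simply freeze.
            hung.set()

    threads = [threading.Thread(target=rank_body, args=(n,)) for n in atoms_per_rank]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return not hung.is_set()


if __name__ == "__main__":
    print(collective_completes([3, 2, 1, 4]))  # every rank owns atoms
    print(collective_completes([3, 2, 0, 4]))  # one rank owns no atoms
    print(collective_completes([3, 2, 0, 4], guard_on_local_atoms=False))
```

in this model, removing the empty rank or removing the guard lets the
collective complete, which mirrors the observation that eliminating
any one factor makes the run work. a single rank also completes, which
is consistent with the serial runs working.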

it can be fixed by changing the code in fix_addforce. however, the
issue is not limited to this fix; it shows up in several other fixes,
too. thus i'll be discussing with steve what the best way to solve
this issue globally is going to be.

ciao,
     axel.

this will be fixed in the next patch

thanks,
Steve