Bug in USER-OMP?

The attached input script segfaults with the latest OMP version of LAMMPS when run on more than one processor. It doesn't matter whether I use the "-sf omp" option or not; I get the same error. Here is the output from Valgrind:

==15323== Invalid read of size 8
==15323== at 0x5449B2: LAMMPS_NS::DomainOMP::pbc() (domain_omp.cpp:44)
==15323== by 0x944EC0: LAMMPS_NS::Verlet::setup() (verlet.cpp:106)
==15323== by 0x910E00: LAMMPS_NS::Run::command(int, char**) (run.cpp:169)
==15323== by 0x676F0B: LAMMPS_NS::Input::execute_command() (run.h:16)
==15323== by 0x677A15: LAMMPS_NS::Input::file() (input.cpp:201)
==15323== by 0x684C38: main (main.cpp:30)
==15323== Address 0x0 is not stack'd, malloc'd or (recently) free'd

Stan

test1.in (704 Bytes)

Here is another related input script that segfaults the OMP LAMMPS version in a different place (i.e. in MSM) for any number of processors.

Stan

Valgrind:

==10515== Invalid read of size 8
==10515== at 0x6CE820: _ZN9LAMMPS_NS6MSMOMP11direct_evalILi1ELi1ELi0EEEvi.omp_fn.1 (msm_omp.cpp:193)
==10515== by 0x6D35D0: void LAMMPS_NS::MSMOMP::direct_eval<1, 1, 0>(int) (msm_omp.cpp:230)
==10515== by 0x6CBC81: LAMMPS_NS::MSM::compute(int, int) (msm.cpp:536)
==10515== by 0x6D2F7E: LAMMPS_NS::MSMOMP::compute(int, int) (msm_omp.cpp:53)
==10515== by 0x9450B3: LAMMPS_NS::Verlet::setup() (verlet.cpp:138)
==10515== by 0x910E70: LAMMPS_NS::Run::command(int, char**) (run.cpp:169)
==10515== by 0x676F0B: LAMMPS_NS::Input::execute_command() (run.h:16)
==10515== by 0x677A15: LAMMPS_NS::Input::file() (input.cpp:201)
==10515== by 0x684C38: main (main.cpp:30)
==10515== Address 0x20ffffff07 is not stack'd, malloc'd or (recently) free'd

test4.in (700 Bytes)

the origin for both segfaults is the same, and it is hard to say what
is to blame for it. you can call it a bug in the optimizations that i
added recently or an "undesirable feature" in how lammps itself
manages per-atom storage. basically, you cannot dereference any
components of Atom unless at least one atom has been in this domain
at least once. i have to think a bit about how this is best addressed
and probably also need to discuss it with steve. it would be a pity to
lose the optimization, since it helps compilers to generate measurably
faster code.

for now, please try this workaround. that should take care of the
majority of cases.

diff --git a/src/lammps.cpp b/src/lammps.cpp
index 2ca7107..ef1257b 100644
--- a/src/lammps.cpp
+++ b/src/lammps.cpp
@@ -460,6 +460,7 @@ void LAMMPS::create()
   if (cuda) domain = new DomainCuda(this);
 #ifdef LMP_USER_OMP
   else domain = new DomainOMP(this);
+  atom->avec->grow(0);
 #else
   else domain = new Domain(this);
 #endif
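
To illustrate the failure mode described above, here is a minimal standalone sketch. It is not LAMMPS code: AtomStore, flat_coords() and the chunk size of 4 are hypothetical names and values made up for this example, and the meaning of grow(0) as "extend by one chunk" is an assumption. It only shows why taking the address of the first element of a per-atom array reads address 0x0 when nothing has been allocated yet, and why a grow(0) call up front avoids that.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for per-atom storage (not the real Atom class).
// The row-pointer array x stays null until grow() has been called once.
struct AtomStore {
  std::vector<double> data;    // contiguous coordinates, 3 per atom
  std::vector<double *> rows;  // rows[i] points at the coordinates of atom i
  double **x = nullptr;        // what optimized loops would dereference
  int nmax = 0;                // allocated capacity in atoms

  AtomStore() = default;
  AtomStore(const AtomStore &) = delete;             // x points into rows,
  AtomStore &operator=(const AtomStore &) = delete;  // so copying is unsafe

  // Assumed semantics: n == 0 extends capacity by one chunk,
  // any other n sets the capacity to exactly n atoms.
  void grow(int n) {
    assert(n >= 0);
    const int chunk = 4;  // made-up chunk size for the sketch
    nmax = (n == 0) ? nmax + chunk : n;
    data.resize(3 * static_cast<std::size_t>(nmax), 0.0);
    rows.resize(static_cast<std::size_t>(nmax));
    for (int i = 0; i < nmax; ++i) rows[i] = data.data() + 3 * i;
    x = rows.data();
  }
};

// Flattened view of the coordinates, as a pointer-caching optimization
// might set up before a loop. Without the null check, a.x[0] would read
// through a null pointer when no storage exists yet.
const double *flat_coords(const AtomStore &a) {
  if (a.x == nullptr) return nullptr;
  return &a.x[0][0];
}
```

With an empty store, flat_coords() returns a null pointer; after a single grow(0) it returns a valid address even though no atom has been added, which is the effect the workaround relies on.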

axel.

This should be fixed in the 7Mar patch ...

Steve

The 7Mar patch did fix the first problem, but I am still getting a segfault for the second problem (i.e. the attached test script) when I use the "-sf omp" option.

Stan

==28478== Invalid read of size 8
==28478== at 0x7259C0: _ZN9LAMMPS_NS6MSMOMP11direct_evalILi1ELi1ELi0EEEvi.omp_fn.1 (msm_omp.cpp:193)
==28478== by 0x72A760: void LAMMPS_NS::MSMOMP::direct_eval<1, 1, 0>(int) (msm_omp.cpp:230)
==28478== by 0x722E29: LAMMPS_NS::MSM::compute(int, int) (msm.cpp:536)
==28478== by 0x72A10E: LAMMPS_NS::MSMOMP::compute(int, int) (msm_omp.cpp:53)
==28478== by 0x9FBB24: LAMMPS_NS::Verlet::setup() (verlet.cpp:138)
==28478== by 0x9BAF30: LAMMPS_NS::Run::command(int, char**) (run.cpp:169)
==28478== by 0x6AEE2B: LAMMPS_NS::Input::execute_command() (run.h:16)
==28478== by 0x6B0DE5: LAMMPS_NS::Input::file() (input.cpp:201)
==28478== by 0x6BF818: main (main.cpp:30)
==28478== Address 0xbf636d61cbf44d28 is not stack'd, malloc'd or (recently) free'd

test4.in (700 Bytes)

this one is due to changes/optimizations in msm.cpp that have not been
carried over into msm_omp.cpp.
they only "arrived" in the git repo this morning, so i didn't see them
earlier and didn't even have a chance to address them.

it is a rather unfortunate recurring situation that people updating the
non-threaded code tend to forget about the side effects that their changes
may have on the threaded version. we've been over this a few times already.
this is the reason why i've made such a big effort to hide most of the
thread-specific stuff, so that we would not even need to have a USER-OMP
package.

more later...

axel.