lj/class2/coul/long --- lammps version malfunction

_Souvik_Pal · November 5, 2011, 7:35pm

Hi,

I have the following input script:

sjplimp · November 8, 2011, 2:50pm

There was an uninitialized variable "external_force_clear"
in some code that was recently added by Axel for multi-threading
in the minimizer.

I will post a patch this AM.

I'm actually excited to find a bug of Axel's and correct
it, since this usually happens the other way. Now he will
probably check and report that the bug was my fault when
I merged in his code ...

Steve

akohlmey · November 8, 2011, 4:35pm

> There was an uninitialized variable "external_force_clear"
> in some code that was recently added by Axel for multi-threading
> in the minimizer.

yup. my bad. i messed that one up.

> I will post a patch this AM.
>
> I'm actually excited to find a bug of Axel's and correct
> it, since this usually happens the other way. Now he will
> probably check and report that the bug was my fault when
> I merged in his code ...

nope. the last multi-threading patch was based on code
i shipped to you in late september. according to "git blame"
(see below) i found and fixed this bug two weeks later and
it should be part of the last update that i sent you on
october 26th, which hasn't been merged yet. there are
a few more rather subtle bugs in multi-thread support,
that were fixed since, but none should affect non-threaded
execution as far as i know.

8fb52958 (sjplimp 2007-11-30 21:54:30 +0000 57)
8dbbd76a (sjplimp 2010-08-03 18:42:04 +0000 58) elist_global = elist_atom = NULL;
8fb52958 (sjplimp 2007-11-30 21:54:30 +0000 59) vlist_global = vlist_atom = NULL;
747ea724 (Axel Kohlmeyer 2011-10-12 22:20:00 -0400 60) external_force_clear = 0;
90dacbb8 (sjplimp 2009-04-03 19:46:03 +0000 61)
33ff40dd (sjplimp 2009-08-14 20:54:10 +0000 62) nextra_global = 0;
33ff40dd (sjplimp 2009-08-14 20:54:10 +0000 63) fextra = NULL;
33ff40dd (sjplimp 2009-08-14 20:54:10 +0000 64)
33ff40dd (sjplimp 2009-08-14 20:54:10 +0000 65) nextra_atom = 0;
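
For readers unfamiliar with this class of bug, here is a minimal C++ sketch of what the blame output is showing. It is not the actual LAMMPS source: the class name MinSketch, the member types, and the helper clears_forces_itself() are assumptions made up for illustration. Only the member names and the "external_force_clear = 0;" line come from the blame output.

```cpp
#include <cassert>

// Hypothetical stand-in for a minimizer class; NOT the LAMMPS source.
// Member names follow the blame output; their types are assumed here.
class MinSketch {
 public:
  int external_force_clear;  // flag that was left uninitialized in the buggy version
  int nextra_global;
  int nextra_atom;
  double *fextra;            // assumed type

  MinSketch() {
    // Without this assignment the flag would hold an indeterminate value,
    // and reading it would be undefined behavior in C++.
    external_force_clear = 0;

    nextra_global = 0;
    fextra = nullptr;
    nextra_atom = 0;
  }
};

// Hypothetical consumer of the flag: the minimizer zeroes the forces itself
// only when no external code has claimed that job.
bool clears_forces_itself(const MinSketch &m) {
  return m.external_force_clear == 0;
}
```

With the assignment in place, a freshly constructed object always starts with the flag at 0. Without it, the branch taken in a consumer like clears_forces_itself() could vary with compiler, optimization level, and machine.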

people interested in testing multi-threaded support
should, for the time being, retrieve the code from my
github account at: http://github.com/akohlmey/lammps-omp

i hope to have it feature complete with respect to
LAMMPS-ICMS in a few days which will then be
the primary source for the latest updates of the
multi-threading code again before they get merged
into the mainline LAMMPS repository.

i am very pleased with how much more parallel
efficiency could be obtained by rewriting the
lower level data management for per thread data.

cheers,
axel.

_Souvik_Pal · November 9, 2011, 5:59am

Interestingly, I found a workaround for this problem! It seems the
segmentation fault during minimization depends on the tolerance values set in
the minimize command. I changed the tolerance values from 1e-6 and 1e-8 to
1e-4 and 1e-6, and it worked. I think this dependence of the segmentation
fault on the tolerance was also reported in a previous mail thread posted by
my colleague Ravi Kappiyoor.

Thanks,
Souvik.

akohlmey · November 9, 2011, 6:04am

> Interestingly, I found a workaround for this problem! It seems the
> segmentation fault during minimization depends on the tolerance values set in
> the minimize command. I changed the tolerance values from 1e-6 and 1e-8 to
> 1e-4 and 1e-6, and it worked. I think this dependence of the segmentation
> fault on the tolerance was also reported in a previous mail thread posted by
> my colleague Ravi Kappiyoor.

Are you saying that Steve's patch didn't help?

If yes, please post a complete set of input files, so that one of us
can reproduce the segfault.

Thanks,
Axel.

_Souvik_Pal · November 9, 2011, 6:19am

I downloaded the 8-Nov-2011 version of LAMMPS (which I believe is the latest
one and hopefully contains the latest patch). I tried the input script with
the data file and it still seg faulted. I am attaching the input script, the
data file, and the PBS script as used on my system. I tried many previous
versions of LAMMPS from 2009 to 2011, and all of them seg faulted with
tolerance values of 1e-6 and 1e-8 in the minimize command.
Hope this helps. Please let me know if you need further information.

athtest.pbs (755 Bytes)

input_test.poly (664 Bytes)

10chains.lmp (138 KB)

_Souvik_Pal · November 9, 2011, 6:43am

One more thing: I changed the kspace style from pppm to ewald along with the
tolerance values. I am sorry I did not mention this in my previous email. I
just missed it.

Thanks,
Souvik.

akohlmey · November 9, 2011, 2:33pm

> One more thing: I changed the kspace style from pppm to ewald along with the
> tolerance values. I am sorry I did not mention this in my previous email. I
> just missed it.

regardless of using ewald or pppm and what tolerance
i set, i cannot make this input crash on my machine.

it doesn't look so much like a LAMMPS issue right now.
perhaps your compiler is miscompiling some code?

axel.

_Souvik_Pal · November 9, 2011, 2:48pm

Thanks Dr. Kohlmeyer. Well, this might well be the case. I will try
compiling with a different compiler.

Thanks,
Souvik.