Dear List:
I’ve observed that the lj/charmm/coul/long/opt functionality is broken in the latest release (May 4th). The attached input file runs without issue if I use the unoptimized lj/charmm/coul/long pair style. However, when changed to lj/charmm/coul/long/opt I get a segmentation fault (see attached log file). Has anyone else noticed this, or does anyone have a fix?
Best,
Michael
data.tip3p (644 KB)
log.noopt (1.7 KB)
in.tip3p_noopt (435 Bytes)
in.tip3p (439 Bytes)
log.opt (2.21 KB)
dear michael,
the /opt pair styles contain several code constructs that don't
fully comply with the c++ standards. so, in fact, they were broken
from the get go. however, they did work for most compilers most
of the time and the big question now is: did you move to a compiler
that is more strict in terms of standard compliance or is there some
oversight in the code that was added since it was last tested/used?
can you try to confirm which older version of lammps does work
and where exactly it is breaking?
cheers,
axel.
Dear Axel:
Thanks for the response.
> the /opt pair styles contain several code constructs that don't
> fully comply with the c++ standards. so, in fact, they were broken
> from the get go. however, they did work for most compilers most
> of the time and the big question now is: did you move to a compiler
> that is more strict in terms of standard compliance or is there some
> oversight in the code that was added since it was last tested/used?
I've compiled with both the intel icpc (version 11.1) and g++ (4.1.2) but get segmentation faults in both cases.
I've also tried compiling at a lower level of optimization (-O1, -O2 etc) using the intel compiler to the same effect.
> can you try to confirm which older version of lammps does work
> and where exactly it is breaking?
I can confirm that the optimized functions are working using the Mar 15th version.
Running the serial version through gdb gives:
...
PPPM initialization ...
G vector = 0.281342
grid = 24 24 24
stencil order = 5
RMS precision = 4.47383e-05
brick FFT buffer size/proc = 29791 13824 11532
Setting up run ...
Program received signal SIGSEGV, Segmentation fault.
0x0000000000617445 in LAMMPS_NS::PairLJCharmmCoulLongOpt::eval<1, 1, 1> (this=0x4150680) at pair_lj_charmm_coul_long_opt.h:137
137 pair_lj_charmm_coul_long_opt.h: No such file or directory.
in pair_lj_charmm_coul_long_opt.h
Current language: auto; currently c++
(gdb) bt
#0 0x0000000000617445 in LAMMPS_NS::PairLJCharmmCoulLongOpt::eval<1, 1, 1> (this=0x4150680) at pair_lj_charmm_coul_long_opt.h:137
#1 0x0000000000612c8b in LAMMPS_NS::PairLJCharmmCoulLongOpt::compute (this=0x4150680, eflag=1, vflag=<value optimized out>)
at pair_lj_charmm_coul_long_opt.cpp:39
#2 0x00000000006b363e in LAMMPS_NS::Verlet::setup (this=0x414ffe0) at verlet.cpp:112
#3 0x0000000000691b39 in LAMMPS_NS::Run::command (this=0x7fff88b923c0, narg=1, arg=0x413f740) at run.cpp:173
#4 0x0000000000551ead in LAMMPS_NS::Input::execute_command (this=0x413d7c0) at run.h:16
#5 0x00000000005529a6 in LAMMPS_NS::Input::file (this=0x413d7c0) at input.cpp:195
#6 0x000000000055ac17 in main (argc=3, argv=0x7fff88b92f68) at main.cpp:29
Thanks,
Michael
dear michael,
[...]
> I've compiled with both the intel icpc (version 11.1) and g++ (4.1.2) but get segmentation faults in both cases.
> I've also tried compiling at a lower level of optimization (-O1, -O2 etc) using the intel compiler to the same effect.
ok.
> > can you try to confirm which older version of lammps does work
> > and where exactly it is breaking?
> I can confirm that the optimized functions are working using the Mar 15th version.
good. that would indeed point to a recent change.
> Running the serial version through gdb gives:
> [...]
> #0 0x0000000000617445 in LAMMPS_NS::PairLJCharmmCoulLongOpt::eval<1, 1, 1> (this=0x4150680) at pair_lj_charmm_coul_long_opt.h:137
the code around line 137 is:
    if (j <= NEIGHMASK) {
      double delx = xtmp - xx[j].x;
      double dely = ytmp - xx[j].y;
      double delz = ztmp - xx[j].z;
      rsq = delx*delx + dely*dely + delz*delz;
this looks a lot like it got broken when
steve was changing the neighbor list flags.
in fact, the code cannot work.
please change the if statement to:
    if (j < NEIGHMASK) {
      double delx = xtmp - xx[j].x;
and see if that would work.
...and let us know.
thanks,
axel.
Dear Axel:
Thanks for the suggestion, but that did not solve the problem; I'm still getting segmentation faults.
I added this line of code:
printf("i: %i j: %i NEIGHMASK: %i xtmp: %lf\n",i,j,NEIGHMASK,xtmp);
just before line 137, and this is the result:
...
i: 0 j: 1895 NEIGHMASK: 1073741823 xtmp: 2.517980
i: 0 j: 1901 NEIGHMASK: 1073741823 xtmp: 2.517980
i: 1 j: -2147483646 NEIGHMASK: 1073741823 xtmp: 1.629650
As you can see, the problem arises the 2nd time through the i-loop. Somehow the jlist array no longer points to the correct memory location...
Thanks,
Mike
I just released a 19May11 patch that should fix this.
Please try it out.
Thanks,
Steve