segfault doing string operations possibly related to TBB

I’m getting a segfault when creating a string variable from other string variables, specifically when using an Intel compiler and the USER-INTEL package. I’ve tracked it down to linking libtbbmalloc in the supplied Intel Makefiles: if I remove -ltbbmalloc and add -DLMP_INTEL_NO_TBB to CCFLAGS, the problem goes away. It also goes away if I compile without the USER-INTEL package.

The problem appears when I define a string variable that contains two (or more) other string variables. That by itself works, but if I then use this new variable inside yet another string variable, I get the segfault. Here is a minimal script that reproduces the problem for me:

variable a string 10
variable b string 300
variable id string "a-${a}-b-${b}"
variable some_directory string "data/${id}" # <-- causes the segfault

The error message starts with “*** glibc detected *** ./lmp_intel_cpu: free(): invalid pointer…”, followed by a backtrace that points to /lib64/libc.so.6. I would copy more of the error message, but I’m working on an air-gapped cluster and would have to type it all out by hand. Let me know if there’s something specific you want to see.

I’ve tried a couple of different machines with the Intel 17.0.1 and 18.0.3 compilers and can reliably reproduce the error on each configuration. Unfortunately, I cannot change the way these compilers were built, in case it is a system configuration issue.

I realize I have a couple of outs here: just use pure MPI and don’t worry about threads, or don’t use an intermediate string variable - but I’m just trying to understand the root of this problem. Any ideas?

Thanks.

- Jesse

which LAMMPS version?

Latest stable (June 2019).

ok. i’ve been doing a little testing with valgrind’s memory checker. it looks like there is a mismatch between malloc()/free() and new/delete. the use of TBB may just make it more visible.

please try with the following modification:

$ git show
commit 3e2f3a80583494420eda14e4ce178f5b71695e60 (HEAD -> collected-small-fixes, devel/collected-small-fixes)
gpg: Signature made Mon 10 Jun 2019 06:22:04 PM EDT
gpg: using RSA key EEA103764C6C633EDC8AC428D9B44E93BF0C375A
gpg: Good signature from "Axel Kohlmeyer <[email protected]>" [ultimate]
Author: Axel Kohlmeyer <akohlmey@…24…>

avoid a case of mixing malloc()/free() with new/delete

diff --git a/src/variable.cpp b/src/variable.cpp
index 376cc8045..7cbdc57d3 100644
--- a/src/variable.cpp
+++ b/src/variable.cpp
@@ -288,11 +288,11 @@ void Variable::set(int narg, char **arg)

     int maxcopy = strlen(arg[2]) + 1;
     int maxwork = maxcopy;
-    char *scopy = new char[maxcopy];
-    char *work = new char[maxwork];
+    char *scopy = (char *) memory->smalloc(maxcopy,"var:string/copy");
+    char *work = (char *) memory->smalloc(maxwork,"var:string/work");
     strcpy(scopy,arg[2]);
     input->substitute(scopy,work,maxcopy,maxwork,1);
-    delete [] work;
+    memory->sfree(work);

     int ivar = find(arg[0]);
     if (ivar >= 0) {
@@ -310,7 +310,7 @@ void Variable::set(int narg, char **arg)
       data[nvar] = new char*[num[nvar]];
       copy(1,&scopy,data[nvar]);
     }
-    delete [] scopy;
+    memory->sfree(scopy);

   // GETENV
   // remove pre-existing var if also style GETENV (allows it to be reset)
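
For readers unfamiliar with this class of bug, here is a minimal standalone sketch of the allocation rule the patch restores: a buffer that may be handed to a realloc()-style routine has to be allocated with malloc() and released with free(), never with new[]/delete []. This is not LAMMPS code. The names append_grow and build_path are hypothetical, and the premise that the substitution routine grows its buffers with a realloc()-style call is an assumption that the thread itself does not state.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a routine that may grow the caller's buffer.
// Because it uses realloc(), the buffer must originate from malloc().
static void append_grow(char *&buf, int &cap, const char *extra)
{
  assert(buf != nullptr && extra != nullptr);
  const int need = static_cast<int>(std::strlen(buf) + std::strlen(extra)) + 1;
  if (need > cap) {
    char *tmp = static_cast<char *>(std::realloc(buf, need));
    if (tmp == nullptr) {   // out of memory: release the old block and stop
      std::free(buf);
      std::abort();
    }
    buf = tmp;              // the pointer may have changed after realloc()
    cap = need;
  }
  std::strcat(buf, extra);
}

// Hypothetical caller: builds prefix + id in a buffer that the helper
// above is allowed to reallocate.
std::string build_path(const char *prefix, const char *id)
{
  int cap = static_cast<int>(std::strlen(prefix)) + 1;
  // malloc(), not new char[cap]: the helper may call realloc() on it.
  char *buf = static_cast<char *>(std::malloc(cap));
  if (buf == nullptr) std::abort();
  std::strcpy(buf, prefix);

  append_grow(buf, cap, id);

  std::string result(buf);
  // free(), not delete [] buf: by now the block may come from realloc().
  std::free(buf);
  return result;
}
```

Calling build_path("data/", "a-10-b-300") forces the helper to grow the buffer, which mirrors the nested-variable case in the script above. With new[]/delete [] in place of malloc()/free(), the same call would have undefined behavior, which many allocators tolerate silently and a replacement allocator may not.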

That patch appears to have worked! Thanks!

thanks for the feedback. the change will be included in the next LAMMPS patch release. axel.