Segmentation Fault when Compiling in Parallel

Hi,

Edit - Update:
I had a large number of conflicting packages installed for openmpi and gcc, through both Homebrew and MacPorts. Uninstalling all of them and reinstalling gcc8 and the other relevant dependencies with the gcc8 and fortran variants (where applicable) allowed me to compile and run the code properly. See below for the list of conflicting ports.

I have managed to compile GULP in parallel seemingly with no complaint from the compiler.

However, when I try to run the code, I get a segmentation fault at run time:

(base) my_computer:Examples connor$ mpirun -np 4 gulp example14

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x10765ad3e
#1 0x10765a49a
#2 0x7fff733fb5fc
#0 0x106a15d3e
#1 0x106a1549a
#2 0x7fff733fb5fc
#0 0x108c43d3e
#1 0x108c4349a
#2 0x7fff733fb5fc
#0 0x1033b8d3e
#1 0x1033b849a
#2 0x7fff733fb5fc
#3 0x1068f2cb7
#3 0x108b1bcb7
#4 0x1084b1034
#5 0x1084a7318
#3 0x107538cb7
#4 0x106ec8034
#5 0x106ebe318
#3 0x103290cb7
#4 0x102c24034
#4 0x10628b034
#5 0x106281318
#5 0x102c1a318
#6 0x108148957
#6 0x1028bc957
#6 0x105f20957
#6 0x106b61957
#7 0x108157eba
#7 0x1028cbeba
#7 0x106b70eba
#7 0x105f2feba
#8 0x1081c84d5
#8 0x10293c4d5
#8 0x106be14d5
#8 0x105fa04d5
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 0 on node my_computer exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------

I have compiled the non-parallel version just fine and have been able to use it to run the examples, if that helps. Any help is greatly appreciated!

Cheers,

Connor

Hi Connor. In order to examine the issue can you let me know:

  1. What version of GULP you are using?
  2. What the compiler version number is and what version of MPI?

If I have this then I can try to test on our systems with something as close to yours as possible.

Cheers,

Julian

Hi Julian. I’m using the MacPorts-installed ‘gcc-mp-8’ (so gcc 8.4), with openmpi built for gcc 8.4 as well.

The exact packages are below, taken from the installed list on macports:

gcc: gcc8 @8.4.0_1
openmpi: openmpi-default @4.0.1_1+gcc8
scalapack: scalapack @2.1.0_0+accelerate+gcc8+mpich

However, I have a number of other (active) MPI ports, and I can see that scalapack appears to have been installed for mpich rather than openmpi. All the MPI-related ports are:

mpich-gcc8 @3.3.2_0+fortran (active)
openmpi @4.0.1_1 (active)
openmpi-default @4.0.1_1+gcc8 (active)
openmpi-gcc8 @4.0.1_1+fortran (active)
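
A mix like this can be spotted mechanically. The following is a minimal sketch, assuming the port lines have the same "name @version+variant (active)" shape as the ones quoted above; the helper names (`parse_port`, `mpi_flavor`, `flavors_in_use`) are hypothetical and not part of MacPorts or GULP. It groups ports by the MPI family they belong to or were built against, so a scalapack built with the mpich variant shows up next to an openmpi toolchain.

```python
import re

# Matches lines in the style quoted in this thread, for example:
#   "scalapack @2.1.0_0+accelerate+gcc8+mpich"
#   "openmpi-gcc8 @4.0.1_1+fortran (active)"
PORT_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+@(?P<version>[^+\s]+)"
    r"(?P<variants>(?:\+[^+\s]+)*)\s*(?P<active>\(active\))?\s*$"
)


def parse_port(line):
    """Split one port line into name, version, variants and active flag."""
    m = PORT_LINE.match(line)
    if m is None:
        return None
    variants = {v for v in m.group("variants").split("+") if v}
    return {
        "name": m.group("name"),
        "version": m.group("version"),
        "variants": variants,
        "active": m.group("active") is not None,
    }


def mpi_flavor(port):
    """Return 'openmpi' or 'mpich' if the port name or a variant mentions it."""
    for flavor in ("openmpi", "mpich"):
        if port["name"].startswith(flavor) or flavor in port["variants"]:
            return flavor
    return None


def flavors_in_use(lines):
    """Map each MPI family to the ports that belong to or were built with it."""
    found = {}
    for line in lines:
        port = parse_port(line)
        if port is None:
            continue  # not a port line, ignore it
        flavor = mpi_flavor(port)
        if flavor is not None:
            found.setdefault(flavor, []).append(port["name"])
    return found


if __name__ == "__main__":
    ports = [
        "scalapack @2.1.0_0+accelerate+gcc8+mpich",
        "mpich-gcc8 @3.3.2_0+fortran (active)",
        "openmpi @4.0.1_1 (active)",
        "openmpi-default @4.0.1_1+gcc8 (active)",
        "openmpi-gcc8 @4.0.1_1+fortran (active)",
    ]
    for flavor, names in flavors_in_use(ports).items():
        print(flavor, names)
```

If more than one family appears in the result, the build may be linking libraries from one MPI implementation while launching with another, which is worth ruling out before looking for a bug in the code itself.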

One further bit of information: when I run ompi_info, it reports that my openmpi is actually:

Package: Open MPI brew@HighSierra Distribution
Open MPI: 4.0.5
Open MPI repo revision: v4.0.5
Open MPI release date: Aug 26, 2020

However, in the makefile and on my path, the first directories listed are opt/local/include and usr/local/include, so I’m not sure why this version is treated as the default.

Running ‘mpirun --version’ I get: mpirun (Open MPI) 4.0.1

and ‘mpiexec --version’ I get: mpiexec (OpenRTE) 4.0.1
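
Different version numbers from ompi_info and mpirun suggest that the two commands may be resolved from different installations. As a rough sketch of how to check this, the snippet below lists every copy of each tool found on the search path, in lookup order. The function name `find_all_on_path` is made up for illustration, and the tool list (including the `mpif90` compiler wrapper) is an assumption about what is relevant here.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name` on the search path, in order.

    The first entry is the one the shell would normally run; any later
    entries are shadowed copies from other installations.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in matches:
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    # Tool names are assumptions; adjust to whatever the makefile calls.
    for tool in ("mpirun", "mpiexec", "ompi_info", "mpif90"):
        hits = find_all_on_path(tool)
        print(tool, hits if hits else "not found")
```

If one tool resolves under one prefix and another tool under a different prefix, the compile-time and run-time MPI are likely not the same installation.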

Apologies for my pretty terrible understanding of make, and my probably awful MacPorts setup! If you think these different installations could be conflicting, is there a quicker solution than you trying to run it yourself, i.e. me just doing a clean install of everything?

Thanks for all the help!

Connor

Hi Connor.
Thanks for the details of your MPI/f90 environment. One last detail - which version of GULP are you using please (version number and date ideally)?

Julian

Hi Julian. Sorry, I missed that part. I’ve tried to compile with both gulp 5.0 (downloaded 20/4/2018) and gulp 5.2 (downloaded 1/9/2020). Both give me a segmentation fault when run in parallel, but compile fine in serial.

Connor

Hi Julian. Sorry to have wasted your time. I cleaned out all the superfluous packages and reinstalled everything from scratch with gcc8, openmpi-gcc8, and scalapack gcc8, and it now compiles and runs just fine.

Thanks again for all your help!

Hi Connor. No problem - I was about to say that I’d tested both versions on my Mac and everything had worked fine with Macports installs.
