OpenMPI problem with ase test

Greetings,

ase test fails with an OpenMPI error; OpenMPI is provided as an environment module on this cluster (see the module list below).

Any help, please?

milias@vm01.hydra.local:~/work/projects/open-collection/theoretical_chemistry/software/ase/buildup_on_servers/jinr_ru/vmXY_hydra/ase-tests/.ase info
platform                 Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.17
python-3.10.13           /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/Python/v3.10.13/bin/python3.10
ase-3.23.0               /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/ase
numpy-1.26.4             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/numpy
scipy-1.12.0             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/scipy
matplotlib-3.8.3         /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/matplotlib
spglib-2.5.0             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/spglib
ase_ext-20.9.0           /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/ase_ext
flask-3.0.3              /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/flask
psycopg2-2.9.9 (dt dec pq3 ext lo64) /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/psycopg2
pyamg-5.2.1              /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/pyamg
milias@vm01.hydra.local:~/work/projects/open-collection/theoretical_chemistry/software/ase/buildup_on_servers/jinr_ru/vmXY_hydra/ase-tests/.module list
Currently Loaded Modulefiles:
  1) BASE/1.0           2) Python/v3.10.13    3) GVR/v1.0-1         4) openmpi/v1.8.8-1
milias@vm01.hydra.local:~/work/projects/open-collection/theoretical_chemistry/software/ase/buildup_on_servers/jinr_ru/vmXY_hydra/ase-tests/.ase test -j0
About to run pytest with these parameters:
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.10.13, pytest-8.3.3, pluggy-1.5.0

Libraries
=========

ase-3.23.0               /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/ase
numpy-1.26.4             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/numpy
scipy-1.12.0             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/scipy
matplotlib-3.8.3         /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/matplotlib
spglib-2.5.0             /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/spglib
ase_ext-20.9.0           /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/ase_ext
flask-3.0.3              /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/flask
psycopg2-2.9.9 (dt dec pq3 ext lo64) /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/psycopg2
pyamg-5.2.1              /lustre/home/user/m/milias/.local/lib/python3.10/site-packages/pyamg


Calculators
===========
Config: No configuration file specified
Datafiles: ase-datafiles package not installed

  [ ] abinit           not installed: 'abinit'
  [ ] ace              cfg=<ase.config.Config object at 0x7fec3c1e9a50>
  [ ] aims             not installed: 'aims'
  [ ] amber            cfg=<ase.config.Config object at 0x7fec3c1e9a50>
[vm01.hydra.local:26519] mca: base: component_find: unable to open /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_mmap: /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_mmap.so: undefined symbol: opal_shmem_base_framework (ignored)
[vm01.hydra.local:26519] mca: base: component_find: unable to open /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_posix: /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_posix.so: undefined symbol: opal_shmem_base_framework (ignored)
[vm01.hydra.local:26519] mca: base: component_find: unable to open /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_sysv: /cvmfs/hybrilit.jinr.ru/sw/slc7_x86-64/openmpi/v1.8.8-1/lib/openmpi/mca_shmem_sysv.so: undefined symbol: opal_shmem_base_framework (ignored)
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_init failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[vm01.hydra.local:26519] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
milias@vm01.hydra.local:~/work/projects/open-collection/theoretical_chemistry/software/ase/buildup_on_servers/jinr_ru/vmXY_hydra/ase-tests/.

I don’t know why this happens, but it looks like the crash occurs when MPI is initialized from inside Python. You could try uninstalling asap3 (if you have it installed): the failure appears right after amber in the alphabetical calculator list, and asap3 uses MPI, so it is a plausible culprit.

That of course isn’t a solution, but it would be one step towards finding the source of the problem.
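One way to test that hypothesis without running the whole test suite is to import each MPI-linked suspect in a fresh subprocess, so a hard abort in MPI_Init cannot take down the parent interpreter. A rough sketch (the suspects list is an assumption; fill in whatever MPI-aware packages are actually in your ~/.local site-packages):

import subprocess
import sys

# Packages that may call MPI_Init at import time (assumed candidates;
# adjust to what is installed in ~/.local/lib/python3.10/site-packages).
suspects = ["asap3", "mpi4py", "gpaw"]

for name in suspects:
    # Import in a fresh interpreter so an MPI abort only kills the child.
    result = subprocess.run(
        [sys.executable, "-c", f"import {name}"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        print(f"{name}: imports cleanly")
    else:
        # Note: a package that simply isn't installed also fails here,
        # with ModuleNotFoundError in stderr - so read the output, not
        # just the exit code. An MPI-linked culprit should instead show
        # the same opal_init/orte_init messages as in the ase test run.
        print(f"{name}: FAILED (exit code {result.returncode})")
        print(result.stderr.strip())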

Uninstalling asap3 works. Thanks for the hint.
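For anyone hitting the same thing: assuming asap3 was installed per-user with pip, like the other packages listed above, removing it is the usual

python3.10 -m pip uninstall asap3

run with the same Python that installed it.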