opal_os_dirpath_create: Error: Unable to create the sub-directory

Hi,

This may be more of an MPI problem than a LAMMPS problem, but I am getting the following error:

opal_os_dirpath_create: Error: Unable to create the sub-directory (/var/folders/34/fl2q4ldn1xn2czpgk6l66rkr0000gr/T/openmpi-sessions-khuston@freyr_0/10523) of (/var/folders/34/fl2q4ldn1xn2czpgk6l66rkr0000gr/T/openmpi-sessions-khuston@freyr_0/10523/0/0), mkdir failed [1]

[my.host.name:48338] [[10523,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 107

[my.host.name:48338] [[10523,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 402

[my.host.name:48338] [[10523,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 634

This occurs after running around 100,000 LAMMPS sessions (I’m doing forward flux sampling) within a couple of hours. Based on a discussion from a couple of years ago on the Open MPI mailing list, it sounds like this could occur because process IDs get recycled, so MPI tries to mkdir a session directory whose name already exists. However, that discussion also indicated that mpiexec should clean up its session directories when the process ends, and that the problem would be fixed in Open MPI 1.7.5, so I’m not sure why it’s happening.

My only idea is to rewrite my script to run all the simulations within a small number of persistent LAMMPS sessions instead of starting and stopping so many separate sessions. Are there any other ideas about this error?

Thanks!

Kyle
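
One way to check the stale-directory hypothesis is to look for session directories left behind in the temporary directory. The sketch below is only an illustration: it assumes the `openmpi-sessions-*` naming seen in the error message above, the helper name `leftover_session_dirs` is hypothetical, and other Open MPI versions may lay out their session directories differently.

```python
import glob
import os
import tempfile


def leftover_session_dirs(tmp_base=None):
    """Map each openmpi-sessions-* directory to its sub-directories.

    Assumption: session directories follow the naming pattern shown in the
    error message (openmpi-sessions-<user>@<host>_0/<job id>).
    """
    base = tmp_base or tempfile.gettempdir()
    found = {}
    for top in glob.glob(os.path.join(base, "openmpi-sessions-*")):
        if os.path.isdir(top):
            # Each sub-directory corresponds to one job id, e.g. "10523".
            found[top] = sorted(
                name for name in os.listdir(top)
                if os.path.isdir(os.path.join(top, name))
            )
    return found


if __name__ == "__main__":
    for top, jobs in leftover_session_dirs().items():
        print(f"{top}: {len(jobs)} job directories")
```

If this reports many job directories while no MPI job is running, leftover directories are a plausible cause of the failed mkdir.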

No, but look at the "clear" command and the "label" / "jump" / "next"
commands in LAMMPS. You should be able to combine your thousands of
LAMMPS runs into a few and possibly bypass the issue.

axel.

clear looks like exactly what I need. Thanks!
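
For later readers, here is a minimal sketch of how the commands mentioned above could be combined into one looped input script, so that a single LAMMPS session performs many runs. It is written as a small Python generator to fit a driver script like the one described in the original post. The function name `build_looped_input`, the label `run_loop`, and the per-run file `in.single_run` are hypothetical placeholders, not part of the original setup.

```python
def build_looped_input(n_runs, single_run_file="in.single_run"):
    """Return LAMMPS input text that repeats one simulation n_runs times.

    Assumptions: single_run_file is a hypothetical input file holding the
    setup and run commands for one simulation; it may refer to ${i} to vary
    seeds or output file names between iterations.
    """
    lines = [
        f"variable i loop {n_runs}",   # loop counter 1..n_runs
        "label run_loop",              # target of the jump below
        "clear",                       # wipe atoms and settings; variables are kept
        f"include {single_run_file}",  # set up and run one simulation
        "next i",                      # advance the counter
        "jump SELF run_loop",          # repeat until the counter is exhausted
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("in.looped", "w") as handle:
        handle.write(build_looped_input(1000))
```

The generated file would then be launched once, for example with `-in in.looped`, instead of starting one `mpiexec` per simulation. As far as I know, `jump SELF` needs the script to be passed with `-in` rather than piped through stdin.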