UPDATE:
I opened an issue on the OpenMPI GitHub repo, and with the help of some of the developers we found two ways to solve the problem:
- Use the official Arch Linux OpenMPI package
- Configure your OpenMPI build with --with-pmix=external
Original post:
I have been running into the same issue for weeks now.
In my experience, MPI_File_open hangs every second time the program is run. Here is a minimal working example that displays this behaviour:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    MPI_File handle;
    int access_mode = MPI_MODE_CREATE   /* Create the file if it does not exist */
                    | MPI_MODE_WRONLY;  /* Open the file for writing only */

    printf("[MPI process %02d] About to open file.\n", my_rank);
    if (MPI_File_open(MPI_COMM_WORLD, "file.tmp", access_mode, MPI_INFO_NULL,
                      &handle) != MPI_SUCCESS) {
        printf("[MPI process %02d] Failure in opening the file.\n", my_rank);
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }
    printf("[MPI process %02d] File opened successfully.\n", my_rank);

    if (MPI_File_close(&handle) != MPI_SUCCESS) {
        printf("[MPI process %02d] Failure in closing the file.\n", my_rank);
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }
    printf("[MPI process %02d] File closed successfully.\n", my_rank);

    MPI_Finalize();
    return EXIT_SUCCESS;
}
And an example of the behaviour:
mpicc MPIWriteTest.c
mpiexec -n 2 ./a.out
[MPI process 00] About to open file.
[MPI process 01] About to open file.
[MPI process 00] File opened successfully.
[MPI process 01] File opened successfully.
[MPI process 00] File closed successfully.
[MPI process 01] File closed successfully.
mpiexec -n 2 ./a.out
[MPI process 00] About to open file.
[MPI process 01] About to open file.
^C%
mpiexec -n 2 ./a.out
[MPI process 00] About to open file.
[MPI process 01] About to open file.
[MPI process 01] File opened successfully.
[MPI process 00] File opened successfully.
[MPI process 01] File closed successfully.
[MPI process 00] File closed successfully.
mpiexec -n 2 ./a.out
[MPI process 00] About to open file.
[MPI process 01] About to open file.
^C%
This behaviour occurs whether or not the file being opened already exists.
This behaviour does not coincide with any changes in the version of OpenMPI I am running. I first encountered it around 2023-08-21 on v4.1.4, which I had been running without issue for almost a year at that point. I have since upgraded to v4.1.5 but this didn’t change anything. This makes me think that the issue is due to something that occurred during a Manjaro update.
My first suspicion was that this was caused by the upgrade to gcc-13 in the 2023-06-04 stable update. I therefore tried recompiling OpenMPI using gcc-12, but this had no effect.