The latest version of MPI4PY appears to introduce runtime issues. #471

Open
DMCXE opened this issue Jan 22, 2025 · 0 comments
DMCXE commented Jan 22, 2025

In a simsopt installation that is built and managed entirely through conda (including vmec, spec, and boozxform), a runtime error occurs with mpi4py version 4.0.1. The error appears early in an mpirun job, typically after a few iterations, and reads "_pickle.UnpicklingError: invalid load key, '\xee'." (the reported byte varies between ranks). It has been encountered in both VMEC optimization and boozerQA_ls_mpi. The only known workaround is to downgrade mpi4py to a version below 4.0 (3.1.2 was chosen). The cause is still unclear, and it is not known whether a better fix exists. An example of the error from a VMEC optimization run is shown below.

Traceback (most recent call last):
  File "/public/home/dmcxe/omnigenity/OALL/PO/PO-1step-iota75-iota-inner/ODriven.py", line 61, in <module>
    least_squares_mpi_solve(prob, mpi, grad=True,bounds=prob.bounds)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/solve/mpi.py", line 206, in least_squares_mpi_solve
    with MPIFiniteDifference(prob.residuals, mpi, abs_step=abs_step,
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 171, in __enter__
    self.mpi_apart()
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 176, in mpi_apart
    self.mpi.apart(lambda mpi, data: self.mpi_leaders_task(),
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/util/mpi.py", line 263, in apart
    self.leaders_loop(leaders_action)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/util/mpi.py", line 223, in leaders_loop
    action(self, data)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 176, in <lambda>
    self.mpi.apart(lambda mpi, data: self.mpi_leaders_task(),
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 309, in mpi_leaders_task
    self._jac()
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 266, in _jac
    evals = mpi.comm_leaders.reduce(evals, op=mpi4py.MPI.SUM, root=0)
  File "src/mpi4py/MPI.src/Comm.pyx", line 2157, in mpi4py.MPI.Comm.reduce
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1378, in mpi4py.MPI.PyMPI_reduce
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1310, in mpi4py.MPI.PyMPI_reduce_intra
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1169, in mpi4py.MPI.PyMPI_reduce_p2p
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1107, in mpi4py.MPI.PyMPI_recv_p2p
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 206, in mpi4py.MPI.pickle_load
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 195, in mpi4py.MPI.cloads
_pickle.UnpicklingError: invalid load key, '\xee'.
Traceback (most recent call last):
  File "/public/home/dmcxe/omnigenity/OALL/PO/PO-1step-iota75-iota-inner/ODriven.py", line 61, in <module>
    least_squares_mpi_solve(prob, mpi, grad=True,bounds=prob.bounds)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/solve/mpi.py", line 206, in least_squares_mpi_solve
    with MPIFiniteDifference(prob.residuals, mpi, abs_step=abs_step,
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 171, in __enter__
    self.mpi_apart()
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 176, in mpi_apart
    self.mpi.apart(lambda mpi, data: self.mpi_leaders_task(),
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/util/mpi.py", line 263, in apart
    self.leaders_loop(leaders_action)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/util/mpi.py", line 223, in leaders_loop
    action(self, data)
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 176, in <lambda>
    self.mpi.apart(lambda mpi, data: self.mpi_leaders_task(),
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 309, in mpi_leaders_task
    self._jac()
  File "/public/home/dmcxe/Codes/conda_simsopt/1227/simsopt/src/simsopt/_core/finite_difference.py", line 266, in _jac
    evals = mpi.comm_leaders.reduce(evals, op=mpi4py.MPI.SUM, root=0)
  File "src/mpi4py/MPI.src/Comm.pyx", line 2157, in mpi4py.MPI.Comm.reduce
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1378, in mpi4py.MPI.PyMPI_reduce
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1310, in mpi4py.MPI.PyMPI_reduce_intra
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1169, in mpi4py.MPI.PyMPI_reduce_p2p
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 1107, in mpi4py.MPI.PyMPI_recv_p2p
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 206, in mpi4py.MPI.pickle_load
  File "src/mpi4py/MPI.src/msgpickle.pxi", line 195, in mpi4py.MPI.cloads
_pickle.UnpicklingError: invalid load key, '\x00'.
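For anyone reading the tracebacks: "invalid load key" is what Python's pickle raises when the first byte of the data is not a valid pickle opcode. That suggests the bytes received inside reduce were not a pickle stream at all, though this does not say why. The sketch below reproduces the message with the standard library only. It does not use mpi4py, the helper name unpickle_error is made up for illustration, and the byte strings are arbitrary stand-ins for a corrupted receive buffer.

```python
import pickle


def unpickle_error(payload: bytes) -> str:
    """Try to unpickle raw bytes and return the error message, if any.

    Hypothetical helper for illustration only; it is not part of mpi4py
    or simsopt.
    """
    try:
        pickle.loads(payload)
    except pickle.UnpicklingError as exc:
        return str(exc)
    return "payload unpickled without error"


# Arbitrary non-pickle bytes, standing in for a corrupted message buffer.
print(unpickle_error(b"\xee\x01\x02"))
print(unpickle_error(b"\x00\x00\x00"))

# A real pickle stream of a list of floats loads cleanly.
print(unpickle_error(pickle.dumps([1.0, 2.0, 3.0])))
```

The different bytes in the two tracebacks ('\xee' and '\x00') would fit a picture where each rank received different garbage, but that is only a guess based on the error text.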