hiop~mpi includes MPI headers #662

Open
cameronrutherford opened this issue Sep 25, 2023 · 6 comments

Comments

@cameronrutherford
Collaborator

https://github.com/LLNL/hiop/blob/develop/src/Interface/hiopInterface.hpp#L60

https://github.com/pnnl/ExaGO/actions/runs/6304772661/job/17116842345?pr=15

This is a really weird bug: even when building petsc~mpi in the ExaGO package here, PETSc still ships an mpi.h of its own, and that header gets picked up too...

I am still trying to figure out who to blame here, but this seemed like the right place to start.

@cnpetra
Collaborator

cnpetra commented Sep 26, 2023

Are you somehow building HiOp without MPI? Or are HiOp and PETSc using different MPI headers?

@cameronrutherford
Collaborator Author

> Are you somehow building HiOp without MPI? Or are HiOp and PETSc using different MPI headers?

From the ExaGO pipeline, we are building:

exago@develop+hiop~ipopt~mpi~python+raja+tests arch=None-None-x86_64
 -   tx7nd5d  exago@develop%gcc@9.4.0~cuda+hiop~ipo~ipopt+logging~mpi~python+raja~rocm+tests build_system=cmake build_type=RelWithDebInfo dev_path=/__w/ExaGO/ExaGO arch=linux-ubuntu20.04-x86_64
 -   ybikngp      ^camp@0.2.3%gcc@9.4.0~cuda~ipo+openmp~rocm~tests build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   kl43gwj          ^blt@0.4.1%gcc@9.4.0 build_system=generic arch=linux-ubuntu20.04-x86_64
 -   7bzaewm      ^cmake@3.25.2%gcc@9.4.0~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=linux-ubuntu20.04-x86_64
 -   3bxcabf          ^ncurses@6.4%gcc@9.4.0~symlinks+termlib abi=none build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   yekhgie          ^openssl@1.1.1t%gcc@9.4.0~docs~shared build_system=generic certs=mozilla arch=linux-ubuntu20.04-x86_64
 -   djeruao              ^ca-certificates-mozilla@2023-01-10%gcc@9.4.0 build_system=generic arch=linux-ubuntu20.04-x86_64
 -   jymwj6w              ^zlib@1.2.13%gcc@9.4.0+optimize+pic+shared build_system=makefile arch=linux-ubuntu20.04-x86_64
 -   vcsqn5o      ^hiop@0.7.1%gcc@9.4.0~cuda+deepchecking~ginkgo~ipo~jsrun~kron~mpi+raja~rocm~shared~sparse build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   wtvhbiz      ^openblas@0.3.21%gcc@9.4.0~bignuma~consistent_fpcsr+fortran~ilp64+locking+pic+shared build_system=makefile patches=114f95f,a4c642f,c20f518,d3d9b15 symbol_suffix=none threads=none arch=linux-ubuntu20.04-x86_64
 -   5qydzbx          ^perl@5.36.0%gcc@9.4.0+cpanm+open+shared+threads build_system=generic arch=linux-ubuntu20.04-x86_64
 -   e5g7oef              ^berkeley-db@18.1.40%gcc@9.4.0+cxx~docs+stl build_system=autotools patches=26090f4,b231fcc arch=linux-ubuntu20.04-x86_64
 -   gs4r33x              ^bzip2@1.0.8%gcc@9.4.0~debug~pic+shared build_system=generic arch=linux-ubuntu20.04-x86_64
 -   7wdyruu              ^gdbm@1.23%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   wslvyrk      ^petsc@3.18.3%gcc@9.4.0~X~batch~cgns~complex~cuda~debug+double~exodusii~fftw+fortran~giflib~hdf5~hpddm~hwloc~hypre~int64~jpeg~knl~kokkos~libpng~libyaml~memkind+metis~mkl-pardiso~mmg~moab~mpfr~mpi~mumps~openmp~p4est~parmmg~ptscotch~random123~rocm~saws~scalapack+shared~strumpack~suite-sparse~superlu-dist~tetgen~trilinos~valgrind build_system=generic clanguage=C arch=linux-ubuntu20.04-x86_64
 -   kwz7ftm          ^diffutils@3.8%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   y4xrp3s              ^libiconv@1.17%gcc@9.4.0 build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   wnqabk7          ^metis@5.1.0%gcc@9.4.0~gdb~int64~ipo~real64+shared build_system=cmake build_type=RelWithDebInfo patches=4991da9,93a7903,b1225da arch=linux-ubuntu20.04-x86_64
 -   quyjgw3          ^python@3.10.8%gcc@9.4.0+bz2+crypt+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tkinter+uuid+zlib build_system=generic patches=0d98e93,7d40923,f2fd060 arch=linux-ubuntu20.04-x86_64
 -   pgvwni4              ^expat@2.5.0%gcc@9.4.0+libbsd build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   en3zuay                  ^libbsd@0.11.7%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   ps7sxlx                      ^libmd@1.0.4%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   wlq5rko              ^gettext@0.21.1%gcc@9.4.0+bzip2+curses+git~libunistring+libxml2+tar+xz build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   j6aqcps                  ^libxml2@2.10.3%gcc@9.4.0~python build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   zt4ocio                  ^tar@1.34%gcc@9.4.0 build_system=autotools zip=pigz arch=linux-ubuntu20.04-x86_64
 -   xoxeujp                      ^pigz@2.7%gcc@9.4.0 build_system=makefile arch=linux-ubuntu20.04-x86_64
 -   3vtuapf                      ^zstd@1.5.2%gcc@9.4.0+programs build_system=makefile compression=none libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   6sswith              ^libffi@3.4.4%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   2evlwmd              ^libxcrypt@4.4.33%gcc@9.4.0~obsolete_api build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   iuswzm4              ^readline@8.2%gcc@9.4.0 build_system=autotools patches=bbf97f1 arch=linux-ubuntu20.04-x86_64
 -   ghcuaen              ^sqlite@3.40.1%gcc@9.4.0+column_metadata+dynamic_extensions+fts~functions+rtree build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   swhrnzy              ^util-linux-uuid@2.38.1%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   qkxtzoa              ^xz@5.4.1%gcc@9.4.0~pic build_system=autotools libs=shared,static arch=linux-ubuntu20.04-x86_64
 -   w6opye6      ^pkgconf@1.8.0%gcc@9.4.0 build_system=autotools arch=linux-ubuntu20.04-x86_64
 -   7xqyl5b      ^raja@0.14.0%gcc@9.4.0~cuda+examples+exercises~ipo+openmp~rocm+shared~tests build_system=cmake build_type=RelWithDebInfo arch=linux-ubuntu20.04-x86_64
 -   4s36yj3      ^umpire@6.0.0%gcc@9.4.0+c~cuda~device_alloc~deviceconst+examples~fortran~ipo~numa~openmp~rocm+shared build_system=cmake build_type=RelWithDebInfo tests=none arch=linux-ubuntu20.04-x86_64

And so we get the following compile error, shown with its include chain:


     459    In file included from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopInterface.hpp:60,
     460                     from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopNlpFormulation.hpp:59,
     461                     from /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ub
            untu20.04-x86_64/gcc-9.4.0/hiop-0.7.1-vcsqn5ocwcwfihlrbjqozv3ku2rkg
            zo7/include/hiopAlgFilterIPM.hpp:59,
     462                     from /__w/ExaGO/ExaGO/src/opflow/solver/hiop/opflo
            w_hiop.h:7,
     463                     from /__w/ExaGO/ExaGO/src/opflow/solver/hiop/opflo
            w_hiop.cpp:4:
  >> 464    /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9
            .4.0/petsc-3.18.3-wslvyrkkwofiig24a5rm7gctadb7g4fk/include/petsc/mp
            iuni/mpi.h:186:13: error: multiple types in one declaration
     465      186 | typedef int MPI_Comm;
     466          |             ^~~~~~~~
  >> 467    /__w/ExaGO/ExaGO/tpl/spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-9
            .4.0/petsc-3.18.3-wslvyrkkwofiig24a5rm7gctadb7g4fk/include/petsc/mp
            iuni/mpi.h:186:13: error: declaration does not declare anything [-f
            permissive]
  >> 468    make[2]: *** [src/opflow/CMakeFiles/OPFLOW_obj_static.dir/build.mak
            e:261: src/opflow/CMakeFiles/OPFLOW_obj_static.dir/solver/hiop/opfl
            ow_hiop.cpp.o] Error 1

So the HiOp header hiopInterface.hpp, on line 60 (linked in the original issue description), is including hiopMPI.h, which in turn includes mpi.h. That include resolves to whatever mpi.h is on the include path, and here it picks up the one PETSc installs under include/petsc/mpiuni/, which then errors out.

We are building PETSc and HiOp without MPI here, so I honestly think this could be a HiOp and a PETSc bug?

@nychiang
Collaborator

@cnpetra @cameronrutherford
I can successfully build HiOp without MPI.
In hiopMPI.h, mpi.h is not included if we set HIOP_USE_MPI = OFF.

From your log file, I think the problems are:

  1. When HIOP_USE_MPI = OFF, both HiOp and PETSc define their own MPI_Comm.
  2. Not sure where mpi.h is included. It seems to be included via PETSc.

See here
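
One possible reading of the "multiple types in one declaration" error, sketched below: if the non-MPI fallback provides MPI_Comm as a macro (an assumption here, not something confirmed in this thread), then a later `typedef int MPI_Comm;` such as the one on line 186 of the PETSc header in the log is rewritten by the preprocessor before the compiler sees it. The helper names `EXAMPLE_STR`, `EXAMPLE_STR_IMPL` and `stub_typedef_after_expansion` are made up for illustration.

```cpp
#include <string>

// Assumption: a non-MPI fallback that provides MPI_Comm as a macro
// rather than as a typedef.
#define MPI_Comm int

// Two-level stringification so the argument is macro-expanded
// before it is turned into a string literal.
#define EXAMPLE_STR_IMPL(x) #x
#define EXAMPLE_STR(x) EXAMPLE_STR_IMPL(x)

// Returns the text of a `typedef int MPI_Comm;` declaration as the
// compiler would see it while the macro above is in effect.
inline std::string stub_typedef_after_expansion() {
  return EXAMPLE_STR(typedef int MPI_Comm;);
}
```

With the macro active, the returned text contains `int int` where the type name used to be, which is not a valid declaration. If this is what happens in the ExaGO build, the order of the two headers decides whether the build fails.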

@cameronrutherford
Collaborator Author

cameronrutherford commented Sep 26, 2023

> @cnpetra @cameronrutherford
>
> I can successfully build HiOp without MPI.
>
> In hiopMPI.h, mpi.h is not included if we set HIOP_USE_MPI = OFF.
>
> From your log file, I think the problems are:
>
>   1. When HIOP_USE_MPI = OFF, both HiOp and PETSc define their own MPI_Comm.
>
>   2. Not sure where mpi.h is included. It seems to be included via PETSc.
>
> See here

I'm following, but one clarification: I am also able to build hiop~mpi on its own. The issue only happens when exago~mpi tries to build against both petsc~mpi and hiop~mpi.

Why do HiOp and PETSc both need to define MPI_Comm in these non-mpi builds?

@cameronrutherford
Collaborator Author

cameronrutherford commented Sep 26, 2023

Again, this might technically be an ExaGO (or PETSc or HiOp) issue, but I'm trying to figure out who's to blame here.

@cnpetra
Collaborator

cnpetra commented Sep 27, 2023

We had this issue before with MFEM, if I recall correctly. One of the defines has to go. I think HiOp can work with however PETSc defines MPI_Comm, so an easy fix would be for HiOp to check whether it is already defined. This applies when HIOP_USE_MPI is off.
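
A minimal sketch of what such a check could look like, not HiOp's actual code. The build log shows the PETSc header declaring `typedef int MPI_Comm;`, and the preprocessor cannot test for a typedef, so the guard needs some macro that the other header sets. `EXAMPLE_MPI_STUB_PROVIDED` and `default_comm` are hypothetical names used only for this illustration, and the value 0 for `MPI_COMM_WORLD` is likewise an assumption.

```cpp
#include <type_traits>

// --- Stand-in for a stub MPI header that has already been included ---
// EXAMPLE_MPI_STUB_PROVIDED is a hypothetical marker, not a real PETSc macro.
#define EXAMPLE_MPI_STUB_PROVIDED 1
typedef int MPI_Comm;

// --- Stand-in for a guarded non-MPI fallback on the HiOp side ---
#ifndef HIOP_USE_MPI
#  ifndef EXAMPLE_MPI_STUB_PROVIDED
typedef int MPI_Comm;        // declared only when nothing else provided it
#  endif
#  ifndef MPI_COMM_WORLD
#    define MPI_COMM_WORLD 0 // placeholder value for serial builds
#  endif
#endif

// Both sides now agree on a single definition of the type.
static_assert(std::is_same<MPI_Comm, int>::value,
              "MPI_Comm should be a plain int in this non-MPI sketch");

// Hypothetical helper showing that downstream code compiles unchanged.
inline MPI_Comm default_comm() { return MPI_COMM_WORLD; }
```

Note that C++ allows an identical typedef to be repeated at namespace scope, so the guard matters mainly when one side uses a macro instead of a typedef.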
