Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check issues with GCC-14 new errors #12169

Closed
devreal opened this issue Dec 19, 2023 · 11 comments
Closed

Check issues with GCC-14 new errors #12169

devreal opened this issue Dec 19, 2023 · 11 comments

Comments

@devreal
Copy link
Contributor

devreal commented Dec 19, 2023

Background information

The upcoming GCC 14 will be more picky about things that have been warnings so far:

https://discourse.nixos.org/t/rfc-more-c-errors-by-default-in-gcc-14/27390

This was already reported to break the 5.0.0 release on the user ML: https://www.mail-archive.com/users@lists.open-mpi.org/msg35255.html

The reported build failure was in PMIx oshmem component and there may be other places in OMPI that are affected once the build progresses past PMIx.

Ideally, we make sure Open MPI builds with GCC 14 before it is released.

For reference, the reported build error is:

../../../../oshmem/mca/memheap/base/memheap_base_mkey.c: In function 'mca_memheap_modex_recv_all':
/home/bigpack/openmpi-paq/openmpi-5.0.0/3rd-party/openpmix/include/pmix_deprecated.h:851:32: error: passing argument 2 of 'PMIx_Data_buffer_unload' from incompatible pointer type [-Wincompatible-pointer-types]
  851 |     PMIx_Data_buffer_unload(b, &(d), &(s))
      |                                ^~~~
      |                                |
      |                                void **
@rhc54
Copy link
Contributor

rhc54 commented Dec 19, 2023

Hmmm...little problem here. That macro is not used anywhere in PMIx itself, so the error isn't in PMIx - it is somewhere else in OMPI that uses the macro. In this case, the error is in oshmem:

../../../../oshmem/mca/memheap/base/memheap_base_mkey.c: In function 'mca_memheap_modex_recv_all'

@devreal
Copy link
Contributor Author

devreal commented Dec 19, 2023

Thanks @rhc54, I didn't realize that. The called function PMIx_Data_buffer_unload is in the pmix_deprecated.h header. While we're touching that, should it be replaced with something else?

@rhc54
Copy link
Contributor

rhc54 commented Dec 19, 2023

The Standard committee voted to deprecate all the macro versions of those functions, which is why the macro (PMIX_DATA_BUFFER_UNLOAD) is in the pmix_deprecated.h header. However, I would advise not changing your code to use the function as you would then have to add protection for older versions of PMIx - the macro may be deprecated, but it isn't going away.

If someday you bump your minimum supported version of PMIx to something post-deprecation, then it would be safe to remove those references. Deprecation occurred in PMIx v4.2.4. Right now, OMPI v5's min PMIx version is v4.2.0.

@rhc54
Copy link
Contributor

rhc54 commented Dec 19, 2023

BTW: it is entirely possible the error is in the PMIx header where we translate the macro to the functional form. It is a tad odd that it includes an ampersand in front of that argument. As I said, we don't use that macro, so it could easily go undetected.

Just took a peek at PRRTE, and we do use the macro there in one place. In that usage, the ampersand would be correct - just seems a tad weird when you look at it. I can take a look and see if I can resurrect the reasoning for its presence - might take me a day or two.

@devreal
Copy link
Contributor Author

devreal commented Dec 19, 2023

Looking at the 5.0 standard, the data argument is void* and OUT for PMIX_DATA_BUFFER_UNLOAD (hence the additional indirection, I guess). So the oshmem code is correct. The implementation (PMIx_Data_buffer_unload) has the argument as char**, so the datatype mismatch happens in the mapping from the macro to the implementation. I'm not sure where the interface for PMIx_Data_buffer_unload is standardized (if it is, didn't see it in the 5.0 document) but the oshmem code seems correct to me based on the definition of PMIX_DATA_BUFFER_UNLOAD.

@rhc54
Copy link
Contributor

rhc54 commented Dec 19, 2023

The issue is created by changing to the functional form. The macro can contain multiple lines, and so one can take an argument to the macro and place it on the left side of an = sign to assign a returned value to it.

Cannot do that for a function - instead, you have to pass the extra level of indirection to the function arguments. Hence the addition of the ampersand.

I suspect this new experimental compiler is complaining because the function argument is defined as char**, and I suspect the oshmem usage is passing something else. We probably need to add a cast into the macro translation to avoid the problem. I can add that, though it won't be in time for OMPI v5.0.1.

The Standard group decided to make the transition - but I don't believe anyone has yet written up the new functions. 🤷‍♂️ Pretty sure there is an issue about someone needing to do it.

@qkoziol
Copy link
Contributor

qkoziol commented Dec 19, 2023

I mentioned the perl script that we've used in HDF5 when fixing warnings on this morning's OMPI developer call, and thought that I'd add it here, so it's preserved for reference later, in case anyone would like to use it.

HDF5 turns on the following set of warnings for a "kitchen sink developer warnings" build ("env CC=gcc-13 ./configure --enable-build-mode=debug --enable-maintainer-mode --enable-developer-warnings --enable-diags --disable-shared --enable-map-api --enable-threadsafe --enable-build-all --enable-mirror-vfd --enable-unsupported") with GCC 13:

-std=c99  -Wall -Wcast-qual -Wconversion -Wextra -Wfloat-equal -Wfor
mat=2 -Winit-self -Winvalid-pch -Wmissing-include-dirs -Wshadow -Wundef -Wwrite-
strings -pedantic -Wno-c++-compat -Wlarger-than=2560 -Wlogical-op -Wframe-larger
-than=16384 -Wpacked-bitfield-compat -Wsync-nand -Wno-unsuffixed-float-constants
 -Wdouble-promotion -Wtrampolines -Wstack-usage=8192 -Wmaybe-uninitialized -Wdat
e-time -Warray-bounds=2 -Wc99-c11-compat -Wduplicated-cond -Whsa -Wnormalized -W
null-dereference -Wunused-const-variable -Walloca -Walloc-zero -Wduplicated-bran
ches -Wformat-overflow=2 -Wformat-truncation=1 -Wattribute-alias -Wcast-align=st
rict -Wshift-overflow=2 -Wattribute-alias=2 -Wmissing-profile -Wc11-c2x-compat -
ftrapv -fno-common  -g -fno-omit-frame-pointer  -Wbad-function-cast -Wcast-align
 -Wformat -Wimplicit-function-declaration -Wint-to-pointer-cast -Wmissing-declar
ations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpacked -Wp
ointer-sign -Wpointer-to-int-cast -Wredundant-decls -Wstrict-prototypes -Wswitch
 -Wunused-but-set-variable -Wunused-variable -Wunused-function -Wunused-paramete
r -Wincompatible-pointer-types -Wint-conversion -Wshadow -Wrestrict -Wcast-funct
ion-type -Wmaybe-uninitialized -Waggregate-return -Wdisabled-optimization -Wmiss
ing-format-attribute -Wmissing-noreturn -Wswitch-default -Wswitch-enum -Wunsafe-
loop-optimizations -Wunused-macros -Wjump-misses-init -Wstrict-overflow=5 -Wsugg
est-attribute=const -Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wvect
or-operation-performance -Wsuggest-attribute=format -Wstringop-overflow=2 -Wstri
ngop-overflow=4 -Wsuggest-attribute=cold -Wsuggest-attribute=malloc -Warith-conv
ersion -Warith-conversion -Og

I've attached thee output from "make 2>&1 | tee make.out.txt" to this comment, along with the "warnhist" (for "warning histograms") script: (rename the attached "warnhist.txt to just "warnhist" or "warnhist.pl", etc)

make.out.txt
warnhist.txt

"warnhist -h" will give you the options for running the script, but here's some examples.

"warnhist make.out.txt" :

Total unique [non-ignored] warnings: 21
Ignored notes / supplemental warning lines [not counted in unique warnings]: 0
Duplicated warning lines [not counted in unique warnings]: 0
Total ignored warnings: 0
Total unique kinds of warnings: 6
Total files with warnings: 16

# of Warnings by frequency (file count)
=======================================
[ 0]   13 ( 8) - function might be candidate for attribute '-' [-Wsuggest-attribute=-]
[ 1]    3 ( 3) - function '-' might be a candidate for '-' format attribute [-Wsuggest-attribute=-]
[ 2]    2 ( 2) - macro '-' is not used [-Wunused-macros]
[ 3]    1 ( 1) - string length '-' is greater than the length '-' ISO C99 compilers are required to support [-Woverlength-strings]
[ 4]    1 ( 1) - conversion to '-' from '-' may change the sign of the result [-Warith-conversion]
[ 5]    1 ( 1) - conversion from '-' to '-' may change value [-Warith-conversion]

# of Warnings by filename (warning type)
========================================
[  0]    6 ( 1) - hdf5/tools/src/h5import/h5import.c
[  1]    1 ( 1) - hdf5/tools/test/perform/zip_perf.c
[  2]    1 ( 1) - hdf5/src/H5Tfields.c
[  3]    1 ( 1) - hdf5/tools/libtest/h5tools_test_utils.c
[  4]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_remote.c
[  5]    1 ( 1) - hl/src//H5LTanalyze.c
[  6]    1 ( 1) - hdf5/src/H5Pint.c
[  7]    1 ( 1) - hdf5/src/H5Dchunk.c
[  8]    1 ( 1) - H5build_settings.c
[  9]    1 ( 1) - hdf5/test/thread_id.c
[ 10]    1 ( 1) - hdf5/test/atomic_writer.c
[ 11]    1 ( 1) - hdf5/tools/test/h5dump/h5dumpgentest.c
[ 12]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_writer.c
[ 13]    1 ( 1) - hdf5/src/H5ESint.c
[ 14]    1 ( 1) - hdf5/test/event_set.c
[ 15]    1 ( 1) - hdf5/src/H5EAhdr.c

"warnhist -w 1 make.out.txt" :

Total unique [non-ignored] warnings: 21
Ignored notes / supplemental warning lines [not counted in unique warnings]: 0
Duplicated warning lines [not counted in unique warnings]: 0
Total ignored warnings: 0
Total unique kinds of warnings: 6
Total files with warnings: 16

# of Warnings by frequency (file count)
=======================================
[ 0]   13 ( 8) - function might be candidate for attribute '-' [-Wsuggest-attribute=-]
[ 1]    3 ( 3) - function '-' might be a candidate for '-' format attribute [-Wsuggest-attribute=-]
	   1 - hdf5/test/thread_id.c
	   1 - hdf5/tools/test/perform/zip_perf.c
	   1 - hdf5/utils/mirror_vfd/mirror_remote.c
[ 2]    2 ( 2) - macro '-' is not used [-Wunused-macros]
[ 3]    1 ( 1) - conversion from '-' to '-' may change value [-Warith-conversion]
[ 4]    1 ( 1) - conversion to '-' from '-' may change the sign of the result [-Warith-conversion]
[ 5]    1 ( 1) - string length '-' is greater than the length '-' ISO C99 compilers are required to support [-Woverlength-strings]

# of Warnings by filename (warning type)
========================================
[  0]    6 ( 1) - hdf5/tools/src/h5import/h5import.c
[  1]    1 ( 1) - hdf5/tools/test/h5dump/h5dumpgentest.c
[  2]    1 ( 1) - hdf5/test/thread_id.c
[  3]    1 ( 1) - hdf5/tools/test/perform/zip_perf.c
[  4]    1 ( 1) - hl/src//H5LTanalyze.c
[  5]    1 ( 1) - hdf5/src/H5Dchunk.c
[  6]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_writer.c
[  7]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_remote.c
[  8]    1 ( 1) - hdf5/src/H5ESint.c
[  9]    1 ( 1) - hdf5/src/H5Tfields.c
[ 10]    1 ( 1) - hdf5/src/H5EAhdr.c
[ 11]    1 ( 1) - hdf5/test/atomic_writer.c
[ 12]    1 ( 1) - hdf5/tools/libtest/h5tools_test_utils.c
[ 13]    1 ( 1) - hdf5/test/event_set.c
[ 14]    1 ( 1) - hdf5/src/H5Pint.c
[ 15]    1 ( 1) - H5build_settings.c

"warnhist -f 0 make.out.txt" :

Total unique [non-ignored] warnings: 21
Ignored notes / supplemental warning lines [not counted in unique warnings]: 0
Duplicated warning lines [not counted in unique warnings]: 0
Total ignored warnings: 0
Total unique kinds of warnings: 6
Total files with warnings: 16

# of Warnings by frequency (file count)
=======================================
[ 0]   13 ( 8) - function might be candidate for attribute '-' [-Wsuggest-attribute=-]
[ 1]    3 ( 3) - function '-' might be a candidate for '-' format attribute [-Wsuggest-attribute=-]
[ 2]    2 ( 2) - macro '-' is not used [-Wunused-macros]
[ 3]    1 ( 1) - string length '-' is greater than the length '-' ISO C99 compilers are required to support [-Woverlength-strings]
[ 4]    1 ( 1) - conversion from '-' to '-' may change value [-Warith-conversion]
[ 5]    1 ( 1) - conversion to '-' from '-' may change the sign of the result [-Warith-conversion]

# of Warnings by filename (warning type)
========================================
[  0]    6 ( 1) - hdf5/tools/src/h5import/h5import.c
	   6 - function might be candidate for attribute '-' [-Wsuggest-attribute=-]
[  1]    1 ( 1) - hdf5/tools/test/h5dump/h5dumpgentest.c
[  2]    1 ( 1) - hdf5/src/H5Dchunk.c
[  3]    1 ( 1) - hdf5/test/atomic_writer.c
[  4]    1 ( 1) - hdf5/src/H5Pint.c
[  5]    1 ( 1) - hdf5/src/H5ESint.c
[  6]    1 ( 1) - hdf5/test/thread_id.c
[  7]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_writer.c
[  8]    1 ( 1) - H5build_settings.c
[  9]    1 ( 1) - hdf5/src/H5EAhdr.c
[ 10]    1 ( 1) - hdf5/test/event_set.c
[ 11]    1 ( 1) - hdf5/tools/test/perform/zip_perf.c
[ 12]    1 ( 1) - hdf5/tools/libtest/h5tools_test_utils.c
[ 13]    1 ( 1) - hdf5/utils/mirror_vfd/mirror_remote.c
[ 14]    1 ( 1) - hl/src//H5LTanalyze.c
[ 15]    1 ( 1) - hdf5/src/H5Tfields.c

@rhc54
Copy link
Contributor

rhc54 commented Dec 19, 2023

@devreal I committed the change to PMIx master branch - if you get a chance, could you update your submodule pointer and see if the issue is gone?

@devreal
Copy link
Contributor Author

devreal commented Dec 21, 2023

Thanks @rhc54, with current openpmix master my build succeeds (and I made sure it fails without your patch). There does not seem to be another place in OMPI where GCC strikes us out, at least for the configuration I had.

@rhc54
Copy link
Contributor

rhc54 commented Dec 21, 2023

Very good - thanks!

@janjust
Copy link
Contributor

janjust commented Jan 25, 2024

fixed, in main and v5.0.x openpmix/openpmix@142625e

@janjust janjust closed this as completed Jan 25, 2024
lamikr added a commit to lamikr/rocm_sdk_builder that referenced this issue May 29, 2024
- add missing construct_event_strings.py
- add fix from upstream bug
  open-mpi/ompi#12169
  which then triggers the usage of construct_event_strings.py

Signed-off-by: Mika Laitio <lamikr@gmail.com>
lamikr added a commit to lamikr/rocm_sdk_builder that referenced this issue May 29, 2024
- add missing construct_event_strings.py
- add fix from upstream bug
  open-mpi/ompi#12169
  which then triggers the usage of construct_event_strings.py

fixes: #12

Signed-off-by: Mika Laitio <lamikr@gmail.com>
lamikr added a commit to lamikr/rocm_sdk_builder that referenced this issue May 29, 2024
- add missing construct_event_strings.py
- add fix from upstream bug
  open-mpi/ompi#12169
  which then triggers the usage of construct_event_strings.py

fixes: #12

Signed-off-by: Mika Laitio <lamikr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants