Generate bigcount interfaces for Fortran and C #12226

Open
wants to merge 2 commits into
base: main

jtronge
Contributor

@jtronge jtronge commented Jan 10, 2024

This adds scripts for generating the C and Fortran mpi_f08 API bindings from template files, while also generating bigcount interfaces for those that require them. On the Fortran side it also adds support for TS 29113 when possible, allowing for better Fortran array handling that matches the standard (some files were imported from PR #10302).

Python >=3.6 is required for running these scripts, which is only necessary when the binding files have not already been generated. Users of the distribution tarball should not need to generate these files and thus should not require Python.

We used https://github.com/cea-hpc/pcvs-benchmarks and the MPI4PY test suite to help ensure all big count interfaces (C and Fortran) are being generated.

PR #12033 is a previous version of this focused specifically on ABI support.

Member

@jsquyres jsquyres left a comment

I haven't finished reading / understanding / reviewing both generate_bindings.py scripts yet, but since I'm so late for this and the clock is ticking on @jtronge's availability, I'm going to submit what I have so far.


lib@OMPI_LIBMPI_NAME@_usempif08_la_SOURCES = \
$(mpi_api_files) \
mpi-f08.F90

# These are generated; do not ship them
nodist_lib@OMPI_LIBMPI_NAME@_usempif08_la_SOURCES =
# nodist_lib@OMPI_LIBMPI_NAME@_usempif08_la_SOURCES =
Member

Go ahead and delete this line; don't just comment it out.


# JMS Somehow this variable substitution isn't quite working, and I
# don't have time to figure it out. So just wholesale copy the file
# list. :-(
#pmpi_api_files = $(mpi_api_files:%=profile/p%)
#pmpi_api_files = $(mpi_api_files:%=p%)
Member

If you can figure out why this substitution isn't working, go ahead and uncomment it. Otherwise, it would probably be ok to delete this whole commented-out block.

@@ -1389,7 +1389,7 @@ OMPI_DECLSPEC extern struct ompi_predefined_datatype_t ompi_mpi_ub;
/*
* MPI API
*/

#ifndef OMPI_NO_MPI_PROTOTYPES
Member

This is the only file in the whole tree where I see this macro used -- where does it come from?

If we're using it in our infrastructure, we prefer to always define boolean-like macros to be 0 or 1 (vs. defined or undefined). There's less of a chance for error that way.

Contributor Author

@jtronge jtronge Feb 7, 2024

This is left over from the standard ABI code; I can remove it for now.

IIRC the reason I didn't use a 0/1 macro there is because it would require adding the macro in mpicc or forcing the end user to add it. Both the standard ABI and the ompi ABI code were using mpi.h since some types defined there needed to be used in both versions, but the prototypes cannot be the same.

When we start working again on the standard ABI code, it might be better to refactor mpi.h out into multiple files to make this easier.

Member

Ok, gotcha -- those are good reasons. If you end up keeping something like this, having a comment to explain the deviation from our norm (i.e., using #ifndef) would be good.

def main():
if len(sys.argv) < 2:
# Fix required for Python 3.6
Member

What does this comment mean -- does this mean that this script does not work if Python is < v3.6?

Contributor Author

This is a fix for Python v3.6, since the argparse package has some limitations in that version. I have not tested this code below v3.6, unless the CI is using older versions.
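
For context, `add_subparsers()` only gained its `required` keyword in Python 3.7, so a script that must run on 3.6 has to detect a missing sub-command itself. The following is a minimal sketch of that pattern, not the actual check in generate_bindings.py; the helper names `build_parser` and `parse` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with two sub-commands, without relying on required=True."""
    parser = argparse.ArgumentParser(description='demo of a 3.6-safe sub-command check')
    # dest records which sub-command was chosen (None when it was omitted).
    subparsers = parser.add_subparsers(dest='command')
    subparsers.add_parser('header')
    subparsers.add_parser('source')
    return parser


def parse(argv):
    """Parse argv and reject a missing sub-command explicitly."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # parser.error() prints the usage line to stderr and exits with status 2.
        parser.error('missing subparser argument (see --help)')
    return args
```

Checking `args.command` after parsing keeps the error path inside argparse, so the usage line is printed along with the message.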

sys.exit(1)

parser = argparse.ArgumentParser(description='generate ABI header file and conversion code')
subparsers = parser.add_subparsers()
Member

I'm a bit confused by this CLI interface. Running generate_bindings.py --help shows this:

$ ./ompi/mpi/c/generate_bindings.py --help
usage: generate_bindings.py [-h] {header,source} ...

generate ABI header file and conversion code

positional arguments:
  {header,source}

options:
  -h, --help       show this help message and exit

Which doesn't give me a lot of information about how to use this script. In ompi/mpi/c/Makefile.am, I see it used like this:

$(PYTHON) $(srcdir)/generate_bindings.py source ompi $< > $@

I think that means it's falling down into the "source" sub-parser.

What's the header sub-parser for? Is that for future functionality or something?

Would it be possible to use more traditional --foo=bar kinds of CLI arguments, perchance?

Contributor Author

I can try to make --help a little more descriptive about the sub-parsers. The source sub-parser is designed to generate the source code from a template, in this case for the ompi ABI. The header sub-parser is designed to auto-generate the prototype definitions, although it was only used previously for the standard ABI code (it could be useful for the ompi ABI as well).

The standard ABI functionality was completely removed from the Makefiles, but I've left it in the script, which is why some of these arguments are still there but unused.
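
One way to make `--help` more descriptive is to give each sub-parser a `help=` string, which argparse then lists next to the sub-command name. This is only a sketch under assumptions: the positional names `type` and `source_file`, the `choices` list, and the help wording are hypothetical and merely mirror the `source ompi $<` invocation quoted above.

```python
import argparse


def build_parser():
    """Build a parser whose top-level --help describes each sub-command."""
    parser = argparse.ArgumentParser(
        description='generate bindings from template files')
    subparsers = parser.add_subparsers(dest='command', title='sub-commands')
    # The help= text appears beside each sub-command in the top-level --help.
    subparsers.add_parser('header', help='generate prototype definitions')
    parser_source = subparsers.add_parser(
        'source', help='generate source code from a template')
    # Hypothetical positionals modelled on "source ompi <template>".
    parser_source.add_argument('type', choices=['ompi'],
                               help='ABI flavor to generate')
    parser_source.add_argument('source_file', help='template file to read')
    return parser
```

With this layout, `generate_bindings.py source --help` would also document the positional arguments of the `source` sub-command on its own.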

print('ERROR: missing subparser argument (see --help)')
sys.exit(1)

parser = argparse.ArgumentParser(description='generate ABI header file and conversion code')
Member

Is this generating ABI files, or BigCount stuff? I thought this PR was about BigCount and ABI was future...?

...after reading a bunch more of generate_bindings.py, I see that it looks like a lot of the ABI functionality is included, but it doesn't look like it's complete...? E.g., I don't see any other kind of ABI infrastructure (e.g., creating the other libraries, stacking the libraries and implementations like we talked about, etc.).

If we're too late in the game to fully dis-entangle ABI and bigcount into 2 wholly separate commits (with one almost certainly building upon the other), it would be good to explain that this commit has elements of stuff that will be used in an upcoming ABI commit and it wasn't separated because of lack of time blah blah blah. Otherwise, the reader (e.g., me) is left wondering why a "Bigcount" commit has a bunch of unused / half-done ABI stuff included.

Contributor Author

Sorry, I should have been clearer in my commit message. I removed all ABI-specific changes from the make-system, but the script still includes the unused ABI code. This seemed the best route to take, instead of completely removing the extra ABI code, which is not affecting how ompi is being built right now. I can remove it though.

args = parser.parse_args()

# Always add the header
print('/* THIS FILE WAS AUTOGENERATED BY ompi/mpi/c/abi.py. DO NOT EDIT BY HAND. */')
Member

Additionally, it might be better to have a traditional --outfile=FILENAME kind of CLI arg that indicates the name of the file that you want to write to. This leaves stdout available for info/verbose/debug kinds of output.
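
A rough sketch of what such an `--outfile` option could look like is shown here. The function names and the message text are hypothetical; the sketch falls back to stdout when no file is given and sends diagnostics to stderr so they never mix with generated code.

```python
import argparse
import sys


def write_output(text, outfile=None):
    """Write generated text to outfile, or to stdout when no file is given."""
    if outfile is None:
        sys.stdout.write(text)
        return
    with open(outfile, 'w') as fp:
        fp.write(text)


def main(argv=None):
    parser = argparse.ArgumentParser(description='demo of an --outfile option')
    parser.add_argument('--outfile', metavar='FILENAME', default=None,
                        help='file to write generated code to (default: stdout)')
    args = parser.parse_args(argv)
    # Diagnostics go to stderr, so the stdout fallback stays clean as well.
    print('generating bindings...', file=sys.stderr)
    write_output('/* THIS FILE WAS AUTOGENERATED. DO NOT EDIT BY HAND. */\n',
                 args.outfile)
```

In a Makefile rule this would replace the `> $@` redirection with something like `--outfile=$@`.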

* Functions requiring a bigcount implementation should have type COUNT in
place of MPI_Count or int for each count parameter. Bigcount functions will
be generated automatically for any function that includes a COUNT type.
"""
Member

Please add a larger comment here at the top explaining the purpose of this script, and at least a high-level overview of what it does / how it works. This is a very large script; having an intro to it at the top would be most helpful to the reader (e.g., me).
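
To illustrate the `COUNT` convention from the docstring quoted above, here is a deliberately simplified sketch. It assumes a template is a single C prototype string, which is not necessarily how the real template files look; `expand_count` is a hypothetical helper. The only external fact it relies on is that MPI-4 names the C bigcount variants with a `_c` suffix (for example `MPI_Send_c`).

```python
import re

# Match COUNT only as a whole word, so names like PARTITIONED_COUNT are untouched.
COUNT_RE = re.compile(r'\bCOUNT\b')
# Match the function name, i.e. the MPI_ identifier directly before "(".
NAME_RE = re.compile(r'\b(MPI_\w+)(?=\s*\()')


def expand_count(template):
    """Return the prototypes generated from one template prototype.

    A template without COUNT yields itself; a template with COUNT yields
    an int version and an MPI_Count version with a _c suffix.
    """
    if not COUNT_RE.search(template):
        return [template]
    small = COUNT_RE.sub('int', template)
    big = NAME_RE.sub(r'\1_c', COUNT_RE.sub('MPI_Count', template), count=1)
    return [small, big]
```

For a hypothetical template `int MPI_Send(const void *buf, COUNT count, MPI_Datatype datatype)`, this produces the `int` prototype plus an `MPI_Send_c` prototype taking `MPI_Count`.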

@jsquyres
Member

jsquyres commented Feb 7, 2024

Should #12033 be closed (if this PR wholly replaces it)?

@jtronge
Contributor Author

jtronge commented Feb 7, 2024

I think I'll try to remove the extra standard ABI code from the script, as well as the additional items you caught @jsquyres.

I don't think this code completely replaces PR #12033, especially since that one includes the refactor of libmpi into multiple libraries that we discussed.

AS_IF([test $OMPI_TRY_FORTRAN_BINDINGS -ge $OMPI_FORTRAN_USEMPIF08_BINDINGS],
[OMPI_FORTRAN_CHECK_TS([OMPI_FORTRAN_HAVE_TS=1])])

AC_SUBST(OMPI_MPI_SUBARRAYS_SUPPORTED)
Member

I think lines 458 and 459 should be moved below line 472, since it is there that we may be setting OMPI_MPI_SUBARRAYS_SUPPORTED and OMPI_MPI_ASYNC_PROTECTS_NONBLOCKING to true.

The Fortran file ``api_f08_generated.F90`` contains all the internal subroutine
definitions, each of which makes a call into corresponding C functions. Two
different C files are generated: ``api_f08_ts_generated.c`` contains support
for compilers with TS 29113 support, allowing the use of ``CFI_cdesc_t`` types;
Member

might be nice to add a hyperlink here, e.g. https://fortranwiki.org/fortran/show/Fortran+2018

@jtronge
Contributor Author

jtronge commented Feb 14, 2024

/azp run

github-actions bot commented Apr 1, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

01fa44e: Add PARTITIONED_COUNT type

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Hello! The Git Commit Checker CI bot found a few problems with this PR:

e673867: Add int, char, and MPI_Info-related types

  • check_signed_off: does not contain a valid Signed-off-by line

01fa44e: Add PARTITIONED_COUNT type

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Hello! The Git Commit Checker CI bot found a few problems with this PR:

fc151a8: Add DATATYPE, STATUS, and GREQUEST_ types*

  • check_signed_off: does not contain a valid Signed-off-by line

e673867: Add int, char, and MPI_Info-related types

  • check_signed_off: does not contain a valid Signed-off-by line

01fa44e: Add PARTITIONED_COUNT type

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Hello! The Git Commit Checker CI bot found a few problems with this PR:

5127029: Add ELEMENT_COUNT for MPI_Get_elements_x

  • check_signed_off: does not contain a valid Signed-off-by line

fc151a8: Add DATATYPE, STATUS, and GREQUEST_ types*

  • check_signed_off: does not contain a valid Signed-off-by line

e673867: Add int, char, and MPI_Info-related types

  • check_signed_off: does not contain a valid Signed-off-by line

01fa44e: Add PARTITIONED_COUNT type

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@hppritcha
Member

bot:ompi:retest

@hppritcha
Member

bot:ompi:retest

@hppritcha
Member

Odd, the Jenkins community CI is failing in an autogen.pl run:

3. Running template-generating scripts

=== ompi/include/mpif-values.pl
Cannot find executable ompi/include/mpif-values.pl!
Aborting.

@hppritcha
Member

bot:ompi:retest

@jtronge jtronge force-pushed the bigcount branch 2 times, most recently from b86d7de to c3e5885 on October 14, 2024 12:59
hppritcha added a commit to hppritcha/ompi that referenced this pull request Oct 14, 2024
related to PR open-mpi#12226

don't pay much attention to the top level MPI_Buffer_detach etc.
code as it will be redone in open-mpi#12226.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This adds scripts for generating the C API bindings from template files,
while also generating bigcount interfaces for those that require them.
The binding script also includes initial support for the mpi_f08 Fortran
bindings, but doesn't yet make any changes to fortran/use-mpi-f08.

Python >=3.6 is required for running these scripts, which is only
necessary when the binding files have not already been generated.
Users of the distribution tarball should not need to generate these
files and thus should not require Python.

Co-authored-by: mphinney1100 <mphinney@lanl.gov>
Co-authored-by: Howard Pritchard <hppritcha@gmail.com>
Signed-off-by: Jake Tronge <jtronge@lanl.gov>
@jtronge jtronge marked this pull request as ready for review October 18, 2024 22:11
@hppritcha
Member

@jsquyres it's ready!

@hppritcha hppritcha self-requested a review October 19, 2024 17:04
This updates fortran/use-mpi-f08 to generate most of the Fortran
bindings from a script and template files. It also adds support for
Fortran TS 29113 when possible, allowing for better Fortran array
handling that matches the standard.

The C files were imported from PR open-mpi#10302 and converted to templates to
be fed into the binding script.

Co-authored-by: Gilles Gouaillardet <gilles@rist.or.jp>
Co-authored-by: Howard Pritchard <howardp@lanl.gov>
Signed-off-by: Jake Tronge <jtronge@lanl.gov>
@hppritcha
Member

@jsquyres really need this reviewed sometime soon. @jtronge is moving on to other things.

@jeffhammond
Contributor

I'll test it, Howard. Won't be sufficient, but I know this topic pretty well.

@hppritcha
Member

@jeffhammond thanks very much

@jeffhammond
Contributor

I ran https://github.com/jeffhammond/BigMPI/tree/tests/test on it and found bugs in the original tests but none in Open MPI.

This is not sufficient to prove correctness, but is a good sign.

kermit:~/MPI/BigMPI/test$ make -j && for t in `ls -1 *.x` ; do echo "========== $t ==========" && /opt/ompi/largecount/gcc/bin/mpirun -n 16 ./$t ; done
make: Nothing to be done for 'all'.
========== test_allgather_x.x ==========
SUCCESS
========== test_allreduce_x.x ==========
SUCCESS
========== test_alltoall_x.x ==========
SUCCESS
========== test_bcast_x.x ==========
SUCCESS
========== test_gather_x.x ==========
SUCCESS
========== test_irsend_irecv_x.x ==========
SUCCESS
========== test_isend_irecv_x.x ==========
SUCCESS
========== test_issend_irecv_x.x ==========
SUCCESS
========== test_reduce_x.x ==========
SUCCESS
========== test_rma2_x.x ==========
SUCCESS
========== test_rma_x.x ==========
SUCCESS
========== test_rsend_recv_x.x ==========
SUCCESS
========== test_scatter_x.x ==========
SUCCESS
========== test_send_recv_x.x ==========
SUCCESS
========== test_sendrecv_x.x ==========
SUCCESS
========== test_ssend_recv_x.x ==========
SUCCESS

@jeffhammond
Contributor

Note that I modified the original BigMPI tests to not use BigMPI. That might not be obvious unless you look at the source code.

@hppritcha
Member

Your help is much appreciated Jeff!

Member

@jsquyres jsquyres left a comment

I am not finished reviewing, but I thought I'd turn in what I've seen so far.

Generally: this looks very good!

I found several bindings files with type mismatch warnings; I noted many, but not all of them. You can just compile locally and the compiler should yell at you. If you need me to list out the rest of the files, let me know.

I'll keep reviewing more soon.

@@ -652,6 +667,8 @@ end type test_mpi_handle],
AS_IF([test $OMPI_MIN_REQUIRED_FORTRAN_BINDINGS -gt $OMPI_BUILD_FORTRAN_BINDINGS],
[AC_MSG_ERROR([Cannot build requested Fortran bindings, aborting])])

dnl AC_CONFIG_FILES([ompi/mpi/fortran/use-mpi-f08/bindings/mpi-f-interfaces-bind.h])
Member

No need to comment this out -- go ahead and delete it.

}
}

rc = ompi_datatype_get_args( type, 0, num_integers, NULL, num_addresses, NULL,
Member

This line generates a compiler warning about mismatched types.

}
}

rc = ompi_datatype_get_args( mtype, 1, &max_integers, array_of_integers,
Member

This line generates a compiler warning about mismatched types.


rc = ompi_datatype_pack_external(datarep, inbuf, incount,
datatype, outbuf,
outsize, position);
Member

This line generates a compiler warning about mismatched types.

}
}

rc = ompi_datatype_create_darray( size, rank, ndims,
Member

This line creates a compiler warning about mismatched types.

}
}

rc = ompi_datatype_create_hindexed( count, array_of_blocklengths, array_of_displacements,
Member

This line creates a compiler warning about mismatched types.

}
/* data description */
{
const int* a_i[2] = {&count, array_of_blocklengths};
Member

This line creates a compiler warning about mismatched types.

}
}

rc = ompi_datatype_create_subarray( ndims, size_array, subsize_array, start_array,
Member

This line creates a compiler warning about mismatched types.


MOSTLYCLEANFILES = *.mod

CLEANFILES += *.i90

lib_LTLIBRARIES = lib@OMPI_LIBMPI_NAME@_usempif08.la
noinst_LTLIBRARIES = lib@OMPI_LIBMPI_NAME@_usempif08_profile.la
Member

This comment is still relevant.

# TODO: Is there any way to get EXTRA_DIST to work with absolute paths? Or,
# better yet, is there some way to make these dependencies a little
# easier to work with?
extra_dist_prototype_files = \
Member

Do you want to use *_ts.c.in?

@hppritcha
Member

Thanks Jeff. I think almost all of the warnings are due to the lack of embiggening of the internal ompi/opal datatype interfaces, but I'll double check to see if we missed something.

we don't intend to address the lower level datatype issues in this PR.
