-
Notifications
You must be signed in to change notification settings - Fork 868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fortran mpi module: remove interface names #9445
Fortran mpi module: remove interface names #9445
Conversation
Fortran is a deceptively complicated language. It turns out that it is permissable to pass arrays of many sizes and shapes to subroutines that expect 1-dimensional array parameters. For example, a C programmer might look at the following subroutine declaration and assume that it means that the ARRAY_OF_REQUESTS parameter is a 1-dimensional array of any length: ``` MPI_WAITALL(COUNT, ARRAY_OF_REQUESTS, ARRAY_OF_STATUSES, IERROR) INTEGER COUNT, ARRAY_OF_REQUESTS(*), ARRAY_OF_STATUSES(MPI_STATUS_SIZE, *), IERROR ``` However, rules of the Fortran language allow the compiler to effectively morph non-1-dimensional array source parameters to match, for example, the 1-dimensional ARRAY_OF_REQUESTS dummy parameter. Sometimes this will be done by copy-in-copy-out; other times, the compiler can just swizzle the types to match because the source array data is already contiguous in memory. Regardless, there are several rules about when this type swizzling can happen, and there are differences in the rules between when the target subroutine in question is in a named interface such as: ``` INTERFACE FOO SUBROUTINE FOO(arg) INTEGER ARG(*) END SUBROUTINE FOO END INTERFACE ``` versus when the subroutine is in an unnamed interface such as: ``` INTERFACE SUBROUTINE FOO(arg) INTEGER ARG(*) END SUBROUTINE FOO END INTERFACE ``` I will not pretend to be a Fortran language expert and won't quote the language rules here. The short version is that unnamed interfaces allow more flexible type swizzling. For example, a user can pass a 2-dimensional INTEGER array to MPI_WAITALL's ARRAY_OF_REQUESTS argument, and -- assuming that the 2D array is contiguous in memory -- the compiler will make the 2D array type match the 1D ARRAY_OF_REQUESTS argument. Open MPI will get a contiguous chunk of memory of integers, so it doesn't know/care. The MPI standard does not address this issue at all. I.e., it doesn't say whether passing N-dimensional parameters to subroutines like MPI_WAITALL are permissable or not. It also does not specify whether interfaces need to be used, or if interfaces are used, whether they need to be named or unnamed -- these are all implementation decisions. However, there has been a lot of discussion about this in the Fortran group at the MPI Forum over the last week; there will almost certainly be an errata to MPI-4.0 and/or some text changes in MPI-4.1 to clarify the situation. I'm not going to speculate on the final language that will get accepted by the Forum, but it seems like since Open MPI's implentation decision to use named interfaces is perhaps the most restrictive in terms of compiler swizzling functionality, we should probably ease those restrictions so as not to violate the Law of Least Astonishment from the user's perspective (who would expect that the Fortran compiler will be able to swizzle array sizes and shapes). As such, this commit removes all interface names from the Fortran "mpi" bindings module where there's only a single subroutine in the interface, and that subroutine name exactly matches the interface name. Note that this specifically excludes the following routines, which have multiple subroutines in the interface (for two different types of Fortran pointers): * MPI_Alloc_mem * MPI_Win_allocate * MPI_Win_allocate_shared * MPI_Win_shared_query Deleting the interface names *may* cause ABI implications in some compilers. I have not exhaustively tested Fortran compilers to know if this is an issue or not. The safest thing to do is to implement this change for Open MPI v5.0.0 where we're breaking lots of ABI things as compared to prior versions of Open MPI. Note that this issue does *not* affect the mpi_f08 module. Using named interfaces were distinctly specified in the MPI standard. Also, this doesn't affect mpif.h, because Open MPI doesn't declare most interfaces/subroutines in mpif.h. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
@markalle @awlauria This PR should replace #9259 and #9367. I marked this as a blocker because it may have ABI implications, and we therefore need to get this in before 5.0.0. Also, note that we still need to fix the places where we accidentally specify array lengths (e.g., |
These subroutines are no longer referenced in the "mpi" module interface; remove them. I *believe* that these subroutines may have been kept for ABI / backwards linking compatibility at one point. But for Open MPI v5.0.0, we're breaking ABI, so these old symbols do not need to be maintained any more. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The OMPI community jenkins build is in a weird state (it never came up with a "Details" link). Let's try triggering it again... |
bot:ompi:retest |
Looks fantastic. I found one place in the standard (17.1.5) where it gives an example
that supports the idea that they're using the generic interface and we thus don't have to support the fancy argument matching, but I still like this PR's approach of being conservative and assuming we should match it (least astonishment, and the standard isn't totally clear) and especially since it looks like an easy change. I just tested the same strategy in my toy program and it worked for XLF and gfortran for me. But I'm not following the ABI concern. An MPI call into MPI_Isend from either |
Yeah, I talked about this example with Rolf R. I'm quite sure this example is why Past Jeff initially implemented the But at the end of the day, that code snipit is in an advice to implementors, and therefore is non-normative. I.e., it's not mandating that we have to use named interfaces for the
👍
Oh yes, I think you're right: the back-end symbol is the C code in The 2nd commit that I added to this PR, though (removing the legacy symbols, which I think were there solely for preserving ABI with a prior change), definitely removes some symbols and breaks the ABI, though. 😈 |
Yeah, I saw a few symbols going away and wasn't sure what they were, but figured maybe they were old extensions or something. So yeah, removing symbols I agree is changing the ABI |
This fixes the inconsistency between what is tested at the configure time (the unnamed Fortran interface) and what is actually used in the source code (the named Fortran interface). As was mentioned in open-mpi#9445, the unnamed interfaces are more forgiving when it comes to TKR mismatches. Therefore, the current configure-time check is not strict enough and is prone to giving false results (e.g. for NVHPC 23.3+). The fix is to switch to the named Fortran interface in the configure script. Note that the inconsistency was resolved in the main branch with open-mpi#9445, which we cannot cherry-pick because it potentially breaks the ABI compatibility. Signed-off-by: Sergey Kosukhin <sergey.kosukhin@mpimet.mpg.de>
Fortran is a deceptively complicated language. It turns out that it
is permissable to pass arrays of many sizes and shapes to subroutines
that expect 1-dimensional array parameters. For example, a C
programmer might look at the following subroutine declaration and
assume that it means that the ARRAY_OF_REQUESTS parameter is a
1-dimensional array of any length:
However, rules of the Fortran language allow the compiler to
effectively morph non-1-dimensional array source parameters to match,
for example, the 1-dimensional ARRAY_OF_REQUESTS dummy parameter.
Sometimes this will be done by copy-in-copy-out; other times, the
compiler can just swizzle the types to match because the source array
data is already contiguous in memory.
Regardless, there are several rules about when this type swizzling can
happen, and there are differences in the rules between when the target
subroutine in question is in a named interface such as:
versus when the subroutine is in an unnamed interface such as:
I will not pretend to be a Fortran language expert and won't quote the
language rules here. The short version is that unnamed interfaces
allow more flexible type swizzling. For example, a user can pass a
2-dimensional INTEGER array to MPI_WAITALL's ARRAY_OF_REQUESTS
argument, and -- assuming that the 2D array is contiguous in memory --
the compiler will make the 2D array type match the 1D
ARRAY_OF_REQUESTS argument. Open MPI will get a contiguous chunk of
memory of integers, so it doesn't know/care.
The MPI standard does not address this issue at all. I.e., it doesn't
say whether passing N-dimensional parameters to subroutines like
MPI_WAITALL are permissable or not. It also does not specify whether
interfaces need to be used, or if interfaces are used, whether they
need to be named or unnamed -- these are all implementation decisions.
However, there has been a lot of discussion about this in the Fortran
group at the MPI Forum over the last week; there will almost certainly
be an errata to MPI-4.0 and/or some text changes in MPI-4.1 to clarify
the situation. I'm not going to speculate on the final language that
will get accepted by the Forum, but it seems like since Open MPI's
implentation decision to use named interfaces is perhaps the most
restrictive in terms of compiler swizzling functionality, we should
probably ease those restrictions so as not to violate the Law of Least
Astonishment from the user's perspective (who would expect that the
Fortran compiler will be able to swizzle array sizes and shapes).
As such, this commit removes all interface names from the Fortran
"mpi" bindings module where there's only a single subroutine in the
interface, and that subroutine name exactly matches the interface
name. Note that this specifically excludes the following routines,
which have multiple subroutines in the interface (for two different
types of Fortran pointers):
Deleting the interface names may cause ABI implications in some
compilers. I have not exhaustively tested Fortran compilers to know
if this is an issue or not. The safest thing to do is to implement
this change for Open MPI v5.0.0 where we're breaking lots of ABI
things as compared to prior versions of Open MPI.
Note that this issue does not affect the mpi_f08 module. Using
named interfaces were distinctly specified in the MPI standard.
Also, this doesn't affect mpif.h, because Open MPI doesn't declare
most interfaces/subroutines in mpif.h.
Signed-off-by: Jeff Squyres jsquyres@cisco.com