Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

let Fortran array match caller of different rank (in ignore_tkr) #9367

Closed
wants to merge 1 commit into from

Conversation

markalle
Copy link
Contributor

@markalle markalle commented Sep 10, 2021

The below example program uses a 2d array for requests and that gets rejected by the type/kind/rank checking when the MPI_Waitall module says the incoming argument should be an array.

    program main
    use mpi
    integer, contiguous, pointer :: requests(:, :)
    integer :: ierr, sbuf1, sbuf2, rbuf1, rbuf2
    allocate(requests(2,2))
    call MPI_Init(ierr)
    sbuf1 = 1
    sbuf2 = 2
    rbuf1 = -1
    rbuf2 = -1
    call MPI_Irecv(rbuf1, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(1,1), ierr)
    call MPI_Irecv(rbuf2, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(1,2), ierr)
    call MPI_Isend(sbuf1, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(2,1), ierr)
    call MPI_Isend(sbuf2, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(2,2), ierr)
    call MPI_Waitall(size(requests), requests, MPI_STATUSES_IGNORE, ierr)
    print *, 'rbufs:', rbuf1, rbuf2
    if (rbuf1 .ne. 1 .or. rbuf2 .ne. 2) then
        print *, 'failed'
        call MPI_Abort(MPI_COMM_WORLD, MPI_ERR_OTHER, ierr)
    else
        print *, 'passed'
    endif
    call MPI_Finalize(ierr)
    end
% mpifort -o x mpi.F90
> "mpi.F90", line 15.6: 1513-062 (S) Generic procedure reference
> can not be resolved due to incorrect actual argument attributes.
> ** main   === End of Compilation 1 ===
> 1501-511  Compilation failed for file mpi.F90.

The above is XLF, but gfortran is similar.

This commit changes array_of_requests and a bunch of other arrays from

   integer, dimension(count), intent(inout) :: array_of_requests

to

   @OMPI_FORTRAN_IGNORE_TKR_PREDECL@ array_of_requests
   @OMPI_FORTRAN_IGNORE_TKR_TYPE@, intent(inout) :: array_of_requests

This approach throws away more than just the rank check, so for example it would now start failing to detect if an array of reals was passed in as a request array. But I didn't find a good way to be more targeted.

Some compilers can be more targeted, eg XLF can use

    integer count
    !ibm* ignore_tkr (r) array
    integer, dimension(count) :: array

but gfortran as far as I know doesn't have an equivalent, and only has

    integer count
    !GCC$ ATTRIBUTES NO_ARG_CHECK :: array
    type(*), dimension(*) :: array

but the macro system would have to provide a lot more than settings for OMPI_FORTRAN_IGNORE_TKR_PREDECL and OMPI_FORTRAN_IGNORE_TKR_TYPE to produce that code. Plus XLF recognizes the gfortran pragmas so it takes the gfortran path in configure so it wouldn't necessarily even take the more targeted path if we did provide it.

Signed-off-by: Mark Allen markalle@us.ibm.com

The below example program uses a 2d array for requests and that gets
rejected by the type/kind/rank checking when the MPI_Waitall module
says the incoming argument should be an array.

    program main
    use mpi
    integer, contiguous, pointer :: requests(:, :)
    integer :: ierr, sbuf1, sbuf2, rbuf1, rbuf2
    allocate(requests(2,2))
    call MPI_Init(ierr)
    sbuf1 = 1
    sbuf2 = 2
    rbuf1 = -1
    rbuf2 = -1
    call MPI_Irecv(rbuf1, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(1,1), ierr)
    call MPI_Irecv(rbuf2, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(1,2), ierr)
    call MPI_Isend(sbuf1, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(2,1), ierr)
    call MPI_Isend(sbuf2, 1, MPI_INTEGER, 0, 99, MPI_COMM_SELF, requests(2,2), ierr)
    call MPI_Waitall(size(requests), requests, MPI_STATUSES_IGNORE, ierr)
    print *, 'rbufs:', rbuf1, rbuf2
    if (rbuf1 .ne. 1 .or. rbuf2 .ne. 2) then
        print *, 'failed'
        call MPI_Abort(MPI_COMM_WORLD, MPI_ERR_OTHER, ierr)
    else
        print *, 'passed'
    endif
    call MPI_Finalize(ierr)
    end

% mpifort -o x mpi.F90
> "mpi.F90", line 15.6: 1513-062 (S) Generic procedure reference
> can not be resolved due to incorrect actual argument attributes.
> ** main   === End of Compilation 1 ===
> 1501-511  Compilation failed for file mpi.F90.

The above is XLF, but gfortran is similar.

This commit changes array_of_requests and a bunch of other arrays from
   integer, dimension(count), intent(inout) :: array_of_requests
to
   @OMPI_FORTRAN_IGNORE_TKR_PREDECL@ array_of_requests
   @OMPI_FORTRAN_IGNORE_TKR_TYPE@, intent(inout) :: array_of_requests

This approach throws away more than just the rank check, so for example
it would now start failing to detect if an array of reals was passed in
as a request array.  But I didn't find a good way to be more targeted.

Some compilers can be more targeted, eg XLF can use
    integer count
    !ibm* ignore_tkr (r) array
    integer, dimension(count) :: array
but gfortran as far as I know doesn't have an equivalent, and only has
    integer count
    !GCC$ ATTRIBUTES NO_ARG_CHECK :: array
    type(*), dimension(*) :: array
but the macro system would have to provide a lot more than settings
for OMPI_FORTRAN_IGNORE_TKR_PREDECL and OMPI_FORTRAN_IGNORE_TKR_TYPE
to produce that code.  Plus XLF recognizes the gfortran pragmas so it
takes the gfortran path in configure so it wouldn't necessarily even
take the more targeted path if we did provide it.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
@bosilca
Copy link
Member

bosilca commented Sep 14, 2021

The incoming argument for MPI_Waitall must be an unidimensional array, in other words the elements must be stored in contiguous memory locations. The code presented here would allow for ranges in multi-dimensional arrays. Too dangerous to justify the minimal benefit provided here.

Second, Fortran arrays being column-major, while the indexes returned by the any and some flavors of these functions will be row-major, and therefore will require a carefully handling by the caller. Again, way too much opportunity for badness for extremely little benefit.

@markalle
Copy link
Contributor Author

markalle commented Sep 14, 2021

This originally came from a customer (somebody at max planck) and one of our Fortran guys was saying it should be legal according to the MPI standard. The "use mpi" version of the function specification says

MPI_WAITALL(COUNT, ARRAY_OF_REQUESTS, ARRAY_OF_STATUSES, IERROR)
  INTEGER COUNT, ARRAY_OF_REQUESTS(*)
  INTEGER ARRAY_OF_STATUSES(MPI_STATUS_SIZE,*), IERROR

so array_of_requests is an assumed-size array and should match an actual argument array of any rank.

I agree the actual argument has to be contiguous. Our Fortran guy said that's part of the rules for assumed-size arrays.

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to @bosilca's comment, I do not believe that this PR is correct.

The Fortran source code cited in the description is not correct. It is not a valid Fortran MPI program to pass a 2D array of requests to MPI_WAITALL -- the subroutine declaration in all versions of the MPI spec clearly show that it is a 1D array. To be clear: integer, dimension(2, 2) :: requests should never match a dummy parameter type integer array_of_requests(*).

Here's my breakdown:

  1. The allocatable part of the code is a red herring. The real issue is that requests has an array rank of 2, not 1 (i.e., it has 2 dimensions, not 1).
  2. Similarly, it also does not matter that the 2D array entries happen to be contiguous in memory. It's still a 2D array, not 1D.
  3. The MPI spec does show an assumed size dummy parameter array_of_requests in the official interface definition, and in that respect, I think Open MPI's mpi module is wrong and should be fixed to not specify the length of the array (either in this PR and Some fortran fixes #9259, or -- better yet -- consolidated into its own, self-contained PR).
  4. But making the dummy parameter an "ignore TKR" parameter so that we ignore everything about the parameter and potentially opening up the application developer to passing in incorrect parameters -- that's double plus ungood. The whole point of the mpi module was to introduce strict typing to the MPI interfaces so that app developers would be alerted -- at compile time -- when they passed in incorrect parameters.

Specifically: in this case, I believe that Open MPI is correct in failing to compile a call to MPI_WAITALL with a 2D array_of_requests in the mpi module. That's not something that should be fixed.

The same is true for all the other interfaces that were modified in this PR. Many of them were already correctly denoted as assumed size, and should stay that way. Making them "ignore TKR" is actually harmful, because it won't protect MPI app developers from passing incorrect dummy parameter types.

However, we should fix the mpi module to support assumed size array_of_requests in MPI_WAITALL (i.e., pass in any size array of requests; it does not have to be exactly count entries), and any other relevant interfaces where we have it wrong.

Sidenote: passing a 1D array_of_requests to MPI_WAITALL that is exactly count elements is required -- and correctly implemented in Open MPI -- in the mpi_f08 module. This is a difference in the MPI spec between the MPI_WAITALL interfaces in mpif.h/mpi module and the mpi_f08 module.

I'm guessing that the app developer may have used mpif.h in the past, where the prototype for MPI_Waitall was not specified. In that case, passing a contiguous 2D array of requests would happily compile and would probably even work when executed. But when upgrading to the mpi module, the same code failed to compile. This is the canonical definition of "This is not a bug in Open MPI, it's a feature!" 😛

@jsquyres
Copy link
Member

so array_of_requests is an assumed-size array and should match an actual argument array of any rank.

This is the crux of what I believe is incorrect. For example:

$ cat module.F90
module mpi
  interface MPI_Hello
     subroutine MPI_Hello(count, array_of_requests)
       integer :: count
       integer :: array_of_requests(*)
     end subroutine MPI_Hello
  end interface MPI_Hello
end module mpi

program main
  use mpi
  integer count
  integer array_of_requests_1d(2)
  integer array_of_requests_2d(2, 2)
  call MPI_Hello(count, array_of_requests_1d)
  call MPI_Hello(count, array_of_requests_2d)
end program main

Compiling with modern gfortran fails on the call to MPI_Hello with the 2D array:

$ gfortran --version
GNU Fortran (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gfortran module.F90
module.F90:16:45:

   16 |   call MPI_Hello(count, array_of_requests_2d)
      |                                             1
Error: There is no specific subroutine for the generic ‘mpi_hello’ at (1)

And compiling it with ifort also fails similarly:

$ ifort --version
ifort (IFORT) 19.0.4.227 20190416
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

$ ifort module.F90
module.F90(16): error #6285: There is no matching specific subroutine for this generic subroutine call.   [MPI_HELLO]
  call MPI_Hello(count, array_of_requests_2d)
-------^
compilation aborted for module.F90 (code 1)

@markalle
Copy link
Contributor Author

I was going by a Fortran guy overr here's original assessment. I've reached out again to ask if he's still convinced that's supposed to be legal. My intuitive expectation is actually more in line with what you guys are saying about the rank being 1d, but I'll update when I get a response.

@jsquyres
Copy link
Member

Hey @markalle: I also reached out to the Forum Fortran people about this issue. It actually turned out to be extremely murky. The short version is:

  1. I'm not a Fortran expert, but from what I understood, mixing assumed shape and assumed size is a corner case that is not supposed to match. Maybe I'm wrong.
  2. There are cases -- not shown by this specific sample program -- where an N-dimensional array can be passed to an assumed-size dummy parameter.

It's now an open debate as to what the Forum intends. Do they intend that only 1D arrays are supposed to be passed to MPI_WAITALL (and friends)? Or something else? This also gets mixed up into backwards compatibility discussions, etc. Ugh!

There's very lengthy emails going back and forth about this right now... 😝

@markalle
Copy link
Contributor Author

Thanks. I doubt the forum meant to support anything weird, but it wouldn't surprise me if the Fortran language supports weird things and thus the prototype they specified in the standard might be allowing stranger argument matching than they intended.

I looked at our old e-mail stream and what my Fortran guy said then is "in Fortran, assumed-size array dummy arguments can be associated with an array actual argument of any rank" and "the actual argument has to be contiguous. That's part of the rules for assumed-size arrays".

@bosilca
Copy link
Member

bosilca commented Sep 22, 2021

If the assumption that "in Fortran, assumed-size array dummy arguments can be associated with an array actual argument of any rank" is true, I would argue it is at the charge of the compiler to do all the necessary checks and/or the necessary rank conversion. Apparently, neither gfortran nor the IBM fortran compiler are not doing this, forcing us, software developers, to weaken the argument checking by preventing the compiler from doing any check. I'll stick with the current strong checking and not allow 2d arrays or requests to be used with MPI.

@jsquyres
Copy link
Member

@bosilca I agree that "ignore TKR" is the wrong way to go here. Despair not (at least, not for this reason): there is significant emails going around with The Fortran People(tm) to figure out what the Right Thing to do here is. This is effectively a Forum issue now...

@markalle
Copy link
Contributor Author

I got a response back from my Fortran guy. He says it depends if the MPI standard is specifying a specific or generic interface

  • specific interface : is valid to pass in the 2d array
  • generic interface : not legal to match the 2d array

To me the MPI standard isn't super clear which kind of interface it's specifyfing, but down in 17.1 it gives an example

For example, the interface specifications together with the specific procedure names can be implemented with

MODULE mpi
  INTERFACE MPI_Comm_rank  ! (as defined in Chapter 6)
    SUBROUTINE MPI_Comm_rank(comm, rank, ierror)
      INTEGER, INTENT(IN)  :: comm   ! The INTENT may be added although
      INTEGER, INTENT(OUT) :: rank   ! it is not defined in the
      INTEGER, INTENT(OUT) :: ierror ! official routine definition.
    END SUBROUTINE
  END INTERFACE
END MODULE mpi

where the word "interface" here makes it a generic procedure. So I think that's enough to let us off the hook and not support such matching.

@jsquyres
Copy link
Member

@markalle Yes, this is exactly the kind of discussion going on right now... (specific vs. generic interface, backwards compatibility ramifications, ... yadda yadda yadda)

@markalle
Copy link
Contributor Author

Well I'm okay with retracting this PR, do you want to keep it around instead?

@jsquyres
Copy link
Member

We know that we definitely need to fix the (count) arguments to MPI_WAITALL (and potentially others?).

I think we should need wait to see what the Forum decides about the rest of the issues, though.

@gpaulsen gpaulsen marked this pull request as draft September 28, 2021 15:09
@gpaulsen
Copy link
Member

gpaulsen commented Sep 28, 2021

Changed to Draft awaiting MPI Forum decision on what to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants