Conditional Expressions #183

Open
certik opened this issue Oct 5, 2020 · 68 comments
Labels
Fortran 2023 Proposal targeting the next Fortran standard F2023 (previously called F202X)

Comments

@certik
Member

certik commented Oct 5, 2020

Relevant papers:

Taking the second example from the 18-274 paper:

  IF (PRESENT(D)) THEN
    CALL SUB(A,B,C,D)
  ELSE IF (X<1) THEN
    CALL SUB(A,B,C,EPSILON(X))
  ELSE
    CALL SUB(A,B,C,SPACING(X))
  END IF

One proposed syntax is "keyword syntax":

CALL SUB(A, B, C, IF (PRESENT(D)) THEN D ELSE IF (X < 1) THEN EPSILON(X) ELSE SPACING(X) END IF)

The second proposed syntax is "? syntax":

CALL SUB(A, B, C, ? (PRESENT(D)) D :? (X < 1) EPSILON(X) : SPACING(X) ?)
@certik certik added the Fortran 2023 Proposal targeting the next Fortran standard F2023 (previously called F202X) label Oct 5, 2020
@certik
Member Author

certik commented Oct 5, 2020

Which of these do people find the most readable and least readable?

My own feeling (from most readable to least readable):

  1. Original
  2. keyword syntax
  3. ? syntax

We should also consider more use cases. I also asked at https://fortran-lang.discourse.group/t/202x-feature-conditional-expressions/329 to get more feedback on this feature.

@dev-zero

dev-zero commented Oct 5, 2020

To be honest I find both variants suboptimal (the first one would make an if/else/endif block return a value when being used inline, and simply run code as usual when not, but would be easier to read) and I wonder whether there is really a need for general conditional expressions or whether a ternary operator would be sufficient. In the latter case I would suggest the Python-based syntax (which one could even nest):

CALL SUB(A, B, C, D IF PRESENT(D) ELSE (EPSILON(X) IF (X < 1) ELSE SPACING(X)))

@certik
Member Author

certik commented Oct 5, 2020

@dev-zero in your opinion, can you rate by readability / preference the three options above plus your proposed Python-based syntax?

The ternary operator (that you presumably like) is the "? syntax" which is one of the proposed ideas, but you also say they are "suboptimal", so I am confused.

@milancurcic
Member

From most to least readable:

  1. Original
  2. Python/@dev-zero syntax
  3. Keyword syntax
  4. ? syntax

Caveat: I use 2 in Python a lot so I'm used to it, thus my preference.

@epagone

epagone commented Oct 5, 2020

FWIW my preference from most to least readable is:

  1. Original
  2. keyword syntax
  3. ? syntax

Aside from the reduced readability, IMHO I cannot see any practical advantage in the new proposed expressions.

@dev-zero

dev-zero commented Oct 5, 2020

My preference in terms of readability:

  1. Original
  2. Python-inspired syntax
  3. Keyword syntax
  4. ? syntax

As for why I dislike the ? syntax and the difference to the Python-inspired operator:

A general start and end marker ? for conditional expressions would, to me, indicate that more complex conditional expressions should be possible at some point, with the only condition being that all code paths inside it return a compatible type. That is not the case with the current proposal (nor should it be). On the other hand, the nesting shown in the example above effectively creates (at least visually) an alias for ELSE IF, namely :?, since the expression should actually be:

CALL SUB(A, B, C, ? (PRESENT(D)) D : ? (X < 1) EPSILON(X) : SPACING(X) ? ?)

correct? With the second to last ? omitted? Which I would find even less readable.

@everythingfunctional
Member

everythingfunctional commented Oct 5, 2020

I think the readability question might slightly be missing the point. The question isn't about how easy it is to see what the code is doing, but how easy it is to see the code's intent.

To me, an if-else block says the intent of the code is to do different things based on some conditions. The new feature allows one to specify the intent "this variable (or argument) depends on some condition(s)" more clearly.

To use the example from above, if I see

  IF (PRESENT(D)) THEN
    CALL SUB(A,B,C,D)
  ELSE IF (X<1) THEN
    CALL SUB(A,B,C,EPSILON(X))
  ELSE
    CALL SUB(A,B,C,SPACING(X))
  END IF

It's not immediately obvious (especially in codes that are more complicated than this example) that we are definitely going to be calling sub in all cases, and the only difference is the value of the last argument. Whereas with

call SUB(A, B, C, IF (PRESENT(D)) THEN (D) ELSE IF (X < 1) THEN (EPSILON(X)) ELSE (SPACING(X)) END IF)

that intent is put front and center without having to compare each branch in an entire if-else block.

So, my preferences would be for

  1. Keyword syntax - most similar to existing Fortran syntax
  2. Python syntax - still quite likely to be easily understood by new programmers
  3. ? syntax - similar to other languages, but not immediately obvious to new programmers

@certik
Member Author

certik commented Oct 5, 2020

If the idea is to call SUB just once, to make the intent obvious, as @everythingfunctional correctly points out, then here is the alternative to the original:

  IF (PRESENT(D)) THEN
    D2 = D
  ELSE IF (X<1) THEN
    D2 = EPSILON(X)
  ELSE
    D2 = SPACING(X)
  END IF
  CALL SUB(A,B,C,D2)

Which I personally find more readable than:

CALL SUB(A, B, C, IF (PRESENT(D)) THEN D ELSE IF (X < 1) THEN EPSILON(X) ELSE SPACING(X) END IF)

But it is true you have to declare an extra variable (although the performance of the code should be identical with a good compiler).

@milancurcic
Member

Is a subset of this a good candidate for a stdlib function if_then_else(condition, expression1, expression2)?

@everythingfunctional
Member

Is a subset of this a good candidate for a stdlib function if_then_else(condition, expression1, expression2)?

It would be, but merge already would be sufficient, and the idea of the proposal is to not evaluate the unused expression (which merge can't do).
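
To illustrate the evaluation point, here is a minimal sketch in standard Fortran of what has to be written today when one candidate value may be an absent optional argument. The names opt_default_demo and effective_tol are hypothetical, chosen only for this illustration; the branches mirror the EPSILON/SPACING example at the top of the thread.

```fortran
module opt_default_demo
  implicit none
contains
  ! Hypothetical helper: picks the value that would be passed as the
  ! last argument of SUB in the example from the thread.
  function effective_tol(x, d) result(tol)
    real, intent(in)           :: x
    real, intent(in), optional :: d
    real                       :: tol

    ! Writing merge(d, epsilon(x), present(d)) is not safe here: nothing
    ! guarantees that d is left unreferenced when it is absent.
    if (present(d)) then
      tol = d
    else if (x < 1) then
      tol = epsilon(x)
    else
      tol = spacing(x)
    end if
  end function effective_tol
end module opt_default_demo

program main
  use opt_default_demo
  implicit none
  print *, effective_tol(0.5)          ! d absent, x < 1: epsilon branch
  print *, effective_tol(2.0)          ! d absent, x >= 1: spacing branch
  print *, effective_tol(2.0, 1.0e-3)  ! d present: d itself is used
end program main
```

A conditional expression (or a merge-like intrinsic with a non-evaluation guarantee) would let the three branches collapse into a single expression at the call site.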

@everythingfunctional
Member

@certik , while you've removed the difficulty of seeing "do we always call SUB in every case", you've now introduced the ambiguity of "do we always assign to D2 in every case?" You're back to the exact same problem. You're using a multi-statement block to perform a single logical operation. A reader must inspect and evaluate every branch (the block as a whole) to understand it is performing a single logical operation.

@certik
Member Author

certik commented Oct 5, 2020

My understanding is that this feature was approved by WG5 for inclusion into 202X based on a survey, where people expressed a wish to have conditional expressions in Fortran, but I don't think there were concrete proposals for what it would look like, just that the feature would be nice to have. @sblionel is that an accurate statement?

@everythingfunctional explained well the main argument for conditional expressions is that they enforce a single logical operation to be assigned somewhere (as the result of the conditional expression).

I agree that at this level of "requirements" it seems like a good idea and I am not against that (I use it sometimes in Python, although rarely; I never use the ? notation in C or C++). But when it comes down to syntax and what it would actually look like in Fortran, it seems less readable in practice to many people, and that is what should count. If a feature looks good in abstract terms, but does not look good in concrete (syntax) terms, in my opinion we should not put it in until it looks good both in abstract and concrete terms.

A prior compiler implementation of this would be very helpful, so that we can play with it more, before putting it into the language.

@pbrady

pbrady commented Oct 5, 2020

Adopting ?: for a conditional expression but using a different syntax from those languages which already use it for the same purpose (i.e. C, C++, Java, Javascript) is a bad idea and will only lead to confusion. As someone who uses multiple languages, I would appreciate it if Fortran did not do anything weird here and go off the beaten path for no apparent reason.

@klausler

klausler commented Oct 5, 2020

I forgot: why aren't we just fixing MERGE() to guarantee non-evaluation of the operand that isn't selected?

@certik
Member Author

certik commented Oct 5, 2020

@klausler the arguments that have been put forth against fixing merge are:

  • MERGE is elemental. Conditional expressions are not.

  • Conditional expressions are a top-level single selection, so we can lift all the dynamic requirements that merge needs

  • MERGE requires TSOURCE and FSOURCE to have the "same type and type parameters"

I don't personally understand the arguments, but I think @everythingfunctional does? Brad, can you summarize here why we cannot extend merge? If we could just extend merge, that would be the best way forward I think.

@sblionel
Member

sblionel commented Oct 5, 2020

My understanding is that this feature was approved by WG5 for inclusion into 202X based on a survey, where people expressed a wish to have conditional expressions in Fortran, but I don't think there were concrete proposals for what it would look like, just that the feature would be nice to have. @sblionel is that an accurate statement?

Yes. The way we work is that WG5 outlines the general idea and J3 develops that into a specific proposal.

@everythingfunctional
Member

merge is elemental, and as such must evaluate both expressions to determine the shape of the result. The example is, if one argument to merge is a scalar and the other is an array, even if the scalar is the selected value, the array expression must still be evaluated to determine the shape of the result, because the result will still be an array with that shape, just with each value equal to the given scalar. The proposed conditional expressions would not be capable of that, because each expression must have the same rank.
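
The shape argument can be seen in a small standard-Fortran sketch (the program and variable names are made up for the illustration). The scalar TSOURCE is selected, yet the result still takes the shape of the array FSOURCE, so the array operand cannot simply be ignored:

```fortran
program merge_shape_demo
  implicit none
  real :: arr(3) = [1.0, 2.0, 3.0]
  real :: res(3)

  ! MERGE is elemental: the scalar TSOURCE (0.0) and the scalar MASK are
  ! broadcast to the shape of the array FSOURCE. The mask is true, so every
  ! element of the result is the scalar value, but the result is still a
  ! rank-1 array with three elements.
  res = merge(0.0, arr, .true.)

  print *, size(res)   ! number of elements comes from arr
  print *, res         ! all elements equal 0.0
end program merge_shape_demo
```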

@everythingfunctional
Member

@klausler , I actually proposed just that on the J3 discussion board, but Malcolm was able to give me convincing reasons against, along the lines of my previous comment.

@certik
Member Author

certik commented Oct 5, 2020

@everythingfunctional cannot the shape in merge be determined at compile time for both arguments?

@klausler

klausler commented Oct 5, 2020

But MERGE() could guarantee evaluation of at most one of its first two arguments in the case where the third argument is scalar and the first two arguments have the same rank, yes?

@klausler

klausler commented Oct 5, 2020

@everythingfunctional cannot the shape in merge be determined at compile time for both arguments?

Rank, yes, apart from assumed-rank dummy arguments, but not shape.

@everythingfunctional
Member

But MERGE() could guarantee evaluation of at most one of its first two arguments in the case where the third argument is scalar and the first two arguments have the same rank, yes?

I suppose that's true. In fact, one could go a small step further and state that the "unused" argument is evaluated only if necessary to determine the resultant shape. I.e., if the array argument is selected, the scalar argument wouldn't need to be evaluated.

However, that still wouldn't satisfy one of the use cases (although it's not one I find particularly compelling); conditionally supplying an optional argument. I think the proposed conditional expressions provide a clean way to provide that, with a convenient place to put the desired deferred evaluation functionality.

@certik
Member Author

certik commented Oct 5, 2020

conditionally supplying an optional argument

This use case should be done by #22; that seems like a cleaner solution anyway.

So extending merge and implementing #22, we might be able to cover this feature.

@sgeard

sgeard commented Oct 5, 2020

I'd prefer to see something like

select
    case (present(d))
        call sub(a,b,c,d)
    case (i<n) then case (a(i)==0)
        call sub(a,b,c,epsilon(x))
    case default
        call sub(a,b,c,spacing(x))
end select

Or possibly use switch instead of select

@everythingfunctional
Member

#22 is about providing a default value if an argument is not provided; it is about the callee. This proposal is about how an argument could be provided or not; it is about the caller.

For example, say if x < 0.1 I don't want to provide an argument to some procedure, how would merge be sufficient? What would you provide as the other argument? I.e.

call sub(a, merge(???, x, x < 0.1))

@certik
Member Author

certik commented Oct 6, 2020

@everythingfunctional how would that be written using conditional expressions? I don't think that's possible either, or am I missing something?

@everythingfunctional
Member

The proposal specifically allows for the else part to be omitted in cases of passing to optional arguments. So the example would be (in the case of the keyword syntax)

call sub(a, if (x >= 0.1) then (x))

@certik
Member Author

certik commented Oct 6, 2020

I see, I didn't realize that. Btw, I think the syntax is:

call sub(a, if (x >= 0.1) then (x) end if)

or

call sub(a, if (x >= 0.1) then x end if)

But adding parentheses around x actually makes it more readable to me.

Well, we can have some kind of syntax or keyword for merge, such as call sub(a, merge(*, x, x < 0.1)) or something like that, but I am not sure I like it.

@14NGiestas

Is XIFLELSE(1.) a "conditional expression" involving variables X and L or is it a reference to an external function?

So you are saying that the Fortran tokenizer gets rid of all whitespace... hmpf that would be a problem indeed :/
X IF L ELSE (1.) turns into XIFLELSE(1.) which is ambiguous (I think I figured out where I've failed in my VDF now xD)

@everythingfunctional
Member

Is XIFLELSE(1.) a "conditional expression" involving variables X and L or is it a reference to an external function?

In my opinion, with fixed-form source having been made obsolescent, at some point we should be allowed to stop worrying about how new features can be expressed without significant whitespace. And so, XIFLELSE(1.) is a function call, and X IF L ELSE (1.) is a conditional expression. Unless there's some other reason I'm missing.

@certik
Member Author

certik commented Oct 27, 2020

It was discussed at the October 2020 Fortran call that another good use case for this feature would be for array initializers. In Python you can do (I corrected the syntax, thanks to @ivan-pi's comment below):

[x+1 if x >= 45 else x+5 for x in range(1, N+1)]

So in Fortran you could do:

[ (if (x >= 45) then x+1 else x+5 endif, x = 1, N) ]

Since conditional expressions are just expressions, this should just work.
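
For comparison, a rough equivalent that compiles with current compilers can be sketched with MERGE inside the implied-do of an array constructor. The program name and the value of n are assumptions for the illustration. Unlike the proposed conditional expression, MERGE gives no guarantee that only the selected operand is evaluated, which is harmless here because both operands are cheap and always defined.

```fortran
program array_init_demo
  implicit none
  integer, parameter :: n = 50
  integer :: x
  integer :: vals(n)

  ! Element x is x+1 when x >= 45 and x+5 otherwise.
  vals = [(merge(x + 1, x + 5, x >= 45), x = 1, n)]

  print *, vals(44), vals(45)   ! 44+5 = 49 and 45+1 = 46
end program array_init_demo
```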

@ivan-pi

ivan-pi commented Nov 5, 2020

In Python you can do:

[x+1 for x in range(1,N+1) if x >= 45 else x+5]

This triggers a SyntaxError in Python. The correct way would be:

[x+1 if x >= 45 else x+5 for x in range(1,N+1)]

So the syntax is

expression_if_true if condition else expression_if_false

Edit: following Python if expression syntax, the example from above would be:

CALL SUB(A, B, C, D IF PRESENT(D) ELSE (EPSILON(X) IF X < 1 ELSE SPACING(X)))

It seems fairly nice, you just need to remember the condition is in the middle.

Edit2: Oops, I just noticed the posts from @14NGiestas above.

@certik
Member Author

certik commented Jun 23, 2021

New syntax paper being proposed at the June 2021 J3 Committee meeting:

@klausler

New syntax paper being proposed at the June 2021 J3 Committee meeting:

What a mess! Tell me again how this is supposed to be better than a MERGE() intrinsic with stronger non-evaluation guarantees, please.

@certik
Member Author

certik commented Jun 23, 2021

@klausler thanks for the feedback. I personally think we should not do this feature at all (i.e. NO on the 21-157 proposal), as the two syntaxes (keyword and ?) seem worse than not doing this (based on the feedback both online above, as well as in private that I got). We can pursue the merge() idea and see if we can come up with a proposal.

@klausler

@klausler thanks for the feedback. I personally think we should not do this feature at all (i.e. NO on the 21-157 proposal), as the two syntaxes (keyword and ?) seem worse than not doing this (based on the feedback both online above, as well as in private that I got). We can pursue the merge() idea and see if we can come up with a proposal.

I'm sure we've talked about this. In short, strengthen MERGE() so that when its mask (3rd) argument is scalar, and the TSOURCE and FSOURCE arguments have the same type and rank, it guarantees that exactly one of its TSOURCE/FSOURCE arguments is evaluated, and returns the appropriate value. Or define a new intrinsic function with these guarantees.

@sblionel
Member

The idea of modifying MERGE was discussed several meetings ago (I can't find which one). I liked the idea, but there were complaints that it would slow down MERGE for everyone and it failed.

@klausler

The idea of modifying MERGE was discussed several meetings ago (I can't find which one). I liked the idea, but there were complaints that it would slow down MERGE for everyone and it failed.

That's astonishing and credible at the same time.

@urbanjost

I use MERGE but hate the name, so how about "CHOOSE", and it would short-circuit and allow character variables of different lengths as arguments. I think the short-circuit would be enough, but I would definitely want it to be able to work with optional parameters so it could be used with

subroutine a(opt)
character(len=*),intent(in),optional :: opt
character(len=:),allocatable :: opt_local
opt_local=choose(lower(opt),'default',present(opt))
   ...
so "lower" would not be called on an undefined value.

@urbanjost

I really do mean it to be different than MERGE; it would return the first or second argument, which could be of different size, not elementally using a mask, but just based on whether the third argument is T or F.

@urbanjost

And I wish MERGE had been called PICK, given that SELECT is taken.

@klausler

Whoops, I accidentally closed this issue by hitting the wrong button. Fixing now.

@klausler klausler reopened this Jun 23, 2021
@certik
Member Author

certik commented Jun 24, 2021

Here is the discussion of a new intrinsic, say, ifthen by the committee: #183 (comment)

@certik
Member Author

certik commented Jun 24, 2021

Finally, here is another paper that we will vote on next Monday:

This is essentially the "arrow form": #183 (comment)

What is your opinion on that one?

@klausler

Finally, here is another paper that we will vote on next Monday:

This is essentially the "arrow form": #183 (comment)

What is your opinion on that one?

It still changes the syntax of expressions, so it would affect parsing, AST definitions, &c.; this would be needless work in every implementation over and above the straightforward semantic analysis that a new intrinsic function would cost. Tokenizing the new "->" symbol correctly would impose a look-ahead requirement on tokenization of the current "-". And it implies but does not specify the operator precedence of a "cond-expr" -- does it replace a current parenthesized expression in the syntax? Can it be used as a variable?

I hate it less than the first two ("IF" and the "?") but not by much. I don't understand the aversion to using an intrinsic function; they're easy to parse, they nest in obvious ways, and they're more likely to be understood than new operator syntax.

@certik
Member Author

certik commented Jun 24, 2021

I don't understand the aversion to using an intrinsic function; they're easy to parse, they nest in obvious ways, and they're more likely to be understood than new operator syntax.

The only arguments I've heard against an intrinsic function are summarized here: #183 (comment) (harder to do chaining, possibly more confusing due to some arguments not being evaluated).

@veryreverie

veryreverie commented Jun 24, 2021

Would this proposal be limited to use in function calls? And to one-line if statements?

I'd like to suggest a slight frame-change, and propose allowing if statements which return values, e.g.

y = if (x>0) then
  x
else
  0
endif

If this syntax was included, then it gives you the power of "conditional arguments", and more besides. Like a lot of these proposals, this doesn't give you something you couldn't already do, but I feel like this is both clearer and more powerful than the existing alternatives, and as previously mentioned can be chained with other things like array initialisation.

I don't believe this would conflict with existing syntax, in that I don't believe you can currently have statements with return values on their own (e.g. the line x on its own is not allowed).

I guess you'd need to decide what to do when there is no return value because there is no else clause, e.g. in y = if (x<0) then x endif when x>0. Personally I'd favour leaving y however it was before, probably with a compiler warning to suggest adding an else if y is not allocatable or an optional argument.

I suppose also if if statements with return values were allowed, then for consistency the various select case, select type etc. statements should also be allowed to have return values.

For language consistency reasons, I'm against the other syntax options (as much as I personally like the pythonic syntax in python). I can see the arguments for dropping the then and endif parts of the one-line if statement, but I think that if they're required in regular one-line if statements then they should also be required in one-line if statements with return values. I also think the proposed ifthen(x,y,z) function is not ideal, but better than nothing.

For clarity, my ranking of the proposed syntaxes is:

  1. if statements with return values.
  2. The f(a, b, if (x) then y else z endif) syntax.
  3. Some kind of ifthen function.
  4. The original syntax.
  5. [ Big gap ]
  6. everything else.

@klausler

So you want Fortran parsers to be able to handle statements as parts of expressions, and add a new kind of expression-only statement. What happens if one of your "then" or "else" parts is "returning" the value of a variable named ELSE? or END?

@nshaffer

I think all the proposed syntaxes are OK (just OK) for simple conditional expressions, but that they become very hard to read for compound conditional expressions. I can only speak for myself, but every example I read with compound conditional expressions I have to mentally step through branch-by-branch to make sure I understand what it's doing. Basically, in reading

call sub(a, b, c, ? (present(d)) d :? (x < 1) epsilon(x) : spacing(x) ?)

I mentally reconstruct

block 
  real :: d_
  if (present(d)) then
    d_ = d
  else if (x<1) then
    d_ = epsilon(x)
  else
    d_ = spacing(x)
  end if
  call sub(a,b,c,d_)
end block

If I'm reading someone else's code, I'd much rather see the second form than the first (or any of its proposed variations). Yes, it's verbose, but it's also obvious.

I'm more partial to a function-like syntax, i.e., ifthen mentioned upthread (or whatever name). The fact that it's clumsy to chain was cited as a con, but to me it is a pro. It should be awkward to write hard-to-read code.

@milancurcic
Member

We started a poll to collect feedback on this feature. We’ll present the results of the poll to the Committee on Monday when the proposals for this feature are due for discussion and a vote.

@veryreverie

veryreverie commented Jun 24, 2021

So you want Fortran parsers to be able to handle statements as parts of expressions, and add a new kind of expression-only statement.

Ideally, yes. I think it would add benefits to the language.

What happens if one of your "then" or "else" parts is "returning" the value of a variable named ELSE? or END?

I'm surprised these are not already reserved words. I guess a possible solution would be to require brackets or similar, so that the syntax would be

y = if (x>0) then
  (x)
else
  (0)
endif

and with an interesting choice of variable names,

if = if (end>else) then
  (end)
else
  (else)
endif

I guess this would also help with syntax parsing.

@Beliavsky

Beliavsky commented Jun 24, 2021

I support the function syntax. My suggested name for ifthen is lazy_merge, which would emphasize that this function behaves differently from other Fortran functions, where all arguments are evaluated. Lazy_pick or lazy_choose are also possible names.

@klausler

This is not the same thing as lazy evaluation as the term is commonly understood in programming language theory.

@certik
Member Author

certik commented Jun 26, 2021

@klausler here is a draft of a paper for the intrinsic approach: #213, can you please help me finish it, so that I can submit it to the committee as an alternative?

@urbanjost

urbanjost commented Jun 26, 2021

It is a lot easier to do in an interpreted language, and some might not like the reuse of IF, but in a little scripting language I have, where everything is a function, IF acts like a function if given more than one parameter and only evaluates one of the following expressions depending on the result of the conditional, in the form if(expression, eval_if_true, eval_if_false). I have used it so long it seems natural to me. Trying to put that into a Fortran context, it might look like

program testit
   call passto(20)
   call passto()

contains

subroutine passto(a)
integer,optional :: a
integer          :: b
   b=if(present(a),a,10)
   write(*,*)b
end subroutine passto

end program testit

I can think of some reasons that might be disliked, but I have seen a lot of comments about the complexity of some of the solutions, and I have used that for a long time and it is pretty easy to type even interactively. The language also lets logicals return an integer and a lot of other un-Fortranish things, so things like doing a sum() of a bunch of expressions and being able to do something if 2 out of 3 are true is easy, or doing a max() or min() on a list of logical expressions makes sense; but now that there is ANY() and ALL(), Fortran can do something similar. Of course, only evaluating one of the expressions is easy in a scripting language and very much against standard Fortran behavior.
