Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LEAVE phaser with return #417

Open
patrickbkr opened this issue Jan 16, 2024 · 46 comments
Open

LEAVE phaser with return #417

patrickbkr opened this issue Jan 16, 2024 · 46 comments
Labels
language Changes to the Raku Programming Language

Comments

@patrickbkr
Copy link
Member

What should happen when a LEAVE phaser contains a return statement?

Given these four snippets, what should each of those print?

sub s() {
    LEAVE return 5;
    return 7;
}
say s()
sub s(&code) {
    LEAVE return 5;
    code();
}
sub s2() {
    s({
        return 7;
    });
    return 1;
}
say s2()
sub s() {
    LEAVE return 5;
    7;
}
s()
sub s(&code) {
    LEAVE return 1;
    code();
}
sub s2() {
    s({
        return 7;
    });
}
s2()
@patrickbkr patrickbkr added the language Changes to the Raku Programming Language label Jan 16, 2024
@patrickbkr
Copy link
Member Author

With MoarVM/MoarVM#1785 and MoarVM/MoarVM#1782 merged current rakudo prints:

  1. 5
  2. 1
  3. 5
  4. 1

I especially wonder about 2. and 4. It's comprehensible what's happening and why, but feels like a WAT. Maybe it's simply a DIHWIDT, similar to what next does here:

sub s() { next; }
for ^10 { s(); say $_; }

@lizmat
Copy link
Collaborator

lizmat commented Jan 16, 2024

In my opinion a return in a LEAVE phaser shouldn't be different from a return in any other phaser, e.g. ENTER or FIRST. With the new Raku grammar, FIRST appears to be doing that apparently already:

sub a { FIRST return 42 }; dd a;  # 42

So in my book, the return values would be: 5, 1, 5, 1

@lizmat
Copy link
Collaborator

lizmat commented Jan 16, 2024

Or we should disallow return from phasers, period. And make it report an appropriate execution error (or possibly a compile time error in RakuAST).

@vrurg
Copy link
Contributor

vrurg commented Jan 16, 2024

There is a problem with cases 2 and 4. Both has to be 7. And here is why.

Return from a block applies to the closure where the block belongs to. This is why .map({ return 7 }) works as expected. From this point of view it doesn't matter what happens inside of sub s and what it returns – but return within the block must win.

Or we should disallow return from phasers, period.

Better not.

@lizmat
Copy link
Collaborator

lizmat commented Jan 16, 2024

Ok, thinking about this more: I change my answers to 5, 5, 5, 1. And here's why:

In case 2, since &code is a Block, the return is applied to s and not to s2. However, that return implies that it is leaving the scope of s, and thus its LEAVE phaser will fire causing it to return 5.

I don't see an issue with 4. The same logic as 2. applies here.

@vrurg
Copy link
Contributor

vrurg commented Jan 16, 2024

@lizmat you're wrong. Letting a block return from s2 via invoking &code makes the code in s2 unpredictable and will require additional complications like checking if it's a block or not in order to prevent the block from breaking our behavior. Not to mention that it wouldn't be backward compatible:

sub foo(&c) { 
    my \rc = &c(); 
    say "foo rc=", rc.raku; # no output
    rc 
} 
sub bar() { 
    foo { return 1 }; 
    return 2  
}
say bar; # 1

I will remain in the position that a block must return from its closure.

@patrickbkr
Copy link
Member Author

patrickbkr commented Jan 17, 2024

I think it will help to explain what Rakudo currently does technically. Looking at the 2nd example:

sub s(&code) {
    LEAVE return 5;
    code();
}
sub s2() {
    s({
        return 7;
    });
    return 1;
}
say s2()
  • s2 is entered
  • s is entered
  • &code is entered
  • return 7 is called
  • throwpayloadlexcaller is called (https://github.com/MoarVM/MoarVM/blob/main/src/core/interp.c#L4908C3-L4908C3)
  • A CONTROL handler is searched in the lexical scope of return 7. (MVM_exception_throwpayload and then search_for_handler_from)
  • The handler of s2 is found and a frame unwind (following the caller chain) is started targetting the handler in s2 (MVM_frame_unwind_to)
  • During that unwind the following two frames are unwound in order: {return 7}, s()
  • The exit handler of the s() frame is seen. The unwind process is paused (not aborted, only paused), and the LEAVE block is executed.
  • return 5 is called
  • throwpayloadlexcaller is called
  • A CONTROL handler is searched in the lexical scope of return 5.
  • The handler of s is found and a frame unwind (following the caller chain) is started targetting the handler in s
  • During that unwind only the LEAVE return 5 frame is unwound.
  • The handler in s() is reached. s() returns and hands the value 5 to s2().
  • Execution continues in s2().
  • The returned value 5 is sunk.
  • return 1 is called. Same process as above.
  • 1 is sayed

I think the behavior is consistent in that once I understood that &returns are effectively exceptions that search handlers in the outer chain and then unwind the caller chain, the resulting behavior is very comprehensible and expected. Departure from the described behavior will mean adding additional logic to the behavior of stack unwinding. So there would be more logic to know about to fully grasp what's happening.


@vrurg I believe the above behavior is interesting, because your requirement

Return from a block applies to the closure where the block belongs to.

is upheld. (I strongly support this requirement.) But the return 5 in s still manages to interrupt the unwind of return 7 with lasting effect.


Just to state a possibility of how we could adapt the above logic:
We detect the situation where a return (return 5 in the above code) interrupts the unwind of some other return (return 7). Then

if both exceptions are CONTROL {
    if target handlers of both exceptions are equal { # This means the returns are in the same lexical scope
        ... do interrupt the first unwind and proceed with the new return value
    }
    else {
        ... let the first unwind continue, ignore the second `return`
    }
}

This would have the consequence that the results would be:

  1. 5
  2. 7
  3. 5
  4. 7

I'm undecided if I like this or not. It limits action at a distance: The return 5 in s doesn't change the effect of return 7 in s2. But it also adds action at a distance: The return 7 in s2 does change the behavior of the return 5 in s.

@lizmat
Copy link
Collaborator

lizmat commented Jan 17, 2024

All of this explanation that is needed to explain behaviour, is convincing me that a return should NOT be allowed in a phaser. Sadly, there is no way to reliably enforce that at compile time.

But we could at run-time by codegenning a CONTROL block in any phaser block. Conceptually:

LEAVE {
    say "bye";
    return 42;
}

would be codegenned as:

LEAVE {
    CONTROL {
        die "return within a LEAVE phaser" if $_ ~~ CX::Return
    }
    say "bye";
    return 42;
}

Of course, if the return were to be directly in the phaser's body, we could make that a compile time error as well.

Since phasers cannot be inlined anyway, I don't think this will affect performance of phasers much.

@lizmat
Copy link
Collaborator

lizmat commented Jan 17, 2024

To actually not break any code out there that is depending in old "return in phaser" behaviour, we could limit this CONTROL codegenning to 6.e.

@vrurg
Copy link
Contributor

vrurg commented Jan 17, 2024

I'm undecided if I like this or not. It limits action at a distance: The return 5 in s doesn't change the effect of return 7 in s2. But it also adds action at a distance: The return 7 in s2 does change the behavior of the return 5 in s.

What would it change in return 5 behavior? The returned value is not going to be used by definition. So, we can sink it immediately and for the code in s it will go unnoticed.

The noticeable change is that immediate termination of return procedures will prevent a potential CONTROL block in s from being invoked. Perhaps, this can be easy to fix by adding a special if branch which would try the path of seeking for the CONTROL block in the lexical scope without initiating a new unwind.

So, I currently see no cardinal changes that might break s semantics or user expectations about it.

All of this explanation that is needed to explain behaviour, is convincing me that a return should NOT be allowed in a phaser.

While not being fully against this prohibition, I'd not appreciate it. Besides, consider that CATCH and CONTROL are phasers too. Returning a value from CATCH is often an important way to handle an error.

Sadly, there is no way to reliably enforce that at compile time.

Again, hope this won't be needed. But technically, an explicit return can be detected as soon as:

  1. the current lexical context belongs to a phaser (the compiler knows it)
  2. the current lexical scope doesn't belong to a routine (pointy block included)

Both being True means this return is going to cause a trouble.

But we could at run-time by codegenning a CONTROL block in any phaser block.

Even if we need it, I wouldn't be happy about extra implicit code injection. A VM can be more efficient in implementing this. And more universal.

But either way, we haven't considered all possibilities of optimizing the case. The fact that any return within frames, affected by unwind, is not going to be used makes a lot of things easier, as I see it. All we need is to make sure all related CONTROL blocks are fired up.

@patrickbkr
Copy link
Member Author

Adding case 5.

sub s(&code) {
    return 0;
    LEAVE {
        code()
    }
}

sub s2() {
    s({
        return 1;
    });
    return 2;
}

say s2();

This is in some sense the inverse of case 2. (I don't see how we can distinguish the two cases, i.e. to make case 2. give 7 and case 5. give 1.)

Current Rakudo (with the two PRs applied) prints 1.

@vrurg
Copy link
Contributor

vrurg commented Jan 20, 2024

Adding case 5.

It makes no difference with 2. return 1 wins, so it makes 1 eventually.

More interesting the case of s itself. Say, &code is not a block but a routine returning a value. Let it be the 1 we already return. Then we have two options:

  1. no explicit return makes the LEAVE block mute and only the side effects of it are what's important
  2. since LEAVE is the last statement and is executed after the return 0 its return value is the return value of s

But the second option loses for two reasons. First, the return 0 statement is explicit. Second, it would make it harder to learn the rules governing returns from routines.

@patrickbkr
Copy link
Member Author

@vrurg I try to explain some more why I think case 2. vs case 5. is difficult.

case 2.

sub s(&code) {            # target of second return
    LEAVE return 5;       # second return -> collision
    code();
}
sub s2() {                # target of first return
    s({ return 7 });      # first return
    return 1;
}
s2()

case 5.

sub s(&code) {          # target of first return
    return 0;           # first return
    LEAVE { code() }
}

sub s2() {              # target of second return
    s({ return 1 });    # second return -> collision
    return 2;
}
s2()

Let's consider the point during execution marked with "collision".

In case 2 that's return 5. At that point some other return is already in progress (return 7). That other return targets a different routine (s2()).

In case 5 that's return 1. At that point some other return is already in progress (return 0). That other return targets a different routine (s()).

When working with the information given in the previous two paragraphs, the outcome is either 5 and 1 or 7 and 0. I don't see what additional information could be used to help.

@vrurg
Copy link
Contributor

vrurg commented Jan 20, 2024

Hope, we're not discussing different things here. Also hope that you're not looking at the problem from the perspective of implementation.

Separate the concerns. Separate the sources of s and s2. Best if you separate them into modules from different developers. Now, from the point of view of s2, should it care about what happens inside s? It clearly should not. Because in either case 2 or case 5, the block submitted to s returns from s2. This is what I would see as a developer of s2.

Think of it from the day-to-day development. The author of s must remember to document that it's sub is using a LEAVE phaser which is using return. I matters not if one or the other is not used, but when together – we need to document it.

The author of s2 must remember, that s has a phaser which is using a return. No other routine from this module doesn't, but this one does. Ah, and that the return in LEAVE only fires if this, and this, and, perhaps, that condition are all met. And then there is a couple of other modules, and a routine in one of them has similar behavior. But which one??

That's why I say 2 and 5 are the same. Because all what matters here is s2 as the consumer.

@patrickbkr
Copy link
Member Author

Also hope that you're not looking at the problem from the perspective of implementation.

Maybe "inspired by"... Can't prevent that.

Separate the concerns. Separate the sources of s and s2. Best if you separate them into modules from different developers. Now, from the point of view of s2, should it care about what happens inside s? It clearly should not. Because in either case 2 or case 5, the block submitted to s returns from s2. This is what I would see as a developer of s2.

I see what angle you are coming from and I agree.

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it?

@vrurg
Copy link
Contributor

vrurg commented Jan 21, 2024

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it?

I would tentatively agree. Not sure if we miss no nuance here.

patrickbkr pushed a commit to patrickbkr/MoarVM that referenced this issue Jan 21, 2024
Calling &return triggers a stack unwind which can cause code (e.g. in
`LEAVE` phasers) to run. That code can also call `&return`. In such
situations one usually wants the return to win that unwinds more of the
stack. This is the behavior implemented by this change. Previously the
later return always won.

Related to Raku/problem-solving#417
patrickbkr pushed a commit to patrickbkr/roast that referenced this issue Jan 21, 2024
@patrickbkr
Copy link
Member Author

I have added an implementation of the "deeper unwind wins" approach in MoarVM/MoarVM#1786. Spec tests pass including those in Raku/roast#851

patrickbkr pushed a commit to patrickbkr/MoarVM that referenced this issue Jan 21, 2024
Calling &return triggers a stack unwind which can cause code (e.g. in
`LEAVE` phasers) to run. That code can also call `&return`. In such
situations one usually wants the return to win that unwinds more of the
stack. This is the behavior implemented by this change. Previously the
later return always won.

Related to Raku/problem-solving#417
@patrickbkr
Copy link
Member Author

I just have difficulty mapping what that means technically. After some thinking: Always let the return win, that wants to unwind further down the stack. Could that be it?
Not sure if we miss no nuance here.

We do. Always picking the return that unwinds deeper basically disables &return everywhere in respective phasers. That messes up all but the simplest code. It's a miracle that PR managed to get the spec tests to pass.

@patrickbkr
Copy link
Member Author

patrickbkr commented Jan 25, 2024

Another idea:

  • Before calling any exit handler during unwind, we mark the target frame as this frame is returning and save the return value in the target frame (in frame->extra->exit_handler_result). We could also do that right away on every &return, but that would make all &returns pay the performance penalty.
  • On every implicit return (that's all returns, not just &return), check the caller frame for the returning marker. If found, then unwind the current frame, restore the previously saved value and do another unwind. That's the equivalent of two implicit returns in a row (+ restoring the right return value).

I think the semantics are correct and the approach to have frames in a "returning" state feels like it matches the core of the issue.

I see issues though:

  • We add one if to every (implicit and explicit) return. This is not good for performance. Probably negligible, but I test this yet.
  • From an opcode perspective, is this behavior too much magic for throwpayloadlexcaller and the return_*?
  • We make return fundamentally more complex. Returning is involved in inlining and specialization. This makes all of that more complex. I think this point is the most problematic one.

@patrickbkr
Copy link
Member Author

Yet another idea:
We tune the target frames return_address to point at the return handler. To retrieve the correct value, we'll change lastexpayload to check for the frame->returning flag and if set, retrieve the value saved in frame->extra->exit_handler_result instead of tc->last_payload.

patrickbkr pushed a commit to patrickbkr/MoarVM that referenced this issue Jan 27, 2024
In general, if a &return was called in a routine, we don't want any other
code in exit handlers to interfere with that return from returning.

It's possible for exit handlers to escape normal control flow (e.g. via a
&return). In that case instead of reaching the handler the unwind targets,
we would return to the frame's return_address. To make things right, we
tune the unwind target frame's return address to point to the handler.
Also set that frames RETURNING flag and preserve the return value,
getlexpayload can then pick it up again.

Related to Raku/problem-solving#417
patrickbkr pushed a commit to patrickbkr/roast that referenced this issue Jan 27, 2024
@patrickbkr
Copy link
Member Author

I have implemented the last idea in MoarVM/MoarVM#1788, it turned out rather nice. I have also added another test to Raku/roast#851. It passes all those tests. I haven't done a spec test (need to add a JIT implementation to MoarVM/MoarVM#1788 first), but this looks very promising.

@lizmat
Copy link
Collaborator

lizmat commented Feb 25, 2024

To summarize: there's an implementation now that implements: "first return wins", and it's spectest clean?

@patrickbkr
Copy link
Member Author

patrickbkr commented Feb 25, 2024

To summarize: there's an implementation now that implements: "first return wins", and it's spectest clean?

Yes. <-- That was wrong. It's last return in a routine that wins.

@lizmat
Copy link
Collaborator

lizmat commented Feb 25, 2024

The PR is still marked as draft. Maybe it shouldn't? :-)

@patrickbkr
Copy link
Member Author

patrickbkr commented Feb 25, 2024 via email

@lizmat
Copy link
Collaborator

lizmat commented Feb 25, 2024

+1 for first return wins.

We already have a similar behaviour regarding exit:

$ raku -e 'END exit 42; exit 1' 
$ echo $?
1

@niner
Copy link

niner commented Feb 25, 2024

I know I'm late to the party, but anyway.

S06 says "The C<return> function notionally throws a control exception that is caught by the current lexically enclosing C<Routine> to force a return through the control logic code of any intermediate block constructs. (That is, it must unwind the stack of dynamic scopes to the proper lexical scope belonging to this routine.)"

It says this about ENTER/LEAVE/KEEP/UNDO/etc. phasers: "These phasers supply code that is to be conditionally executed before or after the subroutine's C<do> block (only if used at the outermost level within the subroutine; technically, these are added to the block traits on the C<do> block, not the subroutine object). These phasers are generally used only for their side effects, since most return values will be ignored. (Phasers that run before normal execution may be used for their values, however.)"

The first quoted paragraph makes me inclined to suggest "lastest wins" semantics for return. After all return throws a control exception and with exceptions in general it's that any exception occurring after the original exception (e.g. in the exception handler) will become the dominant one that get's passed up to handlers. This matches what I would expect.

But then S06 is quite explicit about phasers, especially LEAVE being about their side effects. So now I'm a bit torn.

@lizmat
Copy link
Collaborator

lizmat commented Feb 25, 2024

being about their side effects

Could also be interpreted as "don't allow return in a phaser" ?

@patrickbkr
Copy link
Member Author

patrickbkr commented Feb 25, 2024 via email

@patrickbkr
Copy link
Member Author

This discussion seems to have stalled.

Let's summarize the options again:

  1. The last return in a routine always wins. Returns in other lexical scopes have no say. That's the result of the discussion of lizmat, vrurg and me. An implementation exists. See the tests for how the implementation exactly behaves.
  2. First return wins. All later returns have no effect. With the exception of a corner case (A routine with a single &return in its LEAVE block. I think that's a bit dubious.), that's similar to silently disabling returns in LEAVE. I think it would be clearer to instead prohibit &return in LEAVE (see 3.).
  3. Prohibit &return in LEAVE.
  4. Return exceptions are not prioritized, last thrown wins. That last return also determines where we are returning to. That's the current behavior. I think most, maybe all in this discussion agree that this is undesired.

@lizmat, @niner: I gave wrong information a few posts earlier, I mixed things up. There is no implementation of first return wins.

I'm in favor of 1.

Everyone in this discussion (@niner @vrurg @lizmat): Can you give a quick info where each of you stands? I think it's possible there isn't even a disagreement here.

@vrurg
Copy link
Contributor

vrurg commented Mar 7, 2024

I still stand for 1, no changes here.

@patrickbkr
Copy link
Member Author

@niner, @lizmat Ping! Can you restate your positions? See #417 (comment)

@lizmat
Copy link
Collaborator

lizmat commented Apr 27, 2024

I can live with 1. However, that is different from the behavior of exit, where the first exit determines the exit value.

patrickbkr pushed a commit to patrickbkr/roast that referenced this issue May 21, 2024
@niner
Copy link

niner commented May 26, 2024

Re-reading this thing I'm with @vrurg that the correct result would e 5, 7, 5, 7. According to the design docs I quoted a return returns from the lexical scope it's contained in, i.e. s2. The LEAVE phasers in s' would still win and change the return value. But it's s1`'s return value that they change which is in both cases just ignored! Again, the return in the LEAVE is returning from the sub that is the return's lexical scope. In the number 4 example it's not as obvious that the return value would be ignored, but if you imagine that as

sub s2() {
    my $result = s({
        return 7;
    });
    return $result;
}

then it's more obvious that because of the return 7 we never even get to set $result and thus s's return value overwritten by the LEAVE phaser is irrelevant.

So yes, a return in a LEAVE would change the return value of the sub it's located at. Callers and their return values must not be affected (other by what value they get from the call).

@raiph

This comment was marked as resolved.

@patrickbkr
Copy link
Member Author

@raiph Thanks for your feedback!

In particular, my understanding is that, when you wrote:

From all I've seen in this discussion I'm pretty confident to say that lastest win semantics are effectively non-composable and the only safe move is this to never use return in LEAVE. We'd be better off prohibiting return in LEAVE.

You weren't talking about option 1.

Correct, I was talking about option 4.

This is consistent with what I think you've arrived at, which is that they're ignored unless the return value comes from combining both of these:

* An explicit `&return` in a `LEAVE`.

* That `&return` mattering because the returned value matters. (Which it doesn't in some scenarios -- as @niner wrote "The LEAVE phasers in `s1` would still win and change the return value. But it's `s1`'s return value that they change which is in both cases just ignored!".)

I'd appreciate you commenting on this second half of my comment if it seems wrong to you.

I believe we agree here.

@patrickbkr
Copy link
Member Author

patrickbkr commented May 26, 2024

We've finally reached consensus. Accordingly I have just removed draft marker from the PR implementing the desired behavior (MoarVM#1788).

For completeness sake, there are four PRs related to this issue:

I'd like to get all four of them merged in the not too far future.

patrickbkr pushed a commit to patrickbkr/roast that referenced this issue May 28, 2024
patrickbkr pushed a commit to patrickbkr/MoarVM that referenced this issue Jun 1, 2024
Calling &return triggers a stack unwind which can cause code (e.g. in
`LEAVE` phasers) to run. That code can also call `&return`. In such
situations one usually wants the return to win that unwinds more of the
stack. This is the behavior implemented by this change. Previously the
later return always won.

Related to Raku/problem-solving#417
patrickbkr added a commit to patrickbkr/MoarVM that referenced this issue Jun 1, 2024
Calling &return triggers a stack unwind which can cause code (e.g. in
`LEAVE` phasers) to run. That code can also call `&return`. The returns
conflict, if the later return unwinds over the return continuation
marker of the former return in the call stack, i.e. the `LEAVE` frame.
In such situations one wants the return to win that unwinds more of the
stack. This is the behavior implemented by this change. Previously the
later return always won.

This implements the behavior discussed in Raku/problem-solving#417.
@patrickbkr
Copy link
Member Author

I believe the interactions with other exception types need some thought as well. I guess &return in LEAVE shouldn't hijack in progress exception throws, right?

Intuitively I'd say that if a non-return exception is in progress, a return in LEAVE should never take over.

So is the distinction return exception <-> any other exception sufficient? Or are there other exception types that need considering as well?

@raiph

This comment was marked as off-topic.

@raiph

This comment was marked as off-topic.

@raiph
Copy link

raiph commented Jun 4, 2024

Have you read the original design doc verbiage related to this?

Excerpts from the last part of the Phasers section in S04:

Except for CATCH and CONTROL phasers, which run while an exception is looking for a place to handle it, all block-leaving phasers wait until the call stack is actually unwound to run. ... The stack is unwound and the phasers are called only if an exception is not resumed.

So LEAVE phasers for a given block are necessarily evaluated after any CATCH and CONTROL phasers. This includes the LEAVE variants, KEEP and UNDO. POST phasers are evaluated after everything else, to guarantee that even LEAVE phasers can't violate postconditions. ...

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks (also see "Definition of Success"). How this information is made available is implementation dependent.

An exception thrown from an ENTER phaser will abort the ENTER queue, but one thrown from a LEAVE phaser will not. ...

If a POST fails or any kind of LEAVE block throws an exception while the stack is unwinding, the unwinding continues and collects exceptions to be handled. When the unwinding is completed all new exceptions are thrown from that point. ...

The topic of the block outside a phaser is still available as OUTER::<$_>. Whether the return value is modifiable may be a policy of the phaser in question. In particular, the return value should not be modified within a POST phaser, but a LEAVE phaser could be more liberal.

@patrickbkr
Copy link
Member Author

Idea: Let's distinguish CONTROL exceptions from CATCH exceptions. That's an already well established distinction.

Given the word "conflict" means: An exception thrown within a phaser during an unwind (a LEAVE) wants to unwind out of that phasers frame.

Then I propose the following rules:

  • A CATCH exception is in progress, a CONTROL exception conflicts. -> We ignore the CONTROL exception and continue with the CATCH exceptions unwind.
  • A CONTROL exception is in progress, a CONTROL exception conflicts. (That's what has been discussed at length in this issue.) -> We compare the target frames of the two exception handlers. The one unwinding more of the stack wins. If it's a tie, the later one (the one originating from within the phaser) wins. (This behavior is implemented in Return prioritization 3rd try MoarVM/MoarVM#1812.)
  • A CONTROL or CATCH exception is in progress, a CATCH exception conflicts. -> We add the exception to a list (located in the frame the current unwind targets). After the unwind finishes and before we run the handler, we throw all the collected exceptions. That's what S04 is proposing.

@raiph
Copy link

raiph commented Jun 5, 2024

@patrickbkr

Given the word "conflict" means: An exception thrown within a phaser during an unwind (a LEAVE) wants to unwind out of that phasers frame.

I imagine your initial goal is to decide what to do for a LEAVE phaser in the scenarios you discuss.

Then, once you're confident there's core dev consensus for the behavior of a LEAVE phaser, you'll consider whether the behavior (for the same classes of conflicts that are identified for LEAVE -- presumptively the 3 classes you describe) are the same for the other LEAVE queue phasers (UNDO and KEEP).

  • A CONTROL or CATCH exception is in progress, a CATCH exception conflicts. -> We add the exception to a list (located in the frame the current unwind targets). After the unwind finishes and before we run the handler, we throw all the collected exceptions. That's what S04 is proposing.

Sounds sensible to me.

  • A CATCH exception is in progress, a CONTROL exception conflicts. -> We ignore the CONTROL exception and continue with the CATCH exceptions unwind.

The keywords listed in the doc as raising control exceptions are return, fail, redo, next, last, done, emit, take, warn, proceed and succeed.

To test my understanding of your proposal, I think the list of keywords whose corresponding control exceptions can't "conflict" as you've defined are something like return (!), redo, next, last, done, emit, proceed and succeed. Right? (When I say "can't" I mean I'm thinking things like they either logically can't, or can only do so given absurdist DIHWIDT coding that I'm not even sure is possible. I've not really thought through done and emit because I don't want to think about them. ;))

And presumably the ones that could "conflict" (though again, perhaps DIHWIDT applies) are at least take and fail. (What would warn do?)


I also wonder if the following should be considered before deciding what to do about these "conflicts":

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks ...

I mention that because I'm thinking that if that information has been made available to a LEAVE (is that already the case in Rakudo?) then that might alter the picture.

(Though perhaps one would just write separate KEEP and UNDO phasers if the picture is altered enough to matter?)

@patrickbkr
Copy link
Member Author

To test my understanding of your proposal, I think the list of keywords whose corresponding control exceptions can't "conflict" as you've defined are something like return (!), redo, next, last, done, emit, proceed and succeed. Right? (When I say "can't" I mean I'm thinking things like they either logically can't, or can only do so given absurdist DIHWIDT coding that I'm not even sure is possible. I've not really thought through done and emit because I don't want to think about them. ;))

A simple example of a conflicting return:

sub s() {
    LEAVE {
        return 1 # <- wants to unwind to s()'s exit handler, that's outside of the `LEAVE` scope, conflicts
    }
    die "Sad"
}
say s();

If exit phasers are running as a result of a stack unwind initiated by an exception, this information needs to be made available. In any case, the information as to whether the block is being exited successfully or unsuccessfully needs to be available to decide whether to run KEEP or UNDO blocks ...

I mention that because I'm thinking that if that information has been made available to a LEAVE (is that already the case in Rakudo?) then that might alter the picture.

(Though perhaps one would just write separate KEEP and UNDO phasers if the picture is altered enough to matter?)

Actually I know little about phasers other than LEAVE. I kind of hoped to be able to evade having to dig into all those phasers to find a sensible solution to these issues. I'm probably naive.

One more unsolved issue: There are exceptions that can resume, some always do, e.g. emit. Intuitively I'd expect for an exception that "comes back" to continue to work in LEAVE irrespective whether a CATCH or CONTROL exception caused the LEAVE. But you can't know before hand whether an exception handler resumes or not. Maybe the solution is to forbid / ignore conflicting resumable CONTROL exceptions altogether.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language Changes to the Raku Programming Language
Projects
None yet
Development

No branches or pull requests

5 participants