Add a `RewriteControlFlow` phase (`return`/`continue`/`break`) #393

W95Psp · 2023-12-20T13:52:33Z

We want a phase that eliminates explicit control flow (say a return or a break) by moving or duplicating code.

We want to rewrite the control flow so that every return, break or continue always appears in a return position in our AST. Returns in return positions can be dropped: e.g. if ... then {return 3;} else {return 5;} can be simplified in if ... then {3} else {5}.

rewrite control flow for returns appearing outside of a loop (this was PR Add RewriteControlFlow phase. #907);
rewrite control flow for return, break and continue within a loop.

See Details below for a lengthy explanation.

Details

Question marks (but also return and friends) used to be transformed in a monadic style.

In this style, computations are promoted to a Result or Option monad, and pure and bind nodes are inserted heavily.

For instance, let's consider the following function:

fn test(x: Option<u8>) -> Option<u8> {
    Some(f(x?) + 4)
}

This would be transformed to this monadic F* function, where (let*) is basically a bind node in the option monad:

let test (x: option u8) : option u8 =
  run_monad (
    let* hoist1:u8 = x in
      Some
      (let hoist2:u8 = f hoist1 in
        let hoist3:u8 = hoist2 + 4 in
        Some hoist3 <: option u8)
      <:
      option (option u8))

While this is a correct translation, lifting everything to a monad and running the monad is a bit heavy.

Instead, the control flow of the Rust function test can be rewritten into:

match x {
   Some(x) => Some(f(x) + 4),
   None => None
}

Rewriting control flow is about pushing ?s to return positions. When ? appears deep in a if/match node itself nested in a sequential expression (e.g. a sequence or a function call), then such rewriting can result in code duplication. Example:

fn fun(cond: bool, x: Option<u8>) -> Option<u8> {
    let b = x + 10;
    let a = if b > 20 { x? + long_code } else { 0 };
    let b = other_long_code;
    Some(a + b + 3)
}

fn fun_rewrite_step(x: Option<u8>) -> Option<u8> {
    let b = x + 10;
    if b > 20 {
        let a = x? + long_code;
        let b = other_long_code;
        Some(a + b + 3)
    } else {
        let b = other_long_code;
        Some(a + b + 3)
    };
}

fn fun_rewrited(x: Option<u8>) -> Option<u8> {
    let b = x + 10;
    if b > 20 {
        match x {
            Some(x) => {
                let a = x? + long_code;
                let b = other_long_code;
                Some(a + b + 3)
            }
            None => None,
        }
    } else {
        let b = other_long_code;
        Some(a + b + 3)
    }
}

The text was updated successfully, but these errors were encountered:

W95Psp · 2024-04-30T07:29:00Z

PR #559 made some progress toward this issue, but we need to improve it to close this.

maximebuyse · 2024-09-19T11:57:49Z

So if I understand well, to complete this issue we need this RewriteControlFlow phase that would come before DropNeedlessReturn and essentially deal with returns inside ifs (potentially duplicating some code).

Let's consider (a simplified version of) the example from the issue description:

fn question_mark(c: bool, x: Option<i32>) -> Option<i32> {
    let a = if c {x? + 10} else {0};
    let b = 20;
    Some(a + b)
}

And an example written with a return inside two ifs:

fn test(c1: bool, c2: bool) -> i32 {
    if c1 {
        if c2 {
            return 1;
        }
    }
    2
}

They should be rewritten respectively as:

fn question_mark_equivalent(c: bool, x: Option<i32>) -> Option<i32> {
    if c {
      let a = x? + 10;
      let b = 20;
      Some(a + b)
    } else {
       let a = 0;
       let b = 20;
       Some(a + b)
    };
    
}

fn test_equivalent(c1: bool, c2: bool) -> i32 {
    if c1 {
        if c2 {
            return 1;
        } else {
            2
        }
    } else {
        2
    }
}

The crucial rule to rewrite like this would be recognizing patterns like:

let lhs = if c then a (else b);
e

and rewrite it to:

if c {
  let lhs = a;
  e
} else {
 let lhs =  b; 
  e
}

In case there is no else originally we would insert it.
And this would be done recursively, only if there is a return statement somewhere in one of the branches of the if.

To avoid recursing twice, we would need to pass the following expression top-down in the recursive calls (in an argument) and propagate the information of whether there is a return statement bottom-up (in the return value of the recursive function). We should also take care to avoid inserting an expression after a return statement (otherwise it would not be simplified by drop_needless_returns.

maximebuyse · 2024-09-19T15:57:44Z

We also need another phase (or a second visitor inside drop_needless_returns) to rewrite

let a = return b;
c

as

return b

As they can be nested it needs its own visitor.

franziskuskiefer · 2024-09-25T10:13:03Z

I thought #907 is only partially implementing this?

maximebuyse · 2024-09-25T10:39:40Z

I thought #907 is only partially implementing this?

We have a lot of issues around this. My understanding is that #907 implements this one. But it is only a part of this meta-issue: #15 . And the main part that is now missing is #196

franziskuskiefer · 2024-09-25T10:52:39Z

ok, @W95Psp please clean up and close what's done 🙏🏻

maximebuyse · 2024-10-02T15:30:41Z

I am designing the part about loops. First attempt:

Apply the current RewriteControlFlow phase inside loops bodies but treat break and continue like return. After that, all of them should be in "return position" in the loop body.
Add an enum in Rust_primitives.Hax with 3 variants (break, continue and return) each carrying a value (the enum should be polymorphic to allow carrying values of any type)
Modify the body of loops to replace each of these 3 statements by its corresponding variant from the enum. In the case of break and continue the value they should contain is the accumulator (loops are translated as a fold). A value in return position without any of these statements should be wrapped in a continue
Replace the whole body of the loop by a pattern matching over the accumulator that executes one more iteration only if the previous gave a continue, and just pass along a break or return
Wrap the whole fold inside a pattern matching to extract the result. in case we find the result wrapped in a break or continue we should just unwrap it. In case we find a return we should return, that means we should also treat this new pattern matching with the current RewrtieControlFlow phase to make sure this return is treated like the others outside loops (of course we can analyze first to know if the body of the loop contains a return or not).

For all this to work well, I suspect that we should treat the body of the loop after phase LocalMutation. It might be better also for the rest of RewriteControlFlow. This means we should also move DropNeedlessReturns but there might be a reason they come in the other order.

maximebuyse · 2024-10-03T08:43:37Z

I am designing the part about loops. First attempt:

* Apply the current `RewriteControlFlow` phase inside loops bodies but treat `break` and `continue` like `return`. After that, all of them should be in "return position" in the loop body.

* Add an enum in `Rust_primitives.Hax` with 3 variants (`break`, `continue` and `return`) each carrying a value (the enum should be polymorphic to allow carrying values of any type)

* Modify the body of loops to replace each of these 3 statements by its corresponding variant from the enum. In the case of `break` and `continue` the value they should contain is the accumulator (loops are translated as a `fold`). A value in return position without any of these statements should be wrapped in a `continue`

* Replace the whole body of the loop by a pattern matching over the accumulator that executes one more iteration only if the previous gave a `continue`, and just pass along a `break` or `return`

* Wrap the whole `fold` inside a pattern matching to extract the result. in case we find the result wrapped in a `break` or `continue` we should just unwrap it. In case we find a `return` we should return, that means we should also treat this new pattern matching with the current `RewrtieControlFlow` phase to make sure this `return` is treated like the others outside loops (of course we can analyze first to know if the body of the loop contains a `return` or not).

For all this to work well, I suspect that we should treat the body of the loop after phase LocalMutation. It might be better also for the rest of RewriteControlFlow. This means we should also move DropNeedlessReturns but there might be a reason they come in the other order.

A few additions:

I think we should rewrite the control flow in RewriteControlFlow, and also remove everything that may come after a break or continue (like we already do for return). Then LocalMutation will possibly introduce expressions after them that we can reuse to know the value we should pass.
We might want, instead of inserting a pattern matching in the loop body, to define a specific fold that would do the pattern matching. We will still need the pattern matching on the result of the loop (in case there is a return, otherwise we could also have a specific fold that would extract the result out of its break or continue wrapper).
In case there are continue but no break or return we don't need any wrapping
In the case of nested loops, break and continue may contain a label that allows to choose which loop they refer to. I think we should not support that (at least for now).

maximebuyse · 2024-10-03T15:59:32Z

Here is an example of the rewriting:

fn f (v: &[i32]) -> i32 {
    let mut sum = 0;
    for i in v {
        if *i < 0 {
            return 0
        }
        sum += *i;
    }
    sum *= 2;
    sum
}

After the transformation (here with a classic fold but we will define a specific fold that will do this internal pattern matching on the accumulator, and we will also use std::ops::ControlFlow instead of this CF enum):

fn double_sum_eq (v: &[i32]) -> i32 {
    let sum = 0;
    match (
        v.fold(|sum, i| {
            match sum {
                CF::Continue(sum) => {
                    if *i < 0 {
                        CF::Return(0)
                    } else {
                        let sum = sum + *i;
                        CF::Continue(sum)
                    }
                },
                _ => sum
            }
            
        },
        sum
        )
    ) {
        CF::Return(v) => v,
        CF::Break(v) | CF::Continue(v) =>
         {
            let sum = v;
            let sum = sum * 2;
            sum
         }
    }
}

franziskuskiefer · 2024-10-17T06:17:39Z

@maximebuyse @W95Psp what's the state here?

W95Psp · 2024-10-17T06:24:42Z

We wanted to work on that together in CC, but that will not happen this week (I thought I could do CC today, but after being awake for a bit, I think it's wiser to work slowly from home and recover).

@maximebuyse correct me if I'm wrong, but I think the PR (#988) is mostly ready, only the F* parts are missing.

Thus there is two things to do:

clean up / make sure the PR is up-to-date and ready in terms of engine phases (@maximebuyse I think you were not certain about the order of phases?);
write F*.

maximebuyse · 2024-10-17T07:15:24Z

We wanted to work on that together in CC, but that will not happen this week (I thought I could do CC today, but after being awake for a bit, I think it's wiser to work slowly from home and recover).

@maximebuyse correct me if I'm wrong, but I think the PR (#988) is mostly ready, only the F* parts are missing.

Thus there is two things to do:
* [ ]  clean up / make sure the PR is up-to-date and ready in terms of engine phases ([@maximebuyse](https://github.com/maximebuyse) I think you were not certain about the order of phases?);[ ]  write F*.

I am quite happy with the order of phases now. But for sure there will be some cleanup to do, and we might change a bit the code we emit while writing the f*.

franziskuskiefer · 2024-10-25T14:08:47Z

This introduced a regression https://github.com/inria-prosecco/circus-green/actions/runs/11510030233

franziskuskiefer · 2024-10-31T15:54:34Z

#1056

github-project-automation bot added this to hax Dec 20, 2023

W95Psp added the engine Issue in the engine label Dec 20, 2023

W95Psp mentioned this issue Dec 20, 2023

Meta issue: support return/continue/?/break #15

Closed

7 tasks

W95Psp added the P1 Max priority label Dec 21, 2023

jschneider-bensch mentioned this issue Jan 24, 2024

Add infrastructure for translating known functions. #452

Merged

W95Psp changed the title ~~Better support for question marks~~ Add a "RewriteControlFlow" phase (?/return/continue/break...) Jan 29, 2024

W95Psp self-assigned this Jan 29, 2024

W95Psp mentioned this issue Mar 12, 2024

Question mark as early returns #559

Merged

maximebuyse assigned maximebuyse and unassigned W95Psp Sep 19, 2024

maximebuyse moved this to In Progress in hax Sep 19, 2024

franziskuskiefer removed this from hax Sep 20, 2024

maximebuyse mentioned this issue Sep 23, 2024

Add RewriteControlFlow phase. #907

Merged

W95Psp closed this as completed in #907 Sep 25, 2024

franziskuskiefer reopened this Sep 25, 2024

W95Psp changed the title ~~Add a "RewriteControlFlow" phase (?/return/continue/break...)~~ Add a "RewriteControlFlow" phase (return/continue/break) Oct 2, 2024

W95Psp changed the title ~~Add a "RewriteControlFlow" phase (return/continue/break)~~ Add a RewriteControlFlow phase (return/continue/break) Oct 2, 2024

franziskuskiefer mentioned this issue Oct 10, 2024

Monadic transformation is broken #822

Open

2 tasks

maximebuyse mentioned this issue Oct 10, 2024

feat(engine) Rewrite control flow inside loops. #988

Merged

W95Psp closed this as completed in #988 Oct 24, 2024

franziskuskiefer reopened this Oct 25, 2024

franziskuskiefer mentioned this issue Oct 25, 2024

Regression from #393 - ? not working #1044

Closed

franziskuskiefer closed this as completed Oct 31, 2024

franziskuskiefer mentioned this issue Oct 31, 2024

Question mark in a closure gives an error #1089

Closed

maximebuyse mentioned this issue Nov 6, 2024

Increase test coverage for control flow rewrites (return, break, continue, ?) #1103

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a `RewriteControlFlow` phase (`return`/`continue`/`break`) #393

Add a `RewriteControlFlow` phase (`return`/`continue`/`break`) #393

W95Psp commented Dec 20, 2023 •

edited

Loading

W95Psp commented Apr 30, 2024

maximebuyse commented Sep 19, 2024 •

edited

Loading

maximebuyse commented Sep 19, 2024

franziskuskiefer commented Sep 25, 2024

maximebuyse commented Sep 25, 2024

franziskuskiefer commented Sep 25, 2024

maximebuyse commented Oct 2, 2024

maximebuyse commented Oct 3, 2024

maximebuyse commented Oct 3, 2024

franziskuskiefer commented Oct 17, 2024

W95Psp commented Oct 17, 2024

maximebuyse commented Oct 17, 2024

franziskuskiefer commented Oct 25, 2024

franziskuskiefer commented Oct 31, 2024

Add a RewriteControlFlow phase (return/continue/break) #393

Add a RewriteControlFlow phase (return/continue/break) #393

Comments

W95Psp commented Dec 20, 2023 • edited Loading

W95Psp commented Apr 30, 2024

maximebuyse commented Sep 19, 2024 • edited Loading

maximebuyse commented Sep 19, 2024

franziskuskiefer commented Sep 25, 2024

maximebuyse commented Sep 25, 2024

franziskuskiefer commented Sep 25, 2024

maximebuyse commented Oct 2, 2024

maximebuyse commented Oct 3, 2024

maximebuyse commented Oct 3, 2024

franziskuskiefer commented Oct 17, 2024

W95Psp commented Oct 17, 2024

maximebuyse commented Oct 17, 2024

franziskuskiefer commented Oct 25, 2024

franziskuskiefer commented Oct 31, 2024

Add a `RewriteControlFlow` phase (`return`/`continue`/`break`) #393

Add a `RewriteControlFlow` phase (`return`/`continue`/`break`) #393

W95Psp commented Dec 20, 2023 •

edited

Loading

maximebuyse commented Sep 19, 2024 •

edited

Loading