Skip to content

Wishlist: MIR-level integer range analysis #76579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
1 task
moonheart08 opened this issue Sep 10, 2020 · 8 comments
Open
1 task

Wishlist: MIR-level integer range analysis #76579

moonheart08 opened this issue Sep 10, 2020 · 8 comments
Labels
A-mir-opt Area: MIR optimizations T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@moonheart08
Copy link
Contributor

This is a tracking issue for MIR-level integer range analysis.
I'm very new to this, and there's no associated PR (yet!) nor RFC, so apologies in advance for any issues or mistakes.

Description

Integration of integer range analysis into the MIR will allow the compiler to, on a best effort basis, understand roughly the range of values any particular expression can take on.

let x: u8 = ...; // VRange(0, 255)
let b = x / 2; // VRange(0, 127)
assert!(b < 128); // Elideable, rustc understands b cannot be larger than 128.
                  // Removal of assert can be done at a MIR level instead of by LLVM!
let c = b << 2; // VRange(0, 255). Could theoretically be recognized as VRange(0, 0) + VRange(4, 255) at the cost of extra complexity.
assert!(c != 3); // Theoretically elidable, more complicated in practice, but plenty doable.

Steps

  • Initial implementation (gated)

Unresolved Questions

  • How much should the compiler attempt to do? There is, theoretically, a very large number of situations it could recognize.
  • Should there be some way to access this best effort range data? It has it's uses, but the fact that this information is best-effort and not consistent would need special emphasis. Either way, an intrinsic in core would be helpful for writing unit tests.

@oli-obk per request.

@moonheart08 moonheart08 added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Sep 10, 2020
@jonas-schievink jonas-schievink added A-mir-opt Area: MIR optimizations T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 10, 2020
@leonardo-m
Copy link

Integer range analysis will get better once we have ranged integers in the Rust type system (similar to Ada ones, for for integral values only).

@oli-obk
Copy link
Contributor

oli-obk commented Sep 10, 2020

that's for the type system, and while that will make it easier for optimizations to handle such things, a basic version just as an optimization will give benefits to all existing cases

cc @rust-lang/wg-mir-opt

@moonheart08
Copy link
Contributor Author

I opened this issue as i have a decent level of interest in trying to produce an initial implementation. @oli-obk You said you would write something up. Once you do make sure to mention me.

@oli-obk
Copy link
Contributor

oli-obk commented Sep 11, 2020

I think we can start out with the integer range analysis being its own MIR pass and only later integrating it with const propagation. I'd put the new MIR pass right after const propagation, so we get a good starting point from which to run said analysis.
For creating a new MIR pass I wrote a quickstart guide on zulip (that we should probably put into the rustc guide): https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/multiple.20return.20terminators/near/203699347

The MIR dataflow framework resides in https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/index.html

The relevant traits are AnalysisDomain and Analysis.

AnalysisDomain specifies the datastructures and join operation (via https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/dataflow/lattice/trait.JoinSemiLattice.html) that your analysis will use, while the Analysis will perform the actual operations (e.g. copying for assignments, or moving the legal range on additions and similar).

For the only example where we use this kind of analysis in rustc look at

type Domain = lattice::Dual<BitSet<MovePathIndex>>;
which, when you follow the lattice::Dual type should give you some examples on how to implement the appropriate traits for whatever datastructure you choose for representing the integer range. Since you need to track the integer range for each local, you're going to need something like FxHashMap<Local, YourState>

In general, I would expect that the MIR optimization would create your analysis, for which both the analysis traits and the https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/visit/trait.MutVisitor.html trait are implemented. You then implement the visitor's methods for the sites you want to check for possibly mutating it. I think as an MVP we should just look at Terminator where the kind is SwitchInt. You then fast forward the dataflow analysis to the current location and check its status, if that status allows eliminating targets from the terminator, then you drop them.

A test example that would work here is

fn foo(x: bool) -> &'static str {
    match x as u8 {
        0 => "null",
        1 => "one",
        2 => "meh",
        _ => "muh",
    }
}

The final mir should only contain the 0 and 1 arms.

cc @moonheart08

cc @ecstatic-morse in case I'm talking nonsense somewhere

@ecstatic-morse
Copy link
Contributor

ecstatic-morse commented Sep 11, 2020

It would be good to look at lattice::FlatSet, since it's the only (interesting) lattice that doesn't use set inclusion as the ordering operator at the moment (your Range lattice will be the second). Also, IndexVec<Local, T> implements JoinSemiLattice as long as T does. If you want to do the same for FxHashMap, just have a look at that impl.

Also, there was some brief discussion about convergence on Zulip (search for "widening"), since a lattice on set of all possible integer ranges is quite tall . There's no one solution to this, so you'll need to pick an approach.

@camelid
Copy link
Member

camelid commented Jan 23, 2021

For creating a new MIR pass I wrote a quickstart guide on zulip (that we should probably put into the rustc guide): https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/multiple.20return.20terminators/near/203699347

@oli-obk Did that ever end up in the dev-guide? If not, I can try to put it there :)

@oli-obk
Copy link
Contributor

oli-obk commented Jan 23, 2021

Not as far as I know. That would be great! Thanks.

@pnkfelix
Copy link
Member

pnkfelix commented Oct 14, 2022

Visiting for T-compiler steering meeting backlog bonanza.

This is tagged C-tracking-issue, but it is not a tracking issue in the sense that that label is meant for. (The C-tracking-issue label is meant to tag issues that we want ongoing knowledge of their current state, for e.g. coordination within the team or across the project.)

Removing label.

@rustbot label: -C-tracking-issue

@rustbot rustbot removed the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Oct 14, 2022
@pnkfelix pnkfelix changed the title Tracking issue for MIR-level integer range analysis Wishlist: MIR-level integer range analysis Oct 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants