-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustc_mir: add a pass for fragmenting locals into their fields (aka SROA). #48300
Conversation
This comment has been minimized.
This comment has been minimized.
3a19c94
to
9a18264
Compare
@bors try |
rustc_mir: add a pass for splitting locals into their fields (aka SROA). **DO NOT MERGE**: based on #48052.
💔 Test failed - status-travis |
9a18264
to
49f1dda
Compare
@bors try |
⌛ Trying commit 49f1ddaec0f5c745909c5797cc030eb5448473db with merge d2592cc1fd6fd56b975206a2d1c3e4f4978be3a6... |
☀️ Test successful - status-travis |
49f1dda
to
633d39b
Compare
This comparison is a bit more accurate (this PR is rebased on top of #48052). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty nifty. Left a bunch of requests, mostly for comments. Will take another pass after that.
} | ||
|
||
impl<'tcx> LocalPathCollector<'tcx> { | ||
fn place_path(&mut self, place: &Place<'tcx>) -> Option<&mut LocalPath<'tcx>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to have a comment on this function -- even better if it can reference some kind of running example explaining what this optimization does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this setup is pretty elegant. I was getting worried about calls to make_opaque
coming later and destroying the LocalPath
instances that are returned by this function -- but of course the borrow checker prevents us from relying on them...
use syntax_pos::Span; | ||
use transform::{MirPass, MirSource}; | ||
|
||
pub struct SplitLocalFields; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be great to have a doc comment here that explains what this optimization does and gives some examples. Ideally, those examples can be referenced from comments later on.
first..mir.local_decls.len() | ||
} | ||
} | ||
}).collect::<IndexVec<Local, _>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have found it helpful to write the Range<Local>
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or -- better -- add a comment atop the let
like:
// Map each local variable X to a range of locals L1..L2 that replaces it.
// If the local is opaque, then this will just be `X..X+1`.
// If the local is being "unrolled", then it will be a range of variables, one for each unrolled field.
let replacements = ...;
} | ||
drop(replacements); | ||
|
||
// Lastly, replace all the opaque paths with their new locals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok, I think I get it. The "leaves" are always opaque...
None | ||
} | ||
|
||
fn split_into_locals(&mut self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment plz. My take:
Given some local variable base_decl
that is to be unrolled, creates N local variables -- one for each opaque leaf path reachable from base_decl
. These are appending to local_decls
.
} | ||
} | ||
|
||
fn project(&mut self, elem: &PlaceElem<'tcx>, variant_index: usize) -> Option<&mut Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment plz -- this seems to be a pretty key function. My take:
Tries to "unroll" P
and create a local variable for its projection F
. If P
is opaque, or P
cannot be unrolled, then returns None. Otherwise, returns Some
with the LocalPath
representing P.F
(this path may in turn be unrolled or made opaque).
During the first phase, where we determine what to unroll, we invoke this function on each path that we see being used. For example, if we see a MIR statement like Use(a.b.c)
, we would invoke "project" b
from a
, and then c
from that result.
Note that a projection can succeed at first, but then later have some base path made opaque, leading to a failure if the projection is re-attempted. For example, if we later saw a use of a.b
, then it would be made opaque, and attempting to project c
again would return None
.
} | ||
|
||
impl<'a, 'tcx> LocalPathReplacer<'a, 'tcx> { | ||
fn replace(&mut self, place: &mut Place<'tcx>) -> Option<&mut LocalPath<'tcx>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment plz =)
fb0c019
to
229f45b
Compare
@eddyb what's up with this code? |
Ping from triage, @eddyb ! We haven't heard from you in a while; will you be able to address some of the review comments soon? |
Closing this PR for inactivity. @eddyb please reopen this if you have some time to finish the PR! |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
☀️ Try build successful - checks-azure |
Queued 4efa7e626ffbc8d7ea613736e9e03c3809e95c9b with parent edb3684, future comparison URL. |
// conservative, until such uses have been definitely deemed UB. | ||
if context.is_borrow() { | ||
self.locals[place.local].make_opaque(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: this is not catching Rvalue::AddressOf
(which has a different context
).
This could've been caught by tests, if we had &raw
versions of ui/mir/mir_raw_fat_ptr.rs
and ui/raw-fat-ptr.rs
(which are the tests that failed before I added this special-case).
I doubt I'll do that (adding tests) in this PR, but maybe other people are interested.
cc @matthewjasper @RalfJung
Also I'm worried we might have code elsewhere that isn't handling &raw
correctly, if it relies on either Borrow
contexts (by pattern-matching or is_borrow
) or Rvalue::Ref
.
cc @ecstatic-morse @oli-obk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified that Rvalue::AddressOf
was handled correctly in const checking via explicit match in the PR that introduced &raw
. It seems clear to me that AddressOf
should not be its own PlaceContext
, but should instead map to SharedBorrow
and Borrow
. I can't say whether other passes in the compiler are matching on AddressOf
correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't mind there being something like BorrowOrAddrOf
and is_borrow_or_addr_of
if we want to be super explicit, but otherwise I think I agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we had &raw versions of ui/mir/mir_raw_fat_ptr.rs and ui/raw-fat-ptr.rs
Those two tests look almost identical (the former just has some more stuff)... any idea why we have that duplication? One was clearly created by copy-pasting the other, but there are no comments, unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it has mir in the name it used to have an opt-in to force mir codegen back when we had two modes (and -Zorbit
). I agree we should dedup, cc @Centril
Finished benchmarking try commit 4efa7e626ffbc8d7ea613736e9e03c3809e95c9b, comparison URL. |
For syn-opt the base and patched incremental benchmarks spent less time in ThinLTO, while the inflate-opt patched incremental benchmark spent more time in ThinLTO |
@bjorn3 So, it's just noise then? That run shouldn't change anything (compared to I am going to make a PR with just the first two commits, in case the order the LLVM |
I think so. |
5d5bb62
to
b1133d3
Compare
var_debug_info.source_info.span, | ||
"FIXME: implement fragmentation for {:?}", | ||
var_debug_info, | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is possible to hit without MIR inlining (because only this pass can create VarDebugInfoContents::Composite
, so it must've run in a callee that got inlined), but I need to implement it before landing anyway.
(gather_debug_info_fragments
is already set up for this, I just forgot to do it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to have a couple of new mir-opt
tests which capture the before & after output of this pass especially the handling of SetDiscriminant
for enums.
add raw-addr-of variant to mir_raw_fat_ptr As suggested at #48300 (comment) r? @eddyb
codegen: create DIVariables ahead of using them with llvm.dbg.declare. Instead of having `rustc_codegen_llvm` bundle creation of a `DIVariable` and `llvm.dbg.declare` into a single operation, they are now two separate methods, and the `DIVariable` is created earlier, specifically when `mir::VarDebugInfo`s are being partitioned into locals. While this isn't currently needed, it's a prerequisite for #48300, which adds fragmented debuginfo, where one `mir::VarDebugInfo` has multiple parts of itself mapped to different `mir::Place`s. For debuggers to see one composite variable instead of several ones with the same name, we need to create a single `DIVariable` and share it between multiple `llvm.dbg.declare` calls, which are likely pointing to different MIR locals. That makes the `per_local_var_debug_info` partitioning a good spot to do this in, as we can create *exactly* one `DIVariable` per `mir::VarDebugInfo`, and refer to it as many things as needed. I'm opening this PR separately because I want to test its perf impact in isolation (see #48300 (comment)). r? @nagisa or @oli-obk cc @michaelwoerister @nikomatsakis
☔ The latest upstream changes (presumably #68665) made this pull request unmergeable. Please resolve the merge conflicts. |
Triaged |
Closing this after a discussion with the author. |
In order to make other MIR optimizations more effective and/or easier to implement, this pass breaks up ("fragments") aggregate locals into smaller locals, ideally leaf fields of primitive or generic types.
This roughly corresponds to LLVM's SROA ("Scalar Replacement of Aggregates"), although "scalar" is less applicable here, as MIR doesn't distinguish between register-like SSA and memory.
Locals are fragmented only when all accesses are directly through field/downcast projections, so that the number of statements is unchanged (ignoring
Storage{Live,Dead}
ones).For example,
x = y;
isn't transformed tox.f = y.f;
for each field, instead it completely preventsx
andy
from being split up in any way.Variable debuginfo is always preserved by this pass, by transforming it into a "composite" form, which maps pieces of an user variable to independent
Place
s.That corresponds to DWARF's
DW_OP_piece
"composition operator", which LLVM exposes as a more "random-access"DW_OP_LLVM_fragment
(as eachllvm.dbg.declare
can only point to onealloca
), indicating what byte range of the debugger-facing variable is being declared.However,
enum
s can't be broken up into their discriminant and variant fields, if any debuginfo refers to them, as there is more than one possible memory location for theenum
bytes after the discriminant, and the debugger would have to inspect the discriminant to use the right variant's fragmented fields.And LLVM doesn't currently (and couldn't easily AFAICT) support the more advanced DWARF features which would let us check the discriminant and use different locations based on it.
(While there is a workaround, namely faking type debuginfo for such cases, so the variants appear laid out like in a tuple instead of overlapping, it's complex enough that I don't want to tackle it in this PR)
As an example, this small snippet:
currently produces this MIR (after deaggregation, and some details omitted):
but after this PR, the pair is replaced with two separate locals:
The only thing left of the original pair is the debuginfo, which describes how a debugger can reconstruct the
pair
user variable, by combining the contents of_3
and_4
.More concretely, the debugger sees a single
pair: (i32, i32)
variable with contents drawn from:_3
, for bytes0..4
_4
, for bytes4..8
But outside of a debugger, those two halves are completely independent.
r? @nikomatsakis cc @rust-lang/wg-mir-opt @michaelwoerister
TODO: address review comments (about missing code comments, and some notes to self)