-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
interpret: Unify projections for MPlaceTy, PlaceTy, OpTy #114011
Conversation
The Miri subtree was changed cc @rust-lang/miri Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri |
@@ -479,6 +480,16 @@ pub trait EvalContextExt<'mir, 'tcx: 'mir>: crate::MiriInterpCxExt<'mir, 'tcx> { | |||
self.ecx | |||
} | |||
|
|||
fn aggregate_field_order(memory_index: &IndexVec<FieldIdx, u32>, idx: usize) -> usize { | |||
// We need to do an *inverse* lookup: find the field that has position `idx` in memory order. | |||
for (src_field, &mem_pos) in memory_index.iter_enumerated() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is unfortunate, it makes aggregate traversal n^2 in the number of fields. But at least this only affects SB (and I doubt it is a limiting factor there). I didn't find a way to avoid this without introducing overhead for the common case (where we don't care about the order). The only nice way I came up with required return-position impl trait
in a trait.
This comment has been minimized.
This comment has been minimized.
Ok(()) | ||
} | ||
|
||
fn visit_aggregate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't entirely understand why we hooked visit_aggregate
for these optimizations, they seem entirely implementable by hooking visit_value
...
if base.layout().is_sized() { | ||
// An unsized field of a sized type? Sure... | ||
// But const-prop actually feeds us such nonsense MIR! | ||
throw_inval!(ConstPropNonsense); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wild, took me quite a while to debug. It is triggered in tests/ui/const_prop/issue-86351.rs.
c996252
to
f3f4cf4
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit f3f4cf470cd07fb0c53a261205edd3665d230038 with merge 5bc60a1ed7d987a7881444c4e80ed9ed697acd95... |
This comment has been minimized.
This comment has been minimized.
f3f4cf4
to
7338980
Compare
This comment has been minimized.
This comment has been minimized.
7338980
to
2c296dc
Compare
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 2c296dcc079483d1cbc5845068f75b5645f16a72 with merge a367073b0b085658182079202c18c2b339640bbf... |
This comment has been minimized.
This comment has been minimized.
2c296dc
to
a0631dc
Compare
This comment has been minimized.
This comment has been minimized.
if layout.abi.is_uninhabited() { | ||
// `read_discriminant` should have excluded uninhabited variants... but ConstProp calls | ||
// us on dead code. | ||
throw_inval!(ConstPropNonsense) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This failed somewhere during bootstrap. Sadly I don't know how to reproduce it.
This comment has been minimized.
This comment has been minimized.
Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
@bors try |
The command can be found in the details of every perf run: in https://perf.rust-lang.org/detailed-query.html?commit=945ab669165d66b3ab6088bece01844e36f6a029&base_commit=fc8a3e357a0a5e317132e5ff8858ec70970fb07a&benchmark=ctfe-stress-5-opt&scenario=full it says "Local Profile (diff)", that command can then be run within the https://github.com/rust-lang/rustc-perf repo (after building it in release mode) |
This comment has been minimized.
This comment has been minimized.
8187dc4
to
d042263
Compare
Okay, thanks. I can't promise when I will have the time to take a look at that. |
Here's the entire cachegrind diff for the perf run I linked above:
so not very helpful I guess. Optimizer noise? Or actually new work being done? |
Thanks! eval_place became more expensive? That is very strange. I would have expected it to become cheaper since it doesn't need to write_immediate_no_validate is now doing an extra case distinction (but that's just a discriminant read so should be cheap) and it is doing the force_allocation that previously were done during projection, so that one becoming more expensive is somewhat expected. It should just be balanced by projection becoming cheaper... |
d042263
to
d127600
Compare
@bors r+ Perf is mixed, and I'd rather have the CTFE stress test regress a bit while generally having improvements, and massively improving the CTFE logic. |
☀️ Test successful - checks-actions |
Finished benchmarking commit (4fc6b33): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 651.438s -> 650.083s (-0.21%) |
Given that we have a bunch of improvements, in particular some on primary benchmarks, and given the code quality improvements, I am totally fine with accepting this perf loss on the ctfe stress test. |
For ~forever, we didn't really have proper shared code for handling projections into those three types. This is mostly because
PlaceTy
projections require&mut self
: they might have toforce_allocate
to be able to represent a project part-way into a local.This PR finally fixes that, by enhancing
Place::Local
with anoffset
so that such an optimized place can point into a part of a place without having requiring an in-memory representation. If we later write to that place, we will still doforce_allocate
-- for now we don't have an optimized path inwrite_immediate
that would avoid allocation for partial overwrites of immediately stored locals. But inwrite_immediate
we have&mut self
so at least this no longer pollutes all our type signatures.(Ironically, I seem to distantly remember that many years ago,
Place::Local
did have anoffset
, and I removed it to simplify things. I guess I didn't realize why it was so useful... I am also not sure if this was actually used to achieve place projection on&self
back then.)The
offset
had typeOption<Size>
, whereNone
represent "no projection was applied". This is needed because locals can be unsized (when they are arguments) butPlace::Local
cannot store metadata: if the offset isNone
, this refers to the entire local, so we can use the metadata of the local itself (which must be indirect); if a projection gets applied, since the local is indirect, it will turn into aPlace::Ptr
. (Note that even for indirect locals we can havePlace::Local
: when the local appears in MIR, we always start withPlace::Local
, and only checkframe.locals
later. We could eagerly normalize toPlace::Ptr
but I don't think that would actually simplify things much.)Having done all that, we can finally properly abstract projections: we have a new
Projectable
trait that has the basic methods required for projecting, and then all projection methods are implemented for anything that implements that trait. We can even implement it forImmTy
! (Not that we need that, but it seems neat.) The visitor can be greatly simplified; it doesn't need its own trait any more but it can use theProjectable
trait. We also don't need the separateMut
visitor any more; that was required only to reflect that projections onPlaceTy
needed&mut self
.It is possible that there are some more
&mut self
that can now become&self
... I guess we'll notice that over time.r? @oli-obk