-
Notifications
You must be signed in to change notification settings - Fork 13.4k
CTFE interning: don't walk allocations that don't need it #97585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
97a0b2e
266bab2
18cbc19
c9772d7
6d03c8d
d634f14
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -168,8 +168,51 @@ impl<'rt, 'mir, 'tcx: 'mir, M: CompileTimeMachine<'mir, 'tcx, const_eval::Memory | |
mplace: &MPlaceTy<'tcx>, | ||
fields: impl Iterator<Item = InterpResult<'tcx, Self::V>>, | ||
) -> InterpResult<'tcx> { | ||
// ZSTs cannot contain pointers, so we can skip them. | ||
if mplace.layout.is_zst() { | ||
// We want to walk the aggregate to look for references to intern. While doing that we | ||
// also need to take special care of interior mutability. | ||
// | ||
// As an optimization, however, if the allocation does not contain any references: we don't | ||
// need to do the walk. It can be costly for big arrays for example (e.g. issue #93215). | ||
let is_walk_needed = |mplace: &MPlaceTy<'tcx>| -> InterpResult<'tcx, bool> { | ||
// ZSTs cannot contain pointers, we can avoid the interning walk. | ||
if mplace.layout.is_zst() { | ||
return Ok(false); | ||
} | ||
|
||
// Now, check whether this allocation could contain references. | ||
// | ||
// Note, this check may sometimes not be cheap, so we only do it when the walk we'd like | ||
// to avoid could be expensive: on the potentially larger types, arrays and slices, | ||
// rather than on all aggregates unconditionally. | ||
if matches!(mplace.layout.ty.kind(), ty::Array(..) | ty::Slice(..)) { | ||
let Some((size, align)) = self.ecx.size_and_align_of_mplace(&mplace)? else { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The only possible source of the regression that I can see is this function call invoking Unsure how to debug this further
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. valgrind have finally fixed the dwarf bug and I was able to run the CI artifacts to reproduce the issue it sees, that oli posted.
I wonder if that's the case. Edit: After spending the better part of the day looking at this, I've noticed gathering my statistics (tracing rustc) was done over a diesel clean build (w/ dependencies and so on included), while rustc-perf just built the leaf crate. Unless I'm mistaken again, Diesel only has one aggregate that is walked during the check build benchmark, and it's a ZST hitting the early return... Making this regression either more puzzling, or easier to discard as noise. The number of I also wonder why this didn't show up in the previous run, if anything I'm seeing some tiny cachegrind differences locally as well, but they are completely unrelated (and it seems unlikely it's about PGO) but could be about allocation: cachegrind attributes more time to
The
Edit: the above are actually the clean build stats, not a leaf crate build like rustc-perf: IIUC they mostly come from dependencies (std/alloc, via proc-macros). The single call couldn't + early return make such a big regression. I was about to post the actual flow of the 4945 aggregates to list the ZSTs, the early returns, unsized mplaces, edit one million: and now after a rebase over master, the allocator noise I was seeing has switched to the other direction. Incredible.
|
||
// We do the walk if we can't determine the size of the mplace: we may be | ||
// dealing with extern types here in the future. | ||
return Ok(true); | ||
}; | ||
|
||
// If there are no relocations in this allocation, it does not contain references | ||
// that point to another allocation, and we can avoid the interning walk. | ||
if let Some(alloc) = self.ecx.get_ptr_alloc(mplace.ptr, size, align)? { | ||
if !alloc.has_relocations() { | ||
return Ok(false); | ||
} | ||
} else { | ||
// We're encountering a ZST here, and can avoid the walk as well. | ||
return Ok(false); | ||
} | ||
} | ||
|
||
// In the general case, we do the walk. | ||
Ok(true) | ||
}; | ||
|
||
// If this allocation contains no references to intern, we avoid the potentially costly | ||
// walk. | ||
// | ||
// We can do this before the checks for interior mutability below, because only references | ||
// are relevant in that situation, and we're checking if there are any here. | ||
if !is_walk_needed(mplace)? { | ||
return Ok(()); | ||
} | ||
|
||
|
Uh oh!
There was an error while loading. Please reload this page.