Fixing the performance regression of #76244 #76913
r? @varkor (rust_highfive has picked a reviewer for you, use r? to override)
r? @jackh726
@bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 64f98169e0dda32efd981d568cebddad45d0b3cf with merge 62832cf234653ae26c54abc4801db9b0698a0f86...
Do try builds ignore tidy? If not, it will fail.
I think they ignore tidy
☀️ Try build successful - checks-actions, checks-azure
Queued 62832cf234653ae26c54abc4801db9b0698a0f86 with parent 4e8a8b4, future comparison URL.
Finished benchmarking try commit (62832cf234653ae26c54abc4801db9b0698a0f86): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
Ok cool, reverting the
Yeah, seems like maybe not all of it, but I think we should land this and separately explore why removing the DefId cost us so much performance on some benchmarks. (I guess maybe it would be fairly clear from the diff, but I'd need to view it locally I suspect).
29cdc76 to 44920d0
The latest commit reintroduces the refactor for chalk mode, as these changes are still desired. At this point, the net effect of this PR essentially is to introduce dead code. A FIXME is added to remove the field.
@bors try @rust-timer queue Let's re-confirm perf.
Awaiting bors try build completion
⌛ Trying commit 44920d01d523ae17e2da07ebe7d9a8c628cdadb8 with merge 7ccd8cf9305ba5c78cfd78c0fa454f18a20c4b16...
☀️ Try build successful - checks-actions, checks-azure
Queued 7ccd8cf9305ba5c78cfd78c0fa454f18a20c4b16 with parent 59fb88d, future comparison URL.
Finished benchmarking try commit (7ccd8cf9305ba5c78cfd78c0fa454f18a20c4b16): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
This is a seriously small diff. But LGTM. I don't know who wants to actually review/r+ this? @Mark-Simulacrum?
r=me with field made private, unless you want to try the MaybeUninit thing as well.
compiler/rustc_middle/src/ty/mod.rs
@@ -1745,6 +1745,9 @@ pub struct ParamEnv<'tcx> {
     ///
     /// Note: This is packed, use the reveal() method to access it.
     packed: CopyTaggedPtr<&'tcx List<Predicate<'tcx>>, traits::Reveal, true>,
+
+    /// FIXME: This field is not used, but removing it causes a performance degradation. See #76913.
+    pub unused_field: Option<DefId>,
I would like this to be made private.
Maybe we should also replace Option<DefId> here with something like MaybeUninit<u64>, which would never be initialized, and see if that's still enough to avoid the regression?
Private: good point!
About the different type: I'm not sure what a nice solution would be. MaybeUninit is not Eq, for example. I have now wrapped an array of u8's to stress that the size (probably) should be 8 bytes. What do you think?
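The trade-off discussed above can be sketched outside of rustc. This is a minimal illustration, not the code from the PR: `Padding` and `Env` are hypothetical names, and `packed: usize` merely stands in for the real tagged pointer. It shows that an 8-byte `u8` array wrapper supports the usual derives, whereas `MaybeUninit<u64>` does not implement `PartialEq` or `Eq`.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Hypothetical placeholder: 8 opaque bytes that are never read.
/// Unlike `MaybeUninit<u64>`, a byte array supports the usual derives.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Padding([u8; 8]);

/// Hypothetical stand-in for a struct that must stay `Eq` and `Hash`
/// while carrying an unused, private field.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Env {
    packed: usize,
    unused_field: Padding,
}

// The following would not compile, because `MaybeUninit<T>` implements
// neither `PartialEq` nor `Eq`:
//
// #[derive(PartialEq, Eq)]
// struct EnvUninit { packed: usize, unused_field: MaybeUninit<u64> }

fn main() {
    // Both candidates occupy 8 bytes...
    assert_eq!(size_of::<Padding>(), 8);
    assert_eq!(size_of::<MaybeUninit<u64>>(), 8);
    // ...but the byte array only requires 1-byte alignment.
    assert_eq!(align_of::<Padding>(), 1);

    let a = Env { packed: 1, unused_field: Padding([0; 8]) };
    let b = a; // `Env` is `Copy`
    assert_eq!(a, b); // comparison works thanks to the derived `Eq`

    println!(
        "Padding: {} bytes, Env: {} bytes",
        size_of::<Padding>(),
        size_of::<Env>()
    );
}
```

Note that the byte array has a different alignment than a `u64`, so the two placeholders are not necessarily interchangeable from a layout point of view.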
4c30bf8 to 106b74b
Could you squash the commits down as well? I imagine the revert/re-apply dance is no longer necessary, given the small diff here. Let's kick off a try build to make sure the array doesn't perform worse. @bors try @rust-timer queue
Awaiting bors try build completion
⌛ Trying commit 106b74ba1a5abe1b617b33be0b440ea065080666 with merge 3b689b941bcd1844d1fae487ae64b061bb4492c3...
☀️ Try build successful - checks-actions, checks-azure
Queued 3b689b941bcd1844d1fae487ae64b061bb4492c3 with parent 1fd5b9d, future comparison URL.
Finished benchmarking try commit (3b689b941bcd1844d1fae487ae64b061bb4492c3): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
Latest perf looks worse.
Okay, let's go back to the DefId and we can try to follow up separately on optimizing further. r=me with commits squashed so we don't have the back and forth in git history
106b74b to ab83d37
@Mark-Simulacrum I think that this should do it.
@bors r+ rollup=never
📌 Commit ab83d37 has been approved by
☀️ Test successful - checks-actions, checks-azure
This issue finds continuation in #77058.
Issue #74865 suggested that removing the def_id field from ParamEnv would improve performance. PR #76244 implemented this change.
Generally, results were as expected: an instruction count decrease of about a percent. The instruction count for the unicode crates increased by about 3%, which @nnethercote speculated to be caused by a quirk of inlining or codegen. As the results were generally positive, and the change was also a step in the right direction for chalk integration, the PR was r+'d regardless.
However, wall-time performance results show a much larger performance degradation: 25%, as mentioned by @Mark-Simulacrum.
This PR, for now, reverts #76244 and attempts to find out which change caused the regression.
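One thing that is easy to check when a field is added or removed is how the struct's size changes. The sketch below uses hypothetical stand-ins (`FakeDefId`, `EnvWithout`, `EnvWith`), with `packed: usize` in place of the real tagged pointer. The real rustc layout may differ, and the sketch does not explain the regression. It only shows how such a size comparison can be made.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `DefId`: two 32-bit indices.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct FakeDefId {
    krate: u32,
    index: u32,
}

/// Stand-in for `ParamEnv` without the extra field: one pointer-sized word.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct EnvWithout {
    packed: usize,
}

/// Stand-in for `ParamEnv` with the extra (unused) field.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct EnvWith {
    packed: usize,
    def_id: Option<FakeDefId>,
}

fn main() {
    println!("without field: {} bytes", size_of::<EnvWithout>());
    println!("with field:    {} bytes", size_of::<EnvWith>());

    // Without the field, the struct is exactly one machine word.
    assert_eq!(size_of::<EnvWithout>(), size_of::<usize>());
    // Adding the field makes the struct strictly larger.
    assert!(size_of::<EnvWith>() > size_of::<EnvWithout>());
}
```

A size difference like this can change how a value is passed and copied, which is one place to start looking. As the thread shows, only benchmarking can confirm the actual effect.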