Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline read_{un,}signed_leb128 and opaque::Decoder functions. #37083

Merged
merged 1 commit into from
Oct 18, 2016

Conversation

nnethercote
Copy link
Contributor

read_unsigned_leb128 is hot within rustc because it's heavily used
during the reading of crate metadata. This commit tweaks its signature
(and that of read_signed_leb128, for consistency) so it can increment
the buffer index directly instead of maintaining its own copy, the
change in which is then used by the caller to advance the index.

This reduces the instruction count (as measured by Cachegrind) for some
benchmarks a bit, e.g. hyper-0.5.0 by 0.7%.

@rust-highfive
Copy link
Collaborator

r? @erickt

(rust_highfive has picked a reviewer for you, use r? to override)

@Mark-Simulacrum
Copy link
Member

Leaving a comment on these methods noting why this was done is probably a good idea so a future clean-up doesn't accidentally remove this optimization, since it's not intuitively a good idea to maintain a pointer to a usize instead of copying it back and forth.

@nnethercote
Copy link
Contributor Author

FWIW, I'm treating @Mark-Simulacrum's comment as a suggestion worth considering, but I'm still waiting for @erickt to review before I make any changes.

@erickt
Copy link
Contributor

erickt commented Oct 17, 2016

@nnethercote: Looks good to me if you add the comment. This seems a tad magical though. I would have expected llvm to generate similar code. Maybe it's having trouble doing return-value optimization? I wonder if this is an area a MIR optimization could target.

@eddyb
Copy link
Member

eddyb commented Oct 17, 2016

@erickt It's because the function is not marked as #[inline] 😞.

@nnethercote
Copy link
Contributor Author

@erickt It's because the function is not marked as #[inline] 😞.

I've added that, and also inlined all the methods within opaque::Decoder and the results are much better than the original change :) More measurements and an updated patch coming soon.

These functions are all hot in rustc and inlining them speeds up most of
the rustc-benchmarks by 1--2%.
@nnethercote
Copy link
Contributor Author

stage1:

futures-rs-test  4.416s vs  4.317s --> 1.023x faster (variance: 1.007x, 1.010x)
helloworld       0.234s vs  0.229s --> 1.021x faster (variance: 1.007x, 1.014x)
html5ever-2016-  5.442s vs  5.389s --> 1.010x faster (variance: 1.008x, 1.005x)
hyper.0.5.0      5.131s vs  5.054s --> 1.015x faster (variance: 1.001x, 1.010x)
inflate-0.1.0    4.902s vs  4.852s --> 1.010x faster (variance: 1.013x, 1.003x)
issue-32062-equ  0.348s vs  0.344s --> 1.012x faster (variance: 1.010x, 1.006x)
issue-32278-big  1.711s vs  1.706s --> 1.003x faster (variance: 1.006x, 1.017x)
jld-day15-parse  1.550s vs  1.541s --> 1.006x faster (variance: 1.016x, 1.007x)
piston-image-0. 12.046s vs 11.915s --> 1.011x faster (variance: 1.037x, 1.009x)
regex.0.1.30     2.557s vs  2.500s --> 1.023x faster (variance: 1.005x, 1.017x)
rust-encoding-0  2.109s vs  2.055s --> 1.026x faster (variance: 1.006x, 1.008x)
syntex-0.42.2   32.859s vs 32.609s --> 1.008x faster (variance: 1.004x, 1.032x)
syntex-0.42.2-i 17.998s vs 17.732s --> 1.015x faster (variance: 1.008x, 1.021x)

@nnethercote nnethercote changed the title Tweak read_{un,}signed_leb128. Inline read_{un,}signed_leb128 and opaque::Decoder functions. Oct 18, 2016
@eddyb
Copy link
Member

eddyb commented Oct 18, 2016

@bors r+

@bors
Copy link
Contributor

bors commented Oct 18, 2016

📌 Commit 6a4bb35 has been approved by eddyb

@bors
Copy link
Contributor

bors commented Oct 18, 2016

⌛ Testing commit 6a4bb35 with merge 753ea76...

bors added a commit that referenced this pull request Oct 18, 2016
Inline read_{un,}signed_leb128 and opaque::Decoder functions.

`read_unsigned_leb128` is hot within rustc because it's heavily used
during the reading of crate metadata. This commit tweaks its signature
(and that of `read_signed_leb128`, for consistency) so it can increment
the buffer index directly instead of maintaining its own copy, the
change in which is then used by the caller to advance the index.

This reduces the instruction count (as measured by Cachegrind) for some
benchmarks a bit, e.g. hyper-0.5.0 by 0.7%.
@bors bors merged commit 6a4bb35 into rust-lang:master Oct 18, 2016
@nnethercote nnethercote deleted the uleb128 branch October 18, 2016 21:50
@brson brson added the relnotes Marks issues that should be documented in the release notes of the next release. label Oct 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants