Add a lower bound check to unicode-table-generator output #122013
Since doing this only for the ASCII lower bound applied to just a single table, how about covering the full range? As in:
That doesn't sound like it would be faster, as it would add both a lower bound check and an upper bound check before moving on to the bitset search. Checking the lower bound seems good, but for the upper bound, I would assume letting the binary search play out would be faster.
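To make the trade-off concrete, here is a minimal sketch of a lookup that rejects everything below the lower bound and otherwise lets the search run, with no separate upper-bound check. The table contents, the names `RANGES`, `LOWER_BOUND`, and `search`, and the plain binary search are all assumptions for illustration; the generated code uses its own bitset and skip-list searches over generated data.

```rust
use std::cmp::Ordering;

/// Hand-written stand-in for a generated table: sorted, non-overlapping,
/// inclusive ranges of code points. This is NOT the real generated data.
const RANGES: &[(u32, u32)] = &[(0x300, 0x36f), (0x483, 0x489), (0x591, 0x5bd)];

/// Lowest code point covered by `RANGES` (hypothetical constant).
const LOWER_BOUND: u32 = 0x300;

/// Binary search over the ranges. A code point above the last range is
/// rejected by the search itself, so no explicit upper-bound check is needed.
fn search(cp: u32) -> bool {
    RANGES
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                Ordering::Less
            } else if lo > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}

/// The lower-bound comparison short-circuits, so ASCII and other low
/// code points never reach the search.
pub fn lookup(c: char) -> bool {
    let cp = c as u32;
    cp >= LOWER_BOUND && search(cp)
}
```

With this shape, `lookup('a')` returns `false` after a single comparison, while a code point past the last range still pays for the search before being rejected.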
@@ -101,7 +102,10 @@ impl RawEmitter {
)
.unwrap();
writeln!(&mut self.file, "pub const fn lookup(c: char) -> bool {{").unwrap();
writeln!(&mut self.file, " super::bitset_search(",).unwrap();
if first_code_point > 0x7f {
writeln!(&mut self.file, " (c as u32) >= {first_code_point} &&").unwrap();
- writeln!(&mut self.file, " (c as u32) >= {first_code_point} &&").unwrap();
+ writeln!(&mut self.file, " (c as u32) >= {first_code_point:#04x} &&").unwrap();
Just to keep the hex consistent. Could also do u32::from(c) rather than the as cast since it has that impl, but makes no difference.
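For reference, a small sketch of what the two format specifiers emit. The helper `render_lower_bound_check` is hypothetical and only mirrors the shape of the `writeln!` call in the suggestion; it is not part of the generator.

```rust
/// Hypothetical helper: renders the lower-bound line of a generated
/// `lookup` function, either in hex (`{:#04x}`) or in decimal.
fn render_lower_bound_check(first_code_point: u32, hex: bool) -> String {
    if hex {
        // `#` adds the `0x` prefix; `04` is a minimum width that includes it.
        format!("    (c as u32) >= {first_code_point:#04x} &&")
    } else {
        format!("    (c as u32) >= {first_code_point} &&")
    }
}

/// `u32::from(c)` and `c as u32` produce the same value for any `char`.
fn cast_variants_agree(c: char) -> bool {
    u32::from(c) == c as u32
}
```

For a lower bound of 768 the hex variant renders `0x300` and the decimal variant renders `768`; the generated check compares against the same value either way.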
All the other numbers in this file are printed as decimals, but I made this change anyway. This might also solve the confusion in the other review comment.
Honestly I think you may as well close #121138 before it merges in favor of this, otherwise you'll just have to undo it 😄
@@ -316,6 +316,7 @@ pub mod grapheme_extend {
128, 240, 0,
];
pub fn lookup(c: char) -> bool {
(c as u32) >= 768 &&
unsure: I see in SHORT_OFFSET_RUNS that the first one is 768, which looks awfully similar to this 768. Should this be doing c as u32 - 768 and making each of the statics shorter?
I haven’t really spent much time figuring out how the skip list search actually works, and what the meaning of those entries is.
Looking at all the other tables, the first entry in SHORT_OFFSET_RUNS does not match the lower bound, so it might just be coincidence.
Maybe it's possible to do a checked_sub and take advantage of this, but I would have to dig deeper into the code.
Fair, can separate this improvement from any tweak to the tables.
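As a rough illustration of the `checked_sub` idea floated above: subtracting the lower bound both rejects smaller code points and rebases the value, so a table could in principle store offsets instead of absolute code points. Everything here (`REBASED_RANGES`, `lookup_rebased`, the range data, the linear scan) is a hypothetical sketch; whether the real skip-list tables can be rebased this way was left open in this thread.

```rust
/// Hypothetical lower bound of the table.
const LOWER_BOUND: u32 = 0x300;

/// Hypothetical rebased table: inclusive ranges stored as offsets from
/// `LOWER_BOUND`, not as absolute code points. Not real generated data.
const REBASED_RANGES: &[(u32, u32)] = &[(0x0, 0x6f), (0x183, 0x189)];

pub fn lookup_rebased(c: char) -> bool {
    // `checked_sub` yields `None` when `c` is below the lower bound, so the
    // bounds check and the rebasing happen in a single step.
    match (c as u32).checked_sub(LOWER_BOUND) {
        Some(offset) => REBASED_RANGES
            .iter()
            .any(|&(lo, hi)| lo <= offset && offset <= hi),
        None => false,
    }
}
```

The potential gain is smaller stored values; the cost is that every table and its search would have to agree on the same base, which is why it makes sense as a separate change.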
ad7b782 to 6d7daa0
I ended up reverting #121138 in this PR as well, since this impl is more general than that.
@bors try @rust-timer queue
Add a lower bound check to `unicode-table-generator` output This adds a dedicated check for the lower bound (if it is outside of ASCII range) to the output of the `unicode-table-generator` tool. This generalized the ASCII-only fast-path, but only for the `Grapheme_Extend` property for now, as that is the only one with a lower bound outside of ASCII.
☀️ Try build successful - checks-actions
Finished benchmarking commit (554f230): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 647.485s -> 647.691s (0.03%)
The |
Let's re-run perf to ensure that things are fine with the
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (2746faf): comparison URL.
Overall result: ❌ regressions - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 676.731s -> 676.644s (-0.01%)
Perf looks happy with the inline -- sorry it took weeks to actually get a run -- so I think this is nearly good to go. Just had a couple minor requests.
580c6a1 to 488598c
Rebased and applied the suggestions.
@rustbot ready
Thanks! This also has me curious what would happen if we added a lower-bound check outside the probably-not-inlined part for everything, but that's definitely not a this-PR kind of thing 🙃
☀️ Test successful - checks-actions
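One way to picture the follow-up idea mentioned above is a small inlinable wrapper that does only the lower-bound comparison and forwards to a search that is kept out of line. This is a sketch under stated assumptions: `lookup_slow`, the range data, and the choice of `#[inline]` and `#[inline(never)]` attributes are hypothetical and not taken from the generated code.

```rust
/// Hypothetical lower bound and table; not real generated data.
const LOWER_BOUND: u32 = 0x300;
const RANGES: &[(u32, u32)] = &[(0x300, 0x36f), (0x483, 0x489)];

/// The comparatively large search body, deliberately kept out of line.
#[inline(never)]
fn lookup_slow(cp: u32) -> bool {
    RANGES.iter().any(|&(lo, hi)| lo <= cp && cp <= hi)
}

/// Small wrapper that callers can inline: one comparison, then a call
/// into the out-of-line search only when the comparison passes.
#[inline]
pub fn lookup(c: char) -> bool {
    let cp = c as u32;
    cp >= LOWER_BOUND && lookup_slow(cp)
}
```

Under this split, callers that mostly see ASCII input would pay for a compare and a branch at the call site rather than a function call, though whether that helps in practice would need its own perf run.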
Finished benchmarking commit (dbce3b4): comparison URL.
Overall result: ❌ regressions - no action needed
@rustbot label: -perf-regression
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 671.454s -> 673.418s (0.29%)
This adds a dedicated check for the lower bound (if it is outside of ASCII range) to the output of the `unicode-table-generator` tool.
This generalized the ASCII-only fast-path, but only for the `Grapheme_Extend` property for now, as that is the only one with a lower bound outside of ASCII.