Add a lower bound check to unicode-table-generator output #122013

Merged
merged 1 commit into from
Apr 20, 2024

Conversation

Swatinem
Contributor

@Swatinem Swatinem commented Mar 5, 2024

This adds a dedicated check for the lower bound
(if it is outside of ASCII range) to the output of the unicode-table-generator tool.

This generalizes the ASCII-only fast-path, but only for the Grapheme_Extend property for now, as that is the only one with a lower bound outside of ASCII.
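
A minimal sketch of the idea described above, not the generated code itself: the lower-bound comparison rejects every code point below the first table entry before any search runs. The `RANGES` table and the `table_search` helper are hypothetical stand-ins with a few illustrative sample ranges; the real output uses generated bitset or skip-list tables.

```rust
use std::cmp::Ordering;

// Hypothetical, illustrative sample ranges (inclusive); not the real table.
const RANGES: &[(u32, u32)] = &[(0x300, 0x36f), (0x483, 0x489), (0x591, 0x5bd)];

// Hypothetical stand-in for the generated table search.
fn table_search(cp: u32) -> bool {
    RANGES
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                Ordering::Less
            } else if lo > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}

pub fn lookup(c: char) -> bool {
    // Fast path: anything below the first entry (0x300 == 768) cannot match,
    // so ASCII and Latin-1 input never reaches the search.
    (c as u32) >= 0x300 && table_search(c as u32)
}
```

Because `&&` short-circuits, inputs such as `'a'` return `false` after a single comparison.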

@rustbot
Collaborator

rustbot commented Mar 5, 2024

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 5, 2024
@Swatinem
Contributor Author

Swatinem commented Mar 5, 2024

Since generalizing the ASCII lower bound only affected a single table, how about covering the full range? As in: (lower..upper).contains(&(c as u32)) or something like that?

@workingjubilee
Member

workingjubilee commented Mar 5, 2024

That doesn't sound like it would be faster, as it would be both a lower bound check and an upper bound check before moving on to the bitset search? Checking the lower bound seems good, but for the upper bound, I would assume letting the binary search play out would be faster.
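
To make the two options in this exchange concrete, here is a hedged sketch of both pre-checks. `LOWER` is the 768 (0x300) bound quoted in the thread; `UPPER` is an assumed placeholder, and both function names are hypothetical.

```rust
// LOWER is the bound quoted in the thread; UPPER is an assumed placeholder.
const LOWER: u32 = 0x300;
const UPPER: u32 = 0xE01F0; // assumed exclusive upper bound, illustration only

// Variant from the question: reject anything outside LOWER..UPPER.
// Note that `Range::contains` takes its argument by reference.
pub fn in_full_range(c: char) -> bool {
    (LOWER..UPPER).contains(&(c as u32))
}

// Variant favoured in the reply: only the lower bound is checked up front,
// and the table search itself rejects code points past the last entry.
pub fn above_lower_bound(c: char) -> bool {
    (c as u32) >= LOWER
}
```

The first variant pays for two comparisons on every call, while the second leaves high code points to the search, which is the trade-off raised in the reply.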

@@ -101,7 +102,10 @@ impl RawEmitter {
)
.unwrap();
writeln!(&mut self.file, "pub const fn lookup(c: char) -> bool {{").unwrap();
writeln!(&mut self.file, " super::bitset_search(",).unwrap();
if first_code_point > 0x7f {
writeln!(&mut self.file, " (c as u32) >= {first_code_point} &&").unwrap();
Contributor

Suggested change
- writeln!(&mut self.file, " (c as u32) >= {first_code_point} &&").unwrap();
+ writeln!(&mut self.file, " (c as u32) >= {first_code_point:#04x} &&").unwrap();

Just to keep the hex consistent. Could also do u32::from(c) rather than the as cast since it has that impl, but makes no difference.
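
As a rough illustration of the suggestion above, the sketch below emits a `lookup` function into a `String` and only adds the lower-bound line when the first code point lies outside ASCII. `emit_lookup` and the emitted `table_search` call are hypothetical simplifications, not the tool's actual emitter.

```rust
use std::fmt::Write;

// Hypothetical, simplified emitter; `table_search` is a placeholder name for
// whatever search function the real generator emits.
pub fn emit_lookup(first_code_point: u32) -> String {
    let mut out = String::new();
    writeln!(out, "pub fn lookup(c: char) -> bool {{").unwrap();
    if first_code_point > 0x7f {
        // `{:#04x}` prints lowercase hex with a `0x` prefix, zero-padded to a
        // total width of 4 characters (the prefix counts toward the width).
        writeln!(out, "    u32::from(c) >= {first_code_point:#04x} &&").unwrap();
    }
    writeln!(out, "    table_search(u32::from(c))").unwrap();
    writeln!(out, "}}").unwrap();
    out
}
```

Calling `emit_lookup(768)` produces a body whose first line is `u32::from(c) >= 0x300 &&`, while an ASCII first code point omits that line entirely.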

Contributor Author

All the other numbers in this file are printed as decimals, but I made this change anyway. This might also resolve the confusion in the other review comment.

src/tools/unicode-table-generator/src/skiplist.rs Outdated Show resolved Hide resolved
@tgross35
Contributor

tgross35 commented Mar 5, 2024

Honestly I think you may as well close #121138 before it merges in favor of this, otherwise you'll just have to undo it 😄

@@ -316,6 +316,7 @@ pub mod grapheme_extend {
128, 240, 0,
];
pub fn lookup(c: char) -> bool {
(c as u32) >= 768 &&
Member

unsure: I see in SHORT_OFFSET_RUNS that the first one is 768, which looks awfully similar to this 768. Should this be doing c as u32 - 768 and making each of the statics shorter?

Contributor Author

I haven’t really spent much time figuring out how the skip list search actually works, and what the meaning of those entries is.
Looking at all the other tables, the first entry in SHORT_OFFSET_RUNS does not match the lower bound, so it might just be coincidence.

Maybe it’s possible to do a checked_sub and take advantage of this, but I would have to dig deeper into the code.
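
For illustration only, this is roughly what the checked_sub idea could look like if a table stored its entries relative to the lower bound. `SHIFTED_RANGES` is a hypothetical table with sample data; whether the real skip list could be shortened this way is left open in this thread.

```rust
const LOWER: u32 = 768; // 0x300, the bound discussed above

// Hypothetical table of inclusive ranges stored as offsets from LOWER.
// (0, 111) stands for 0x300..=0x36f and (387, 393) for 0x483..=0x489.
const SHIFTED_RANGES: &[(u32, u32)] = &[(0, 111), (387, 393)];

pub fn lookup(c: char) -> bool {
    // `checked_sub` returns None for code points below LOWER, so the
    // lower-bound check and the rebasing happen in a single step.
    match (c as u32).checked_sub(LOWER) {
        Some(offset) => SHIFTED_RANGES
            .iter()
            .any(|&(lo, hi)| (lo..=hi).contains(&offset)),
        None => false,
    }
}
```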

Member

Fair, can separate this improvement from any tweak to the tables.

@Swatinem Swatinem force-pushed the unicode-gen-fastpath branch 2 times, most recently from ad7b782 to 6d7daa0 Compare March 5, 2024 19:25
@Swatinem
Contributor Author

Swatinem commented Mar 5, 2024

I ended up reverting #121138 in this PR as well, since this impl is more general than that.
I’m not entirely sure about this, though: the lookup function does not have an #[inline] annotation, so it may not have the same perf impact.

@cuviper
Member

cuviper commented Mar 8, 2024

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 8, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 8, 2024
@bors
Contributor

bors commented Mar 8, 2024

⌛ Trying commit 6d7daa0 with merge 554f230...

@bors
Contributor

bors commented Mar 8, 2024

☀️ Try build successful - checks-actions
Build commit: 554f230 (554f2305f7f70bf404662f432d49f94ad0c72ec6)

@rust-timer
Collaborator

Finished benchmarking commit (554f230): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.8% | [5.1%, 6.6%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -5.4% | [-5.4%, -5.4%] | 1 |
| All ❌✅ (primary) | 5.8% | [5.1%, 6.6%] | 2 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 647.485s -> 647.691s (0.03%)
Artifact size: 172.63 MiB -> 172.63 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 8, 2024
@Swatinem
Copy link
Contributor Author

Swatinem commented Mar 8, 2024

The fmt-debug-derive Runtime benchmark reports a regression of 12.18%, so it seems like the #[inline] annotation is indeed significant 🤔

@scottmcm
Member

Let's re-run perf to ensure that things are fine with the #[inline] still there.

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 27, 2024
@bors
Contributor

bors commented Mar 27, 2024

⌛ Trying commit 6d7daa0 with merge fb17f9e...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 27, 2024
@bors
Contributor

bors commented Mar 27, 2024

☀️ Try build successful - checks-actions
Build commit: fb17f9e (fb17f9e4e044d2120c9a5a58f07bebaa24a1a93b)

@scottmcm scottmcm reopened this Apr 18, 2024
@scottmcm
Member

@bors try @rust-timer queue

@bors
Contributor

bors commented Apr 18, 2024

⌛ Trying commit 580c6a1 with merge 2746faf...

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 18, 2024
@bors
Contributor

bors commented Apr 18, 2024

☀️ Try build successful - checks-actions
Build commit: 2746faf (2746faf07f3279b425969d2ad1069c1f955af851)

@rust-timer
Collaborator

Finished benchmarking commit (2746faf): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.9% | [3.9%, 3.9%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 8.7% | [8.7%, 8.7%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.0% | [-1.0%, -1.0%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.9% | [-1.0%, 8.7%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.6% | [3.6%, 3.6%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 676.731s -> 676.644s (-0.01%)
Artifact size: 316.11 MiB -> 315.35 MiB (-0.24%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 18, 2024
Member

@scottmcm scottmcm left a comment

Perf looks happy with the inline -- sorry it took weeks to actually get a run -- so I think this is nearly good to go. Just had a couple minor requests.

library/core/src/unicode/unicode_data.rs Outdated Show resolved Hide resolved
library/core/src/unicode/unicode_data.rs Outdated Show resolved Hide resolved
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2024
@Swatinem
Contributor Author

Rebased and applied the suggestions.

@Swatinem
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 20, 2024
@scottmcm
Member

Thanks!
@bors r+ rollup=iffy (should now be perf-neutral so I don't think it needs never)

This also has me curious what would happen if we added a lower-bound check outside the probably-not-inlined part for everything, but that's definitely not a this-PR kind of thing 🙃
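
A hedged sketch of what that follow-up idea might look like: a small `#[inline]` wrapper performs the lower-bound check at the call site, and the search stays in a separate function marked `#[inline(never)]`. The names `lookup_slow` and `RANGES`, and the sample data, are hypothetical; nothing here reflects how the generated tables are actually organized.

```rust
const LOWER: u32 = 0x300;

// Hypothetical, illustrative sample ranges (inclusive); not the real table.
const RANGES: &[(u32, u32)] = &[(0x300, 0x36f), (0x483, 0x489)];

// The search is kept out of line so callers only pay for it when needed.
#[inline(never)]
fn lookup_slow(cp: u32) -> bool {
    RANGES.iter().any(|&(lo, hi)| (lo..=hi).contains(&cp))
}

// The cheap comparison is the part intended to be inlined into callers.
#[inline]
pub fn lookup(c: char) -> bool {
    (c as u32) >= LOWER && lookup_slow(c as u32)
}
```

Whether such a split pays off for every table would need its own perf run, as the comment above notes that it is out of scope for this PR.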

@bors
Contributor

bors commented Apr 20, 2024

📌 Commit 488598c has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 20, 2024
@bors
Contributor

bors commented Apr 20, 2024

⌛ Testing commit 488598c with merge dbce3b4...

@bors
Contributor

bors commented Apr 20, 2024

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing dbce3b4 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 20, 2024
@bors bors merged commit dbce3b4 into rust-lang:master Apr 20, 2024
13 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Apr 20, 2024
@rust-timer
Collaborator

Finished benchmarking commit (dbce3b4): comparison URL.

Overall result: ❌ regressions - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.3% | [1.3%, 1.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.4% | [0.8%, 8.0%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.4% | [0.8%, 8.0%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.8% | [-3.0%, -2.6%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 671.454s -> 673.418s (0.29%)
Artifact size: 315.20 MiB -> 315.27 MiB (0.02%)

@Swatinem Swatinem deleted the unicode-gen-fastpath branch May 25, 2024 20:23