Shrink unicode case-mapping LUTs by 24k #109216
Conversation
Since ASCII chars are already handled by a special case in the `to_lower` and `to_upper` functions, there's no need to waste space on them in the LUTs.
The majority of char case replacements are single-char replacements, so storing them as `[char; 3]` wastes a lot of space. This commit splits the replacement tables for both `to_lower` and `to_upper` into two separate tables: one with single-character mappings and one with multi-character mappings. This reduces the binary size of programs using all of these tables by roughly 24K.
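A minimal sketch of what such a split lookup might look like (the table names, contents, and `[char; 3]` return shape here are illustrative assumptions, not the actual generated tables in `core`):

```rust
/// Sorted single-char mappings: the common case of one `char` in, one out.
/// (Tiny hypothetical excerpt; the real tables are generated from Unicode data.)
static LOWERCASE_SINGLE: &[(char, char)] = &[('Γ', 'γ'), ('Δ', 'δ'), ('Θ', 'θ')];

/// Sorted multi-char mappings: the rare expansions of up to three chars.
static LOWERCASE_MULTI: &[(char, [char; 3])] = &[('İ', ['i', '\u{307}', '\0'])];

fn to_lower(c: char) -> [char; 3] {
    if c.is_ascii() {
        // ASCII fast path: these chars never appear in the tables at all.
        return [c.to_ascii_lowercase(), '\0', '\0'];
    }
    // First search the compact single-char table...
    if let Ok(i) = LOWERCASE_SINGLE.binary_search_by(|&(k, _)| k.cmp(&c)) {
        return [LOWERCASE_SINGLE[i].1, '\0', '\0'];
    }
    // ...and only fall back to the multi-char table on a miss.
    match LOWERCASE_MULTI.binary_search_by(|&(k, _)| k.cmp(&c)) {
        Ok(i) => LOWERCASE_MULTI[i].1,
        Err(_) => [c, '\0', '\0'], // no mapping: the char lowercases to itself
    }
}
```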
(rustbot has picked a reviewer for you, use r? to override)
Hey! It looks like you've submitted a new PR for the library teams!
cc @thomcc
It would be possible to avoid the second table lookup by storing an index into the table with the long expansions.
That is a very good idea! I'll see if I can get something working a bit later. EDIT: Seems to be working nicely @nikic; it changed the slowdown on uppercasing into a speedup there too 👍
The indices are encoded as `u32`s in the range of invalid `char`s, so if any mapping fails to parse as a `char`, we know to use the value as an index for lookup in the multi-table. This avoids the second binary search in cases where a multi-`char` mapping is needed. Idea from @nikic.
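A hedged sketch of that trick (the table names, contents, and the exact sentinel offset are assumptions for illustration, not the real generated tables in `core::unicode`):

```rust
/// A stored value is either a valid `char` (as `u32`) or, for multi-char
/// mappings, an index into LOWERCASE_MULTI shifted into the invalid range
/// above char::MAX. (Hypothetical excerpt and offset choice.)
const INDEX_BASE: u32 = char::MAX as u32 + 1; // 0x110000, never a valid char

static LOWERCASE_TABLE: &[(char, u32)] = &[
    ('İ', INDEX_BASE + 0),  // invalid as a char => index 0 into the multi table
    ('Δ', 'δ' as u32),      // common case: the value is itself a char
];

static LOWERCASE_MULTI: &[[char; 3]] = &[['i', '\u{307}', '\0']];

fn to_lower(c: char) -> [char; 3] {
    match LOWERCASE_TABLE.binary_search_by(|&(k, _)| k.cmp(&c)) {
        Err(_) => [c, '\0', '\0'], // not in the table: maps to itself
        Ok(i) => match char::from_u32(LOWERCASE_TABLE[i].1) {
            // Parses as a char: the common single-char mapping.
            Some(l) => [l, '\0', '\0'],
            // Fails to parse: the value is an index, so no second search.
            None => LOWERCASE_MULTI[(LOWERCASE_TABLE[i].1 - INDEX_BASE) as usize],
        },
    }
}
```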
@bors try @rust-timer queue
⌛ Trying commit 355e1dd with merge 7dff56ce084e1c8649af8ffed641f398d0e5ac9f...
☀️ Try build successful - checks-actions
r=me presuming perf looks ok
Finished benchmarking commit (7dff56ce084e1c8649af8ffed641f398d0e5ac9f): comparison URL.

Overall result: no relevant changes - no action needed.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: this benchmark run did not return any relevant results for this metric.

Max RSS (memory usage): this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: this benchmark run did not return any relevant results for this metric.
@bors r+ rollup=iffy
🌲 The tree is currently closed for pull requests below priority 100. This pull request will be tested once the tree is reopened.
@bors r+ rollup=never
⌛ Testing commit 54f55ef with merge e5101b55d472f766f6e09e9077d59c575e31d3dd...
💔 Test failed - checks-actions
The job failed. Click to see the possible cause of the failure (guessed by this bot).
Seems unrelated?
Let's see if it's spurious. @bors retry
☀️ Test successful - checks-actions |
Finished benchmarking commit (f421586): comparison URL.

Overall result: ✅ improvements - no action needed.

@rustbot label: -perf-regression

Instruction count: this is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: this is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
I was looking into the binary bloat of a small program using `str::to_lowercase` and `str::to_uppercase`, and noticed that the lookup tables used for case mapping had a lot of zero bytes in them. The reason is that since some characters map to up to three other characters when lower- or uppercased, the LUTs store a `[char; 3]` for each character. However, the vast majority of cases map to only a single new character; in other words, most of the entries are e.g. `(lowerc, [upperc, '\0', '\0'])`.
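For a concrete feel of how lopsided this is, the mappings are observable through the standard library (a quick demo, not part of the PR):

```rust
fn main() {
    // The overwhelmingly common case: one char maps to exactly one char.
    assert_eq!('Δ'.to_lowercase().collect::<String>(), "δ");

    // A rare two-char expansion: 'ß' uppercases to "SS".
    assert_eq!('ß'.to_uppercase().collect::<String>(), "SS");

    // U+0390 expands to three chars when uppercased, which is why
    // the tables need room for a [char; 3] at all.
    assert_eq!('\u{0390}'.to_uppercase().count(), 3);
}
```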
This PR introduces a new encoding scheme for these tables. The changes reduce the size of my test binary by about 24K.
I've also done some `#[bench]`marks on Unicode-heavy test data, and found that the performance of both `str::to_lowercase` and `str::to_uppercase` improves by up to 20%. These measurements are obviously very dependent on the character distribution of the data.

Someone else will have to decide whether this more complex scheme is worth it or not; I was just goofing around a bit and here's what came out of it 🤷‍♂️ No hard feelings if this isn't wanted!