-
Notifications
You must be signed in to change notification settings - Fork 13.7k
Optimize core::unicode::printable
.
#139540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Optimize core::unicode::printable
.
#139540
Conversation
rustbot has assigned @workingjubilee. Use |
This comment has been minimized.
This comment has been minimized.
1a57f45
to
9109550
Compare
r? libs |
r? libs |
r? libs |
Not sure if he's a libs reviewer, but Manish is probably the best person to review anything Unicode. r? @Manishearth |
This appears to be an optimization. It doesn't seem to be doing anything super Unicodey; it appears to be doing some byte lookup optimizations. I don't think I have the bandwidth to review this right now. I'm wary about the I'd also suggest rewriting it so that the actual Rust code does not need to be autogenerated, just some constants and a small match for the large parts. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
0c06957
to
9e66027
Compare
This comment has been minimized.
This comment has been minimized.
36cd81d
to
2acf703
Compare
r? @jhpratt |
I really am not familiar with this kind of code; I passed off for a reason. r? libs |
2acf703
to
2d8509c
Compare
@jhpratt, to be fair, I am also only slightly familiar with the code. 😄 I know what it does, but not how the underlying Unicode logic works. This PR doesn't change the logic, it only refactors the code to make it easier to reason about why the bounds checks are unneeded. |
2d8509c
to
66ada05
Compare
r? libs |
// SAFETY: The invariant for `Singletons` guarantees that the sum of all lengths | ||
// in `upper` must be equal to the lengths of `lower`, so `lower_end` is guaranteed | ||
// to be in range. | ||
let lowers = unsafe { self.lower.get_unchecked(lower_start..lower_end) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have evidence that removing this unsafe has meaningful impact on performance or code size? I wouldn't expect the bounds checks to matter that much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mark-Simulacrum, the problem here isn't performance or code size, since on most targets, it is already optimized. The problem is that on some embedded targets, e.g. thumbv7m-none-eabi
, the bounds check panic isn't optimized out in release mode, making it impossible to use panic-never
or no_panic
to ensure there are no reachable panics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you say more? Why do only some targets see the panics stay in?
In general, I don't think the standard library optimizes for "unreachable panics should not appear in the binary", and I'm reluctant to do so in this sort of code unless we gate that on e.g. a test-only assert and have exhaustive proof that it never triggers (e.g., by running it on every possible char).
@rustbot ready |
Follow-up to #138024, which apparently still wasn't enough to fully optimize
char::escape_debug
.Again, this only seems to make a difference on embedded targets, e.g.
thumbv7m-none-eabi
.