Int parsing optimisations (part 2) #96071
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override)
r? @scottmcm
return Err(PIE { kind: InvalidDigit });
let (first, mut digits) = (*src.get(0).ok_or_else(|| PIE { kind: Empty })?, &src[1..]);

let (is_positive, mut result) = match first {
It looks like there's a bit more repetition going on here than there needs to be.
Maybe try this with slice patterns, or something? I'm imagining something like
let (is_positive, digits) = match src {
    [b'-', d @ ..] => (false, d),
    [b'+', d @ ..] => (true, d),
    d => (true, d),
};
To hopefully simplify a bit of the `first`/`digits`/`result` dance that's currently happening.
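A minimal sketch of the slice-pattern idea as a standalone function. The name `split_sign` is hypothetical and not part of the PR. The `@ ..` rest pattern binds the remainder of the slice, so all three arms produce the same type. A lone sign yields an empty digit slice, which the caller would still have to reject.

```rust
/// Splits an optional leading sign off `src`.
/// Returns `(is_positive, remaining_bytes)`.
/// Hypothetical helper, for illustration only.
fn split_sign(src: &[u8]) -> (bool, &[u8]) {
    match src {
        // `rest @ ..` binds everything after the sign byte.
        [b'-', rest @ ..] => (false, rest),
        [b'+', rest @ ..] => (true, rest),
        // No sign: the whole input is digits (possibly empty).
        rest => (true, rest),
    }
}
```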
  // If the len of the str is short compared to the range of the type
  // we are parsing into, then we can be certain that an overflow will not occur.
  // This bound is when `radix.pow(digits.len()) - 1 <= T::MAX` but the condition
- // above is a faster (conservative) approximation of this.
+ // in `safe_width` is a faster (conservative) approximation of this.
  //
  // Consider radix 16 as it has the highest information density per digit and will thus overflow the earliest:
suggestion: this comment is useful information, but it doesn't seem to belong here, since the computation it's talking about isn't here. Maybe put it on/in `safe_width` instead?
Said otherwise, this code will be correct as long as `safe_width` is correct, so the details of which approach -- faster or tighter -- don't really matter here.
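For reference, here is one way a conservative `safe_width` could be written. This is a sketch under assumptions, not the PR's actual body: it takes two hex digits per byte as the bound, gives up one digit for signed types, and returns 0 for radixes above 16 so that those inputs always take the checked path.

```rust
use std::mem;

/// Sketch of a conservative bound: the number of leading digits that can
/// never overflow `T`. The body is an assumption, not the PR's code.
fn safe_width<T>(radix: u32, is_signed_ty: bool) -> usize {
    if radix > 16 {
        // Digits above radix 16 carry more than 4 bits each, so the
        // hex-based bound no longer holds; fall back to "nothing is safe".
        return 0;
    }
    // Each hex digit holds 4 bits, so a byte holds two of them.
    // A signed type loses one digit: `i8::MAX` is `7f`, so only one
    // digit is guaranteed to fit.
    mem::size_of::<T>() * 2 - is_signed_ty as usize
}
```

For smaller radixes this is looser than the tight bound `radix.pow(n) - 1 <= T::MAX`, but it is cheap to evaluate.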
//
// Consider radix 16 as it has the highest information density per digit and will thus overflow the earliest:
// `u8::MAX` is `ff` - any str of len 2 is guaranteed to not overflow.
// `i8::MAX` is `7f` - only a str of len 1 is guaranteed to not overflow.
let safe_width = safe_width::<T>(radix, is_signed_ty);

suggestion: the `.take` in one spot not coupled with a corresponding `.skip` in the other makes this read a bit strangely to me. Perhaps the splitting could just be put here, with no need to ever look at the length again later? As a first thought, something like this, with appropriate updates to the `for` loops?
let (safe_digits, risky_digits) = digits.split_at(safe_width.min(digits.len()));
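To make the split concrete, here is a sketch of a two-phase loop for `u32` only. The function name `parse_u32` and the inline bound (two hex digits per byte, checked path for radix above 16) are assumptions for illustration. The PR itself is generic over `T` and works through its own traits. The sketch expects `radix` in `2..=36`, because `char::to_digit` panics outside that range.

```rust
/// Parses unsigned digits into a `u32`. Uses plain arithmetic for the
/// prefix that cannot overflow and checked arithmetic for the remainder.
/// Hypothetical sketch; `radix` must be in 2..=36.
fn parse_u32(digits: &[u8], radix: u32) -> Option<u32> {
    if digits.is_empty() {
        return None;
    }
    // Assumed conservative bound: 8 digits for a 4-byte unsigned type.
    let safe_width = if radix <= 16 { std::mem::size_of::<u32>() * 2 } else { 0 };
    // Clamp so that short inputs land entirely in `safe_digits`.
    let (safe_digits, risky_digits) = digits.split_at(safe_width.min(digits.len()));

    let mut result = 0u32;
    for &c in safe_digits {
        let d = (c as char).to_digit(radix)?;
        // At most 8 digits of radix <= 16: the value stays <= u32::MAX.
        result = result * radix + d;
    }
    for &c in risky_digits {
        let d = (c as char).to_digit(radix)?;
        // Only these trailing digits can breach the type.
        result = result.checked_mul(radix)?.checked_add(d)?;
    }
    Some(result)
}
```

After the split, neither loop needs to consult the input length again.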
@@ -126,15 +126,15 @@ fn test_can_not_overflow() {
  where
  T: std::convert::TryFrom<i8>,
  {
- !can_not_overflow::<T>(radix, T::try_from(-1_i8).is_ok(), input.as_bytes())
+ safe_width::<T>(radix, T::try_from(-1_i8).is_ok()) < input.len()
  }

  // Positive tests:
suggestion: how about testing the output of `safe_width` directly? Just seeing `can_overflow` returning `true` doesn't mean that it's correct -- it could be returning `usize::MAX`.
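One way to test the width directly is to compare it against a brute-force tight bound. Everything below is a hypothetical sketch: `safe_width_sketch` stands in for the PR's `safe_width` (taking the byte size as a plain argument), and `tight_width` computes the largest digit count whose maximum value still fits.

```rust
/// Hypothetical stand-in for `safe_width`, keyed on the type's byte size.
fn safe_width_sketch(size_bytes: usize, is_signed_ty: bool, radix: u32) -> usize {
    if radix > 16 { 0 } else { size_bytes * 2 - is_signed_ty as usize }
}

/// Tight bound: the largest `n` such that the biggest `n`-digit number
/// in `radix` (that is, `radix^n - 1`) is still `<= max`.
fn tight_width(max: u128, radix: u32) -> usize {
    let r = radix as u128;
    let mut n = 0;
    let mut largest: u128 = 0; // biggest value expressible in `n` digits
    loop {
        // Append one more maximal digit, watching for u128 overflow.
        let next = largest.checked_mul(r).and_then(|v| v.checked_add(r - 1));
        match next {
            Some(v) if v <= max => {
                largest = v;
                n += 1;
            }
            _ => return n,
        }
    }
}

/// True if the sketch never claims more safe digits than the tight bound.
fn sketch_is_conservative() -> bool {
    let cases = [
        (1usize, false, u8::MAX as u128),
        (1usize, true, i8::MAX as u128),
        (4usize, false, u32::MAX as u128),
        (4usize, true, i32::MAX as u128),
        (16usize, false, u128::MAX),
        (16usize, true, i128::MAX as u128),
    ];
    (2..=36u32).all(|radix| {
        cases.iter().all(|&(size, signed, max)| {
            safe_width_sketch(size, signed, radix) <= tight_width(max, radix)
        })
    })
}
```

Asserting exact values for a few types, plus the conservative property across all radixes, would catch a width that is too large as well as one that is needlessly small.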
I like the idea here -- avoiding the cliff once the string gets longer than the threshold -- but I have a bunch of implementation thoughts.
Feel free to push back if some of them turn out to be bad ideas.
Ping from triage:
Still on the todo list. Will ship it xmas.
…On Sun, 27 Nov 2022 at 04:18, John Simon ***@***.***> wrote:
Ping from triage:
@gilescope <https://github.com/gilescope> what is the status of this PR?
Looks like it hasn't been touched in a while.
Closing this as inactive. Feel free to reöpen this PR or create a new PR if you get the time to work on this. Thanks
Extension to #95399
We can combine the `src.is_empty()` check with the `is_positive` check so that we get the first element once.
Previously we started with `let mut result = T::from_u32(0);`, on which we would then call `result = result * T::from_u32(radix);`, which we already know will be `0`. Instead, if we parse the first digit and put that in the result, then the first time around the loop the mul will be productive - we just need to shave the first element from the `digits` slice that we hand to the loop.
Given that the loop is now only going round twice for u8, it's not worth trying to do any further optimisations - let's only do that for u32 size and above, where we could be iterating a few times (`if mem::size_of::<T>() > 2 {`).
The final observation is that we can use the unchecked path even for strings that are large enough to overflow - we just use the checked path for parsing the digits that could breach the type.
I've included `u128`/`i128` in the benchmarks. The checked arithmetic of `i128` is particularly slow and really gains from using the unchecked arithmetic where possible (not that the current benchmarks show this, as they are parsing numbers that are too small).
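The first-digit idea can be sketched for an unsigned type as follows. The name `parse_u32_seeded` is hypothetical, and this version only accepts an optional `+` sign; the PR handles signed types generically. `radix` is assumed to be in `2..=36`.

```rust
/// Parses an optionally `+`-prefixed number, seeding `result` with the
/// first digit so the loop never multiplies zero by the radix.
/// Hypothetical sketch; `radix` must be in 2..=36.
fn parse_u32_seeded(src: &[u8], radix: u32) -> Option<u32> {
    // A single lookup both rejects empty input and yields the first byte.
    let (&first, mut digits) = src.split_first()?;
    let first = if first == b'+' {
        // A lone "+" has no digit after it and is invalid.
        let (&d, rest) = digits.split_first()?;
        digits = rest;
        d
    } else {
        first
    };
    // Seed the accumulator with the first digit instead of 0.
    let mut result = (first as char).to_digit(radix)?;
    // `digits` no longer contains the first digit.
    for &c in digits {
        let d = (c as char).to_digit(radix)?;
        result = result.checked_mul(radix)?.checked_add(d)?;
    }
    Some(result)
}
```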