Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EscapeAscii no longer panics due to incorrect ExactSizeIterator #99913

Closed
wants to merge 6 commits into from
Closed

EscapeAscii no longer panics due to incorrect ExactSizeIterator #99913

wants to merge 6 commits into from

Conversation

softdevca
Copy link

Fixes #99878

The ExactSizeIterator implementation for EscapeAscii was causing a panic because the length of the iterator was not explicitly available. This commit precalculates the length to use as an exact size_hint.

Supersedes #99880

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jul 29, 2022
@rustbot
Copy link
Collaborator

rustbot commented Jul 29, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 29, 2022
@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 29, 2022
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@@ -174,6 +176,9 @@ impl_fn_for_zst! {
#[must_use = "iterators are lazy and do nothing unless consumed"]
pub struct EscapeAscii<'a> {
inner: iter::FlatMap<super::Iter<'a, u8>, ascii::EscapeDefault, EscapeByte>,

/// Cached length of the iterator to implement `ExactSizeIterator`.
len: usize,
Copy link
Member

@the8472 the8472 Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can't cache the value because len() must be updated every time an item is consumed from the iterator.

https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.size_hint

Returns the bounds on the remaining length of the iterator.

This was also recently clarified for ExactsizeIterator, as you can see in the nightly docs

Returns the exact remaining length of the iterator.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing out my oversight. I will recommit.

Copy link
Member

@the8472 the8472 Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, caching may still be cheaper than recomputing it every time, but you'll at least have to update the cached value every time an item is consumed.
But the problem with that is that it makes every call to next() more expensive, so which approach is beneficial depends on how often len() is called.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel caching is the better choice even with the extra overhead of an decrement each next() or next_back() and an extra usize for each EscapeAscii. Caching avoids degenerate worst cases like users calling len() or size_hint() after each byte received from a slow source.

It's unfortunately to even have to go through the source iterator more than once but there isn't any other way to implement ExactSizeIterator correctly and it can't be removed because it's been released in a stable version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it can't be removed because it's been released in a stable version.

The stability policy does have exceptions for accidental stabilizations and critical bugfixes. So removing it is an option. The crater run in #99880 will tell us whether it's worthwhile taking that option.

@clarfonthey
Copy link
Contributor

clarfonthey commented Jul 30, 2022

Hmm, it's a bit weird that ExactSizeIterator was implemented before-- that was definitely something that we should have caught during code review but didn't.

I guess that now that we've done it, we should precompute it and keep track of it. Although personally I should have probably just not implemented it for the slice version of EscapeAscii considering how it requires going through the string linearly upfront.

I'm wondering what we should do here: since len always panicked before, I wonder if we can just remove the implementation and accept the breakages since all code relying on this would have been broken anyway.

@softdevca
Copy link
Author

I'm wondering what we should do here: since len always panicked before, I wonder if we can just remove the implementation and accept the breakages since all code relying on this would have been broken anyway.

A minor technicality that might not matter: Empty iterators would not panic.

@clarfonthey
Copy link
Contributor

Crater answers the age-old question: if a change breaks in the forest and no one is around to notice, is it really a breaking change?

No, says crater.

@softdevca
Copy link
Author

Pull request #99913 has been updated to remove the implementation of ExactSizeIterator for EscapeAscii based on the crater run.

Also, thank you to those who created and run crater, without it there would be much more uncertainty about a potentially breaking change to stable versions.

@clarfonthey
Copy link
Contributor

Will we want to push a patch release for this just to let people know this wasn't supposed to be stable? Or is it fine to let it slide until next version?

@softdevca
Copy link
Author

Will we want to push a patch release for this just to let people know this wasn't supposed to be stable? Or is it fine to let it slide until next version?

Can it get into 1.63, scheduled for August 11?

@clarfonthey
Copy link
Contributor

Would need a beta backport, but that seems reasonable. Not sure who's following this PR who can help coordinate that though!

@Mark-Simulacrum
Copy link
Member

The timeline is pretty short here, and I think we'd want an FCP or similar before dropping stable support for something like this. If it was about to land on stable we could slip it in, but otherwise I don't think it's worth it.

@the8472
Copy link
Member

the8472 commented Aug 11, 2022

I think this PR can be closed, in the last libs-api meeting the team decided to try landing #99880

@softdevca
Copy link
Author

Closing this in favor of pull request #99880. Thanks for everybody's help getting issue #99878 fixed.

@softdevca softdevca closed this Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic in escape_ascii().len()
7 participants