Make char::DecodeUtf16::size_hint more precise #93347

Merged

WaffleLapkin
Member

The new implementation takes the contents of `self.buf` into account and rounds the lower bound up instead of down.

Fixes #88762
Revival of #88763
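
As a rough illustration of the rounding change (a sketch, not the code from this PR): with nothing buffered, `n` unread code units decode to at least `ceil(n / 2)` items, because an odd unit can never be part of a surrogate pair. The helper `bounds_for_units` is hypothetical; only `std::char::decode_utf16` is a real API here.

```rust
/// Hypothetical helper: bounds on the number of items produced by decoding
/// `n` not-yet-read UTF-16 code units, assuming nothing is buffered.
fn bounds_for_units(n: usize) -> (usize, usize) {
    // Fewest items: every two units form a surrogate pair. A leftover odd
    // unit cannot pair with anything, so the division rounds up.
    let lower = n / 2 + n % 2;
    // Most items: every unit becomes its own item (a char or an error).
    let upper = n;
    (lower, upper)
}

fn main() {
    // U+1D11E (one surrogate pair) followed by 'A': 3 units, 2 items.
    let units = [0xD834u16, 0xDD1E, 0x0041];
    let count = std::char::decode_utf16(units.iter().copied()).count();
    let (lower, upper) = bounds_for_units(units.len());
    println!("lower = {lower}, count = {count}, upper = {upper}");
}
```

Rounding down would report a lower bound of 1 for these three units, which is valid but looser than necessary.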

New implementation takes into account contents of `self.buf` and rounds
lower bound up instead of down.
@rust-highfive
Collaborator

r? @dtolnay

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 26, 2022
// char), or entirely non-surrogates (1 element per char)
(low / 2, high)

// `self.buf` will never contain the first part of a surrogate,
Member

Why? It doesn't seem to me like that's the case.

For example, the following would fail the test below.

check(&[0xD800, 0xD800, 0xDC00]);
thread 'char::test_decode_utf16_size_hint' panicked at 'lower = 2, upper = Some(2)', library/core/tests/char.rs:320:13
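
For readers who want to reproduce this kind of check outside the test suite, here is a minimal sketch. It assumes the test compares `size_hint` with the real number of remaining items after every call to `next`; the actual `check` helper in `library/core/tests/char.rs` is not shown in this thread, so `hint_violation` is a hypothetical stand-in.

```rust
use std::char::decode_utf16;

/// Hypothetical stand-in for the `check` helper quoted above: before every
/// call to `next`, compare `size_hint` with the number of items left.
/// Returns a description of the first inconsistency, if any.
fn hint_violation(units: &[u16]) -> Option<String> {
    let mut iter = decode_utf16(units.iter().copied());
    loop {
        let (lower, upper) = iter.size_hint();
        // Cloning lets us count the remaining items without consuming `iter`.
        let count = iter.clone().count();
        let upper_ok = upper.map_or(true, |u| count <= u);
        if lower > count || !upper_ok {
            return Some(format!("lower = {lower}, count = {count}, upper = {upper:?}"));
        }
        if iter.next().is_none() {
            return None;
        }
    }
}

fn main() {
    // Unpaired leading surrogate, then a valid pair: the second 0xD800 is
    // buffered after the first `next` and later pairs with 0xDC00.
    let units = [0xD800u16, 0xD800, 0xDC00];
    match hint_violation(&units) {
        Some(msg) => println!("inconsistent hint: {msg}"),
        None => println!("size_hint is consistent at every step"),
    }
}
```

On a toolchain that includes the final version of this PR, the hint should stay consistent for this input; on the revision under review here, the state after the first `next` is where the reported `lower = 2` exceeds the single remaining item.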

Member Author

It was a wrong assumption from the original PR that I hadn't checked 😅

I pushed a fix that checks the contents of the buf.

@dtolnay dtolnay added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 28, 2022
@WaffleLapkin WaffleLapkin force-pushed the better_char_decode_utf16_size_hint branch from a40122c to 2c97d10 Compare January 28, 2022 09:40
`self.buf` can contain a surrogate, but only a leading one.
Member

@dtolnay dtolnay left a comment

check(&[0xD800, 0xD800, 0x0]) fails your test.

thread 'char::test_decode_utf16_size_hint' panicked at 'lower = 1, count = 2, upper = Some(1)', library/core/tests/char.rs:320:13

There are cases, when data in the buf might or might not be an error.
@WaffleLapkin
Member Author

@dtolnay I fixed this edge case too. I wonder if I still missed something 😄
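
The cases raised in this thread can be summarized in a small model. This is a sketch under stated assumptions, not the code merged in the PR: `model_size_hint` is a hypothetical function, and it takes the exact number of unread code units (`rest`) instead of the inner iterator's own `size_hint`, whose upper bound may be unknown.

```rust
/// Hypothetical model of the cases discussed in this thread, not the merged
/// code. `buf` is the code unit held back by the decoder and `rest` is the
/// number of code units still unread in the inner iterator.
fn model_size_hint(buf: Option<u16>, rest: usize) -> (usize, usize) {
    let is_surrogate = |u: u16| (0xD800..=0xDFFF).contains(&u);
    let (buf_low, buf_high) = match buf {
        // Nothing buffered: no extra items.
        None => (0, 0),
        // A buffered non-surrogate always yields exactly one char.
        Some(u) if !is_surrogate(u) => (1, 1),
        // A buffered surrogate with nothing left to pair with is one error.
        Some(_) if rest == 0 => (1, 1),
        // Otherwise it may pair with the next unit (no extra item) or
        // turn out to be an error (one extra item).
        Some(_) => (0, 1),
    };
    // Unread units: at least ceil(rest / 2) items, at most `rest` items.
    (rest / 2 + rest % 2 + buf_low, rest + buf_high)
}

fn main() {
    // State after the first `next` on [0xD800, 0xD800, 0xDC00] or on
    // [0xD800, 0xD800, 0x0]: a leading surrogate is buffered, one unit unread.
    println!("{:?}", model_size_hint(Some(0xD800), 1));
    // State after the second `next` on [0xD800, 0xD800, 0x0]: 0x0 is buffered.
    println!("{:?}", model_size_hint(Some(0x0000), 0));
}
```

In this model the first state yields `(1, 2)`, which covers both counts reported in the review: 1 when the buffered surrogate pairs with `0xDC00`, and 2 when it is followed by `0x0`.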

Member

@dtolnay dtolnay left a comment

Thanks, looks good.

@dtolnay dtolnay added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jan 30, 2022
@dtolnay
Member

dtolnay commented Jan 30, 2022

@bors r+

@bors
Contributor

bors commented Jan 30, 2022

📌 Commit 17cd2cd has been approved by dtolnay

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 30, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jan 30, 2022
…16_size_hint, r=dtolnay

Make `char::DecodeUtf16::size_hint` more precise

New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.

Fixes rust-lang#88762
Revival of rust-lang#88763
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 31, 2022
…askrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#90277 (Improve terminology around "after typeck")
 - rust-lang#92918 (Allow eliding GATs in expression position)
 - rust-lang#93039 (Don't suggest inaccessible fields)
 - rust-lang#93155 (Switch pretty printer to block-based indentation)
 - rust-lang#93214 (Respect doc(hidden) when suggesting available fields)
 - rust-lang#93347 (Make `char::DecodeUtf16::size_hint` more precise)
 - rust-lang#93392 (Clarify documentation on char::MAX)
 - rust-lang#93444 (Fix some CSS warnings and errors from VS Code)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 76857fb into rust-lang:master Jan 31, 2022
@rustbot rustbot added this to the 1.60.0 milestone Jan 31, 2022
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

Incorrect upper bound for size_hint of char::DecodeUtf16
7 participants