gh-94823: Improve coverage in tokenizer.c:valid_utf8 #94856
When loading a source file from disk, there is a separate UTF-8 validator distinct from the one in `unicode_decode_utf8`. This exercises that code path with the same set of invalid inputs as we use for testing the "other" UTF-8 decoder.
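To illustrate the code path described above, here is a minimal sketch and not the test added by this PR. It writes malformed UTF-8 into a temporary script and runs it with the interpreter, so the source is read from disk by the tokenizer instead of being decoded from an in-memory string. The names `INVALID_SAMPLES`, `run_source`, and `rejects_invalid_utf8` are hypothetical, and the three byte sequences are illustrative picks, not the shared set of invalid inputs the description refers to.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical samples: each is malformed UTF-8 once embedded in a string literal.
INVALID_SAMPLES = [
    b"\xff",  # a byte that can never start a UTF-8 sequence
    b"\x80",  # a continuation byte with no lead byte before it
    b"\xc3",  # lead byte of a 2-byte sequence whose continuation is missing
]


def run_source(source: bytes) -> subprocess.CompletedProcess:
    """Write `source` to a temporary .py file and run it as a script."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.py")
        with open(path, "wb") as f:
            f.write(source)
        # Running the file by path makes the interpreter tokenize it from disk.
        return subprocess.run([sys.executable, path], capture_output=True)


def rejects_invalid_utf8(sample: bytes) -> bool:
    """Return True if a script containing `sample` fails with a SyntaxError."""
    result = run_source(b'x = "' + sample + b'"\n')
    return result.returncode != 0 and b"SyntaxError" in result.stderr
```

Running each sample through `rejects_invalid_utf8` should report a failure, while a correctly encoded file (for example `'x = "é"\n'.encode("utf-8")`) should run cleanly. A subprocess is used because `compile()` on a bytes object takes a different decoding route than a script loaded by path.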
Force-pushed from 6db7d84 to e52f328.
Thanks for working on this! It mostly looks good. I noticed a couple of small typos.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase `I have made the requested changes; please review again`.
Thanks for making the requested changes! @ericsnowcurrently: please review the changes made to this pull request.
LGTM
Thanks @mdboom for the PR 🌮🎉. I'm working now to backport this PR to: 3.11.
…94856) When loading a source file from disk, there is a separate UTF-8 validator distinct from the one in `unicode_decode_utf8`. This exercises that code path with the same set of invalid inputs as we use for testing the "other" UTF-8 decoder. (cherry picked from commit f215d7c) Co-authored-by: Michael Droettboom <mdboom@gmail.com>
GH-96029 is a backport of this pull request to the 3.11 branch.
Thanks for the test!
When loading a source file from disk, there is a separate UTF-8 validator distinct from the one in `unicode_decode_utf8`. This exercises that code path with the same set of invalid inputs as we use for testing the "other" UTF-8 decoder.
Automerge-Triggered-By: GH:ericsnowcurrently