Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: unicode on exactly index 7 should not panic #4

Merged
merged 1 commit into from
Dec 2, 2024

Conversation

DanGould
Copy link

@DanGould DanGould commented Dec 2, 2024

Problem

A unicode character on exactly the right place (where the previous code used to split the string) would cause a panic.

Fix

Treat the string as unicode in order to compare the scheme, turning the panic into error.

Thank you @lsunsi for opening the original PR

Kixunil#25

@DanGould DanGould requested a review from nothingmuch December 2, 2024 16:47
@DanGould
Copy link
Author

DanGould commented Dec 2, 2024

from @nothingmuch:

tl;dr - using [u8] can create an invalid utf8 string if the bad utf8 character is sliced mid way

"using [u8]" was sloppy pphrasing on my part, it's not treating a &str as a &[u8] exactly

the fix, i'm not sure if string.get(...) is the most correct way of doing this

I do believe using string.get(...) is the correct way of doing this

verified the fix in evcxr

@DanGould DanGould mentioned this pull request Dec 2, 2024
Copy link

@nothingmuch nothingmuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, I verified that this implements the correct behavior. FWIW, as a rust n00b the incorrect behavior was not obviously wrong to me at all, perhaps a clarifying comment could improve readability but given that the branch is a PR against bip21 that would need to be in a followup PR i suppose

@DanGould
Copy link
Author

DanGould commented Dec 2, 2024

tACK on 1b19687 test fails without change and passes with it

@DanGould DanGould merged commit 76244b1 into payjoin:master Dec 2, 2024
@nothingmuch
Copy link

from @nothingmuch:

heh, my sloppy phrasing in DMs is so incoherent it almost calls into question my approval

tl;dr - slicing a &str using usize range interprets the usize offsets as byte offsets, but if those fall in the middle of a utf8 character this panics with byte index ... is not a char boundary. get uses the same indexing semantics but returns None instead of panicking if the byte offset is not aligned with character boundary

@DanGould
Copy link
Author

DanGould commented Dec 2, 2024

Rats. This introduced an MSRV break because the PR ran before that MSRV check was in CI. Deploying a fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants