Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-126601: pathname2url(): handle NTFS alternate data streams #126760

Merged
merged 8 commits into from
Nov 22, 2024

Conversation

barneygale
Copy link
Contributor

@barneygale barneygale commented Nov 12, 2024

Adjust pathname2url() to encode embedded colon characters in Windows paths, rather than bailing out with an OSError.

Adjust `pathname2url()` to encode embedded colon characters in Windows
paths, rather than bailing out with an `OSError`.
@barneygale barneygale marked this pull request as ready for review November 12, 2024 22:06
@barneygale barneygale requested a review from a team November 15, 2024 22:15
@barneygale
Copy link
Contributor Author

barneygale commented Nov 19, 2024

Hint for reviewers: if it helps to grok the changes, have a look at the initial commit in this MR (0ed23a5), which is a smaller patch.

Nevermind - I found a nice middle-ground I think!

Lib/nturl2path.py Outdated Show resolved Hide resolved
@zooba
Copy link
Member

zooba commented Nov 21, 2024

LGTM with one trivial change (or a reason we can't make it).

We should probably also .. versionchanged any functions that may be affected by this.

@barneygale
Copy link
Contributor Author

Thanks Steve!

We should probably also .. versionchanged any functions that may be affected by this.

I'm intending to backport this to 3.13 and 3.12 because it seems like a straightforward bugfix to me. So I'm not sure if .. versionchanged would be appropropriate. What say you?

@zooba
Copy link
Member

zooba commented Nov 21, 2024

I haven't tracked down what functions are impacted, but my assumption would be that existing users may notice behaviour changes where the path is an arbitrary input.

I'm most concerned about opening up a potential exploit, especially since alternate streams are not commonly used (and would rarely traverse a URL). Perhaps there's a scenario that's worth the risk to have the change, but I'm not aware of it, so I'm inclined to play it safe on a strict correctness issue that isn't actively hurting anyone.

@barneygale barneygale removed needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Nov 21, 2024
barneygale and others added 2 commits November 21, 2024 18:57
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
@barneygale
Copy link
Contributor Author

Thanks again. I've added a .. versionchanged:: bit to the docs.

pathname2url() isn't used by any other standard library code, so we only need one change to the docs I think.

@zooba
Copy link
Member

zooba commented Nov 21, 2024

So the only question now is whether to backport, and all I can say is "I'm not sure" (and so default to saying no). Users running into this issue in real code would be pretty persuasive.

@barneygale
Copy link
Contributor Author

I'm erring towards not backporting, on reflection. I don't think the current behaviour is great, but it's better to get an exception than an incorrect result. I only logged this issue because I spotted it in the code (and I've fixed similar issues in pathlib reported by Eryk Sun) - I'm not aware of any user running into it.

@barneygale barneygale merged commit fd133d4 into python:main Nov 22, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants