Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URI.get_username() and URI.get_password() replace literal plus signs with spaces #245

Closed
AdrianoKF opened this issue May 28, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@AdrianoKF
Copy link

AdrianoKF commented May 28, 2020

Description

RFC 3986 specifies that the userinfo subcomponent of a URI may contain the sub-delim tokens, so it is okay to include a literal plus sign (not percent-encoded) there, I believe (if not, please feel free to close this issue!).

model.url.URI seems to assume that these fields are always percent-encoded, leading to the replacement of + with a space in URI.get_username() and URI.get_password() (which are called internally by URI.to_string().

This is the root cause of pypa/pipenv#4271, which relies on this method for handling Git VCS URLs.

Expected behaviour

Parsing and formatting a URI containing a userinfo component where either the username or password contain a literal plus character (+) should not alter the content of these fields.

Suggested fix

If my understanding of the RFC as mentioned above is correct, URI._parse_auth should call quote_plus with the safe arg set to the sub-delim tokens from RFC 3986, preventing them from being escaped in the auth fields.

Steps to reproduce

from requirementslib.models.url import URI

u1 = URI.parse('git+https://foo+bar:foobar@example.org')
u2 = URI.parse('git+https://foo:bar+foobar@example.org')
print(f'{u1}, username={u1.username}, get_username()={u1.get_username()}, password={u1.password}, get_password()={u1.get_password()}, url={str(u1)}')
print(f'{u2}, username={u2.username}, get_username()={u2.get_username()}, password={u2.password}, get_password()={u2.get_password()}, url={str(u2)}')

yields

git+https://foo bar:----@example.org, username=foo+bar, get_username()=foo+bar, password=foobar, get_password()=foobar, url=git+https://foo bar:----@example.org
git+https://foo:----@example.org, username=foo, get_username()=foo, password=bar+foobar, get_password()=bar+foobar, url=git+https://foo:----@example.org

Note that the username and password fields are parsed correctly from the input but accessing them via get_username()/get_password() or formatting the URI as a string leads to the inadvertent replacement.

@AdrianoKF AdrianoKF changed the title URI.get_username() and URI.get_password() fail for credentials containing literal plus sign URI.get_username() and URI.get_password() replace literal plus signs with spaces May 28, 2020
@techalchemy
Copy link
Member

Thanks for digging into this, and apologies for the issue here.

@techalchemy techalchemy added the bug Something isn't working label May 28, 2020
@frostming
Copy link
Member

Closed by #244

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants