url::from_str fails on strings containing non-ascii #8486
Comments
Thank you for opening this, Jack. Assuming my understanding of the abstraction is correct, I'm inclined to make two fixes, one here in Rust and one in Servo. In Rust, from_str() should fail with a better error message if the str is missing a defined scheme. In Servo, the scheme should be autodetected with heuristics matching Firefox's (so file: if substring(0,1) == '/'). An alternative approach would be for from_str() to perform the autodetection; would that be useful or frustrating? Sorry Jack, I appear to be failing at keeping my train of thought in one bug report.
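The heuristic proposed above could be sketched roughly as follows. This is only an illustration: `explicit_scheme` and `detect_scheme` are hypothetical names, not part of any existing Rust or Servo API, and the scheme grammar (a letter followed by letters, digits, `+`, `-` or `.`) is assumed from RFC 3986.

```rust
/// Hypothetical helper: returns the scheme if the input starts with one.
/// Assumes the RFC 3986 grammar: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":".
fn explicit_scheme(input: &str) -> Option<&str> {
    let (candidate, _rest) = input.split_once(':')?;
    let mut chars = candidate.chars();
    let first = chars.next()?;
    if first.is_ascii_alphabetic()
        && chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
    {
        Some(candidate)
    } else {
        None
    }
}

/// Hypothetical helper: use the explicit scheme if there is one, otherwise
/// guess `file` for inputs that begin with '/', as suggested in the comment.
/// Returns None when no guess can be made, so the caller can report a clear error.
fn detect_scheme(input: &str) -> Option<&str> {
    if let Some(scheme) = explicit_scheme(input) {
        return Some(scheme);
    }
    if input.starts_with('/') {
        return Some("file");
    }
    None
}
```

With this sketch, `detect_scheme("/home/例/index.html")` yields `Some("file")`, while a relative path such as `例/index.html` yields `None` and would still need a descriptive error.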
Update on my search: I was wrong about this being a Servo issue. Rust does not like any URL with Unicode. The URL's Unicode must be encoded before hitting DNS, but never for local files. My understanding is thus: Rust should allow Unicode URLs but encode them if the scheme is not file:
I am not sure how to handle Unicode. It appears my general issue is that URLs are ASCII only and it is IRIs (https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) which provide Unicode support. So in theory the URL class should remain ASCII only, and a second IRI class should exist that wraps URL and performs encoding. Yet that might just confuse developers, who will default to using URL. So the perfect solution should provide Unicode support without developers having to pay attention. My current take is to extend the URL class to act as an IRI. I'll read the IRI RFC and, assuming no one has any objections, I'll extend the URL class to be an IRI class, with no name change of course.
The URL Standard describes how to parse a URL containing non-ASCII characters, and what to do for the various parts of the parsed URL (IDNA/Punycode, UTF-8 + percent-encoding, etc.)
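As a rough illustration of the "UTF-8 + percent-encoding" step mentioned above, here is a minimal sketch using only the standard library. The function name `percent_encode_non_ascii` is hypothetical. It encodes only non-ASCII bytes, whereas the URL Standard defines different encode sets per URL component and handles host names separately via IDNA/Punycode, so this is not a conforming implementation.

```rust
/// Hypothetical helper: percent-encode every non-ASCII byte of the UTF-8
/// representation, leaving ASCII bytes untouched.
/// Simplification: the real encode sets differ per URL component.
fn percent_encode_non_ascii(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        if byte.is_ascii() {
            out.push(byte as char);
        } else {
            // Each byte of a multi-byte UTF-8 sequence becomes %XX.
            out.push_str(&format!("%{:02X}", byte));
        }
    }
    out
}
```

For the directory name from the test case, `percent_encode_non_ascii("/tmp/例")` gives `/tmp/%E4%BE%8B`, since 例 (U+4F8B) is the three bytes E4 BE 8B in UTF-8.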
The replacement is [rust-url](https://github.com/servo/rust-url), which can be used with Cargo. Fix rust-lang#15874 Fix rust-lang#10707 Close rust-lang#10706 Close rust-lang#10705 Close rust-lang#8486
Fix `let_underscore_lock` false-positive when binding without locking. Fixes rust-lang#8486. changelog: Fix `let_underscore_lock` false-positive when binding without locking.
Test case:
badpath.rs:
to reproduce:
mkdir 例; cd 例; ../badpath
results:
Err(~"Invalid character in path.")
Originally reported as servo/servo#722