-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
init commit to make idna optional #728
Conversation
To show the differences, run cargo build --release --no-default-features --benches
cargo build --release --benches
# then
ls -al ../target/release/deps/parse_url-* |
@valenting take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the feature should be documented, akin to how regex
does it.
Especially important to state the (heavy) disadvantages of opting out!
@madsmtm I've add a small note on the feature. Not sure how to especially state the disadvantage. If I disable the idna, then all idna url won't parse? Maybe add some examples? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that knowledged in the actual workings of IDNA, so these are just vague suggestions.
I think it's fine that clients whom know the domains they're going to be accessing, disable the feature. E.g. a resource-constrained device that makes HTTP requests to a server under the developer's control.
But I worry that e.g. some servers will disable the idna
feature without really thinking further, and then their domain matching code will fail in weird (possibly security-relevant) ways.
url/src/host.rs
Outdated
let domain = idna::domain_to_ascii(&domain)?; | ||
#[cfg(not(feature = "idna"))] | ||
let domain = domain.to_string(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should panic if domain
is not ASCII? I think the rest of the code expects Host::Domain
to hold the normalized domain name, so this way we don't introduce weird incompatibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not panic, but return an error?
Not only if the domain is not ASCII, but also if the domain contains any xn--
labels - which we can't know if they are correct or no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue panicking is correct since it's an "unrecoverable programmer error"; disabling the idna
feature should only be done when the programmer is absolutely sure the program won't need to handle IDNs; if that assumption turns out to be false, there is no way to handle this gracefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I happen to be a backend engineer also, panic is the last thing I expect a base lib to do, especially when it may handle external input. I disable the idna doesn't mean I want my service panic for idna input...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
panic should used when the program enter some bad state and not easy to recover. For this case, it has no problem to recover.
url/src/origin.rs
Outdated
#[cfg(feature = "idna")] | ||
let (domain, _errors) = idna::domain_to_unicode(domain); | ||
#[cfg(not(feature = "idna"))] | ||
let domain = domain.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like with the above comment, this should maybe panic if domain
would change after idna::domain_to_unicode
(if we can determine that easily?); otherwise, this function would return two different things based on whether the feature is enabled or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If idna feature disabled, then the whole fn unicode_serialization can be opt out? I searched spec about unicode_serialization, it is strictly related with idna.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://html.spec.whatwg.org/multipage/origin.html#unicode-serialisation-of-an-origin now the spec even deletes the section of unicode serialization, just saying "There used to also be a Unicode serialization of an origin. However, it was never widely adopted."
…icode_serialziation for Origin if idna not enabled
@madsmtm @valenting ping~ |
@valenting ping |
Some Lint reports error on some changes not touched by this pr. Do I need to make all green? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good otherwise. I'll deal with the other lint failures in a separate PR.
@valenting updated~ |
fix #727