IDNA2008 #223
Comments
Maybe that's what #53 already is about? |
Yeah, switching is part of #110, but I suppose we can use this as a dedicated issue. But we need to reference UTS 46 because otherwise trivial things such as lowercasing "A" would not be defined. The thing that needs changing is the Transitional flag. That needs to become Non_Transitional. Chrome and Edge still pass Transitional however... |
By the way, I think you're reading UTS 46 wrong. It simply redefines ToUnicode (originally defined in IDNA 2003) as well as ToASCII to accommodate IDNA 2008, which doesn't define such operations. And then it defines two modes transitional and nontransitional which indeed end up affecting the sharp S (and a few other things). Anyway, I think I'll make a PR to switch to nontransitional and we can use this issue for that. |
Maybe I do read it wrong (I'm certainly not an IDN expert), but the mere fact that TWUS (The WHATWG URL Specification) doesn't itself clarify the IDNA2003/8 situation is actually my main complaint in this issue. |
UTS 46 handles that, no? I don't see why we would duplicate that. |
Does it? Can you quote or link the parts (in that huge document) that clarifies this? |
It starts right with that in the introduction... http://www.unicode.org/reports/tr46/#Introduction That document is what browsers all implement. There are some subtle differences with regards to flags and which strings end up going through the algorithms defined there, for which we have some tracking bugs and tests in the works, but by and large there is agreement. |
... and then it goes on at length describing both IDNA 2003 and IDNA 2008.
To me, that's far from a clear statement on what a URL should use in 2017. |
It explains the IDNA2003/2008 situation. I thought that's what you were looking for? What you actually need to implement is defined in the Processing section http://www.unicode.org/reports/tr46/#Processing. How exactly that section works depends on which flags are passed, which are the bits the URL Standard defines. |
I give up. |
Can we add explicit text saying that the IDNA standard to follow is IDNA2008 (RFC 5895) non-transitional? I think the way the spec is currently phrased makes it quite unclear how to handle internationalized domain names.
The current link goes to the Unicode Consortium, which directs users to RFC 3490 (IDNA 2003), which is wrong for the .de TLD and probably others. (Using IDNA2003 instead of 2008 caused a curl security advisory a while ago.)
I believe at least Firefox already uses IDNA 2008? A fun test domain is "http://straße.de/" which should not show the same as http://strasse.de/ (as it does with IDNA2003).
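To make the straße.de difference concrete, here is a rough Python sketch. It relies on the fact that Python's built-in `idna` codec implements IDNA 2003 (RFC 3490), and compares it with a hand-rolled stand-in for non-transitional processing. The helper names are hypothetical, and the second function is only an approximation (NFC, lowercase, Punycode): it is not a UTS 46 implementation and skips the mapping table and validity checks.

```python
import unicodedata


def to_ascii_idna2003(host: str) -> str:
    """Encode with Python's stdlib 'idna' codec, which follows IDNA 2003."""
    return host.encode("idna").decode("ascii")


def to_ascii_nontransitional_sketch(host: str) -> str:
    """Hypothetical, simplified stand-in for non-transitional processing.

    Assumption: normalising to NFC, lowercasing, and Punycode-encoding each
    non-ASCII label is enough for this one example. Real UTS 46 processing
    also applies a mapping table and validity checks that are omitted here.
    """
    labels = []
    for label in unicodedata.normalize("NFC", host).split("."):
        label = label.lower()  # "ß" is already lowercase and is kept as-is
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(to_ascii_idna2003("straße.de"))                # strasse.de
print(to_ascii_nontransitional_sketch("straße.de"))  # xn--strae-oqa.de
```

Under IDNA 2003 the sharp S is folded to "ss", so both spellings end up at the same host; when "ß" is preserved, the name becomes a distinct A-label.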