IDNA2008 #223

Closed
bagder opened this issue Jan 30, 2017 · 10 comments

Comments

@bagder

bagder commented Jan 30, 2017

Can we add explicit text saying that the IDNA standard to follow is IDNA2008 (RFC 5895), non-transitional? As the spec is currently phrased, it is quite unclear how to handle international domain names.

The current link goes to the Unicode Consortium, which directs users to RFC 3490 (IDNA 2003), which is wrong for the .de TLD and probably others. (Using IDNA2003 instead of 2008 caused a curl security advisory a while ago.)

I believe at least Firefox already uses IDNA 2008? A fun test domain is "http://straße.de/" which should not show the same as http://strasse.de/ (as it does with IDNA2003).
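
For illustration, the IDNA2003 behavior described above can be reproduced with Python's standard library, whose built-in `idna` codec implements RFC 3490. This is only a sketch of the mapping problem, not something taken from the URL Standard or from curl; the variable name is made up for the example.

```python
# Python's built-in "idna" codec follows IDNA2003 (RFC 3490 plus Nameprep).
idna2003_ascii = "straße.de".encode("idna").decode("ascii")

# Nameprep maps the sharp S to "ss", so after conversion the name can no
# longer be told apart from strasse.de.
print(idna2003_ascii)
```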

@bagder
Author

bagder commented Jan 30, 2017

Maybe that's what #53 already is about?

@annevk
Member

annevk commented Jan 31, 2017

Yeah, switching is part of #110, but I suppose we can use this as a dedicated issue. But we need to reference UTS 46 because otherwise trivial things such as lowercasing "A" would not be defined. The thing that needs changing is the Transitional flag. That needs to become Non_Transitional. Chrome and Edge still pass Transitional however...

@annevk
Member

annevk commented Feb 2, 2017

By the way, I think you're reading UTS 46 wrong. It simply redefines ToUnicode (originally defined in IDNA 2003) as well as ToASCII to accommodate IDNA 2008, which doesn't define such operations. It then defines two modes, transitional and nontransitional, which indeed end up affecting the sharp S (and a few other things). Anyway, I think I'll make a PR to switch to nontransitional and we can use this issue for that.
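
A rough sketch of how the two modes differ for the sharp S may help here. The helper `to_ascii_sketch` and its `transitional` parameter are hypothetical names invented for this example; it covers only lowercasing and the sharp-S case and is in no way a full UTS 46 implementation (no mapping table, no normalization, no validity checks).

```python
def to_ascii_sketch(domain: str, transitional: bool) -> str:
    """Hypothetical helper: shows only the sharp-S difference between modes."""
    labels = []
    for label in domain.lower().split("."):
        if transitional:
            # Transitional processing maps the sharp S to "ss", as IDNA2003 did.
            label = label.replace("ß", "ss")
        if label.isascii():
            labels.append(label)
        else:
            # Nontransitional processing keeps the sharp S, so the label
            # has to be Punycode-encoded and given the "xn--" prefix.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(to_ascii_sketch("straße.de", transitional=True))
print(to_ascii_sketch("straße.de", transitional=False))
```

With the flag set, the name collapses to strasse.de; without it, straße.de stays a distinct name with its own ASCII form.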

@bagder
Author

bagder commented Feb 2, 2017

Maybe I do read it wrong (I'm certainly not an IDN expert), but the mere fact that TWUS (The WHATWG URL Specification) doesn't itself clarify the IDNA2003/8 situation is actually my main complaint in this issue.

@annevk
Member

annevk commented Feb 2, 2017

UTS 46 handles that, no? I don't see why we would duplicate that.

@bagder
Author

bagder commented Feb 2, 2017

Does it? Can you quote or link the parts (in that huge document) that clarify this?

@annevk
Member

annevk commented Feb 2, 2017

It starts right with that in the introduction... http://www.unicode.org/reports/tr46/#Introduction That document is what browsers all implement. There are some subtle differences with regards to flags and which strings end up going through the algorithms defined there, for which we have some tracking bugs and tests in the works, but by and large there is agreement.

@bagder
Author

bagder commented Feb 2, 2017

... and then it goes on at length describing both IDNA 2003 and IDNA 2008.

"The incompatibilities force implementers of client software, such as browsers and emailers, to face difficult choices during the transition period as registries shift from IDNA2003 to IDNA2008."

To me, that's far from a clear statement on what a URL should use in 2017.

@annevk
Member

annevk commented Feb 2, 2017

It explains the IDNA2003/2008 situation. I thought that's what you were looking for?

What you actually need to implement is defined in the Processing section http://www.unicode.org/reports/tr46/#Processing. How exactly that section works depends on which flags are passed, which are the bits the URL Standard defines.

@bagder
Author

bagder commented Feb 2, 2017

I give up.

bagder closed this as completed Feb 2, 2017