IDNA #53
I’ve just submitted the following to http://www.unicode.org/reporting.html. I’ll update this when I get a response.
|
My report to Unicode from some time ago which seems to not be fixed yet:
The other thing is that there's only a single IdnaTest file, but there's no explanation of which algorithm it applies to. Is it for IDNA2008, IDNA2003 or UTS46? It seems to be categorized according to Unicode standard instead of IDNA reference, which makes this really confusing. Haven't reported that one yet though. |
@Sebmaster regarding the other thing, http://www.unicode.org/reports/tr46/#Conformance_Testing explains how "To test for conformance to UTS46" using |
@SimonSapin I'm not sure that's totally correct either since:
is not described in TR46 at all. It's imported from the IDNA2008 standard, which has no relevance in the TR46 spec... I think 😕 |
Got a mail today from Unicode (regarding conformance test description):
So that's pretty sweet. |
Oh yeah, I came back into this and recall that the IdnaTest.txt is really bad at telling you how to process it. @Sebmaster: |
I got a response to #53 (comment):
|
… and today:
|
As per servo/rust-url#160 I submitted feedback regarding Validation rule no. 2 - "2. The label must not contain a U+002D HYPHEN-MINUS character in both the third and fourth positions." |
@valenting Your feedback is tracked as part of PRI317 http://www.unicode.org/review/pri317/ (being discussed now). By the way @SimonSapin I'd think the right way to track is via UTC agenda items http://www.unicode.org/L2/L-curdoc.htm |
It seems like Unicode has closed that ticket without removing the -- validity requirement 😞 Does anybody have the ability to look into the Unicode ... process to see what's going on there? |
|
Going forward, rather than tracking all UTS 46 feedback here, I suggest we just create new issues against this repository, so we can discuss each problem in isolation. I created an idna label that we can use to group them all. |
As an update to the original issue, it seems the proposed changes to UTS#46 have been incorporated into its latest draft: http://www.unicode.org/reports/tr46/proposed.html. Since traditionally UTS#46 updates are synced with Unicode Standard updates, a new version of UTS#46 with the CheckHyphens hook should be published next month, when Unicode 10.0.0 is scheduled to be released as well. |
I think we should go for a quick-fix first so that people who are trying to use spec-compliant libraries like Node.js's URL and jsdom/whatwg-url don't continue to suffer. We can use #267 to figure out a longer-term browser-compatible plan. |
|
Sure, I meant we should go for a quick-fix for the hyphens. Examples of suffering:
|
As reported at #53 (comment) this is causing issues in non-browser implementations.
Tests for whatwg/url#53 and friends.
FWIW, I changed my mind after seeing #309 (comment). I think what UTS46 revision 18 defines is reasonable and that's what we should go with. |
@Sebmaster did your testing issue ever get addressed? If not, could you file a new issue on that? I'm happy to help investigate that as I've made some attempts myself as well now. |
Fixes #53 and fixes #267 by no longer breaking on hyphens in the 3rd and 4th position of a domain label. Rejecting such labels is known to break YouTube: r3---sn-2gb7ln7k.googlevideo.com. This is done by setting the proposed CheckHyphens flag to false. Fixes #110 by clarifying that BIDI and CONTEXTJ checks are to be done by setting the proposed CheckBidi and CheckJoiners flags to true. Follow-up #313 is filed to remove the proposed bits once Unicode is updated.
Tests: web-platform-tests/wpt#5976.
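For anyone following along, here is a rough Python sketch of what the CheckHyphens flag controls, as I read the UTS46 validity criteria. The function names `label_fails_hyphen_checks` and `failing_labels` are made up for illustration and are not part of any spec or library. This covers only the hyphen rules, not the rest of UTS46 processing.

```python
def label_fails_hyphen_checks(label: str, check_hyphens: bool) -> bool:
    """Return True if the label violates the hyphen-related validity rules.

    Hypothetical helper: models only the hyphen criteria gated by CheckHyphens.
    """
    if not check_hyphens:
        # With CheckHyphens set to false, the hyphen rules are skipped entirely.
        return False
    # Rule: no U+002D in BOTH the third and fourth positions (indices 2 and 3).
    hyphens_in_3rd_and_4th = label[2:4] == "--"
    # Rule: the label must neither begin nor end with U+002D.
    hyphen_at_edge = label.startswith("-") or label.endswith("-")
    return hyphens_in_3rd_and_4th or hyphen_at_edge


def failing_labels(host: str, check_hyphens: bool) -> list[str]:
    """List the labels of a dot-separated host that fail the hyphen checks."""
    return [
        label
        for label in host.split(".")
        if label_fails_hyphen_checks(label, check_hyphens)
    ]


host = "r3---sn-2gb7ln7k.googlevideo.com"
print(failing_labels(host, check_hyphens=True))   # ['r3---sn-2gb7ln7k']
print(failing_labels(host, check_hyphens=False))  # []
```

With the flag on, the first label of the YouTube host above is rejected because its third and fourth characters are both hyphens. With the flag off, the host passes the hyphen checks.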
It's not addressed yet, but the latest draft contains a TODO for it. |
Tests for whatwg/url#53 and friends, as fixed by whatwg/html#2627.
I reported the following editorial issue:
|
@Sebmaster it seems http://www.unicode.org/reports/tr46/#Conformance_Testing was updated. @domenic I asked for that change too. Note that if we want to track it here it should become its own issue. This is no longer a meta issue for all things IDNA as it got too unwieldy. |
No need to really track it. I do think though having a public archive of feedback we've submitted is good, and my bad for my part in derailing the thread away from that. Maybe www-archive is OK? |
Yeah or just a new issue for each piece of feedback. I don't think that would get crowded and if it does we can figure out a better approach. |
This issue tracks faults in http://www.unicode.org/reports/tr46/ since Unicode doesn't really do that well. If you find an issue, use http://www.unicode.org/reporting.html to report it and then report back here.