-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect XN-Labels case-insensitively #636
Detect XN-Labels case-insensitively #636
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your submission @CBonnell !
In order to avoid the bug mentioned in my previous comment, would you mind adding the following code into v3/util/idna.go
and v3/util/idna_test.go
using them in your changes where appropiate? Or at least something similar?
v3/util.idna.go
/*
* ZLint Copyright 2021 Regents of the University of Michigan
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy
* of the License at http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License.
*/
package util
import (
"regexp"
"golang.org/x/net/idna"
)
var acePrefix = regexp.MustCompile(`^[xX][nN]--.*`)
// IsACEPrefixed checks whether-or-not the given string is prefixed
// with the case-insensitive string "xn--".
//
// This check is useful given the bug following bug report for IDNA wherein
// the ACE prefix incorrectly taken to be case-sensitive.
//
// https://github.com/golang/go/issues/48778
func IsACEPrefixed(s string) bool {
return acePrefix.MatchString(s)
}
// IdnaToUnicode is a wrapper around idna.ToUnicode.
//
// If the provided string is ACE prefixed, then that ACE prefix
// is coerced to a lowercase "xn--" before processing by
// then idna package.
//
// This is only necessary due to the bug at https://github.com/golang/go/issues/48778
func IdnaToUnicode(s string) (string, error) {
if IsACEPrefixed(s) {
s = "xn--" + s[4:]
}
return idna.ToUnicode(s)
}
v3/util/idna_test.go
/*
* ZLint Copyright 2021 Regents of the University of Michigan
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy
* of the License at http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License.
*/
package util
import (
"testing"
"golang.org/x/net/idna"
)
func TestIsACEPrefixed(t *testing.T) {
input := map[string]bool{
"xn--zlint.org": true,
"Xn--zlint.org": true,
"xN--zlint.org": true,
"XN--zlint.org": true,
"xn--": true,
"Xn--": true,
"xN--": true,
"XN--": true,
"-xn--": false,
"-Xn--": false,
"-xN--": false,
"-XN--": false,
"": false,
}
for input, want := range input {
got := IsACEPrefixed(input)
if got != want {
t.Errorf("got %v want %v for input '%s'", got, want, input)
}
}
}
func TestIdnaToUnicode(t *testing.T) {
type testData struct {
input string
want string
wantErr bool
}
input := []testData{
{"xn--Mnchen-Ost-9db", "München-Ost", false},
{"xn--Mnchen-ost-9db", "München-ost", false},
{"xn--", "", false},
{"xN--12311613412431243.com", "xn--12311613412431243.com", true},
{"Xn--Mnchen-Ost-9db", "München-Ost", false},
{"Xn--Mnchen-ost-9db", "München-ost", false},
{"Xn--", "", false},
{"xN--12311613412431243.com", "xn--12311613412431243.com", true},
{"xN--Mnchen-Ost-9db", "München-Ost", false},
{"xN--Mnchen-ost-9db", "München-ost", false},
{"xN--", "", false},
{"xN--12311613412431243.com", "xn--12311613412431243.com", true},
{"XN--Mnchen-Ost-9db", "München-Ost", false},
{"XN--Mnchen-ost-9db", "München-ost", false},
{"XN--", "", false},
{"xN--12311613412431243.com", "xn--12311613412431243.com", true},
}
for _, data := range input {
got, err := IdnaToUnicode(data.input)
gotErr := err != nil
if gotErr != data.wantErr || data.want != got {
t.Errorf("got string '%s' error '%v' for test data %v", got, err, data)
}
}
}
// This test checks whether or not https://github.com/golang/go/issues/48778 ever got fixed.
// If it did then we can likely delete some code from utils since we don't have to handle it
// with kid gloves anymore.
func TestIdnaToUnicodeBugIsStillThere(t *testing.T) {
s, _ := idna.ToUnicode("Xn--Mnchen-Ost-9db")
if s == "München-Ost" {
t.Fatal("https://github.com/golang/go/issues/48778 appears to have been fixed")
}
}
This should result in...
v3/lints/rfc/lint_idn_dnsname_malformed_unicode.go
func (l *IDNMalformedUnicode) Execute(c *x509.Certificate) *lint.LintResult {
for _, dns := range c.DNSNames {
labels := strings.Split(dns, ".")
for _, label := range labels {
if util.IsACEPrefixed(label) {
_, err := util.IdnaToUnicode(label)
if err != nil {
return &lint.LintResult{Status: lint.Error}
}
}
}
}
return &lint.LintResult{Status: lint.Pass}
}
...and...
v3/lints/rfc/lint_idn_dnsname_must_be_nfc.go
func (l *IDNNotNFC) Execute(c *x509.Certificate) *lint.LintResult {
for _, dns := range c.DNSNames {
labels := strings.Split(dns, ".")
for _, label := range labels {
if util.IsACEPrefixed(label) {
unicodeLabel, err := util.IdnaToUnicode(label)
if err != nil {
return &lint.LintResult{Status: lint.NA}
}
if !norm.NFC.IsNormalString(unicodeLabel) {
return &lint.LintResult{Status: lint.Error}
}
}
}
}
return &lint.LintResult{Status: lint.Pass}
}
if strings.HasPrefix(labelLower, "xn--") { | ||
// We need to use the lowercase label due to a bug in idna | ||
// See: https://github.com/golang/go/issues/48778 | ||
unicodeLabel, err := idna.ToUnicode(labelLower) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Punycode is relatively complex and I do believe that coercing the entire label into lowercase does have implications on the output string. Take the following (playground link) which borrows from Wikipedia's Punycode::Examples table
package main
import (
"golang.org/x/net/idna"
"fmt"
)
func main() {
upper, _ := idna.ToUnicode("xn--Mnchen-Ost-9db")
lower, _ := idna.ToUnicode("xn--Mnchen-ost-9db")
fmt.Println(lower == upper)
fmt.Println(upper)
fmt.Println(lower)
}
Here we have a label which has an ASCII literal embedded in it (the O
and o
) which ends up coming out of the other end raw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @christopher-henderson. Your suggestion of creating reusable IDNA functions is definitely cleaner than my original proposal. I went ahead and incorporated your suggested implementation with some small changes.
In regard to case-sensitivity, the two lints that currently use ToUnicode won't have different results if the label is downcased (https://datatracker.ietf.org/doc/html/rfc3492#appendix-A), but the reusable functions definitely should preserve casing as you pointed out as returning a downcased U-label would be surprising API behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That appendix, however, is not mandatory.
Note, however, that mixed-case annotation is not used by the ToASCII and ToUnicode operations specified in [IDNA], and therefore implementors of IDNA can disregard this appendix.
And besides, the link to the Go playground code snippet clearly shows that this particular implementation does in fact get different answers in some cases depending on the casing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christopher-henderson Regarding
Punycode is relatively complex and I do believe that coercing the entire label into lowercase does have implications on the output string.
and
That appendix, however, is not mandatory.
The point of that is that the mixed case preservation is the optional aspect, and purely a presentational decision. The appendix describes how one can preserve cases in the encoded string, in order to have it reflected in the display, but the underlying functions are insensitive.
With respect to host names (i.e. those for A and AAAA), we unambiguously know that they are case insensitive.
I may be misreading the comments here, but it should be unambiguous that
Punycode encoders and decoders need not support these annotations, and higher layers need not use them.
…/zlint into bugfix/case-insensitive-xnlabel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me now, thank you!
Adding some more people to the review to get a second pair of eyes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as well. My caveat about just lowercasing being perfectly acceptable, but just because it’s permissible doesn’t mean we have to do it :)
lint_idn_dnsname_malformed_unicode
andlint_idn_dnsname_must_be_nfc
detect XN-labels using a case-sensitive comparison of the IDNA ACE prefix ("xn--"). As a result, some "fake A labels" or labels that whose Unicode representation is not NFC-normalized may not be correctly detected.