From 8dc2610b2b72d07c8b0e808df9cd6baa7376d70d Mon Sep 17 00:00:00 2001
From: Timothy Gu
Date: Wed, 12 May 2021 18:20:47 -0700
Subject: [PATCH] Editorial: Add note about when ToASCII = ASCII lowercase

Many implementations currently skip ToASCII if domain is ASCII-only, but
as discovered in [1] and [2], this can result in some undesirable
behavior. Adding a note prevents implementors from making the mistake of
thinking ToASCII is a no-op if the input is ASCII, and also provides a
recommendation on how to properly optimize the ToASCII step.

[1]: https://github.com/whatwg/url/issues/267
[2]: https://github.com/whatwg/url/pull/309#issuecomment-301405332
---
 url.bs | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/url.bs b/url.bs
index cfed93e2..962bb487 100644
--- a/url.bs
+++ b/url.bs
@@ -528,11 +528,17 @@ decisions made, i.e. whether to use the same site or schemelessly same domain
 and an optional boolean beStrict (default false), runs these steps:
-  1. Let result be the result of running Unicode ToASCII
-     with domain_name set to domain, UseSTD3ASCIIRules set to
-     beStrict, CheckHyphens set to false, CheckBidi set to true,
-     CheckJoiners set to true, Transitional_Processing set to false,
-     and VerifyDnsLength set to beStrict.
+  1. Let result be the result of running Unicode ToASCII
+     with domain_name set to domain, UseSTD3ASCIIRules set to
+     beStrict, CheckHyphens set to false, CheckBidi set to true,
+     CheckJoiners set to true, Transitional_Processing set to false,
+     and VerifyDnsLength set to beStrict.
+
+     If beStrict is false, domain is an ASCII string, and
+     strictly splitting domain on U+002E (.) does not produce any
+     item that starts with "xn--", this step is
+     equivalent to ASCII lowercasing domain.

   2. If result is a failure value, validation error, return failure.
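
The fast path described by the added note can be sketched roughly as follows. This is a minimal illustration, not text from the patch or from any implementation: the function names `can_skip_to_ascii` and `domain_to_ascii_fast_path` are hypothetical, and returning `None` stands in for "fall back to the full Unicode ToASCII operation". The sketch also compares the "xn--" prefix case-insensitively, which is a conservative assumption beyond the literal wording of the note.

```python
from typing import Optional


def can_skip_to_ascii(domain: str, be_strict: bool = False) -> bool:
    """Return True when all three conditions from the note hold."""
    if be_strict:
        # The note only applies when beStrict is false.
        return False
    if not domain.isascii():
        # Non-ASCII input always needs the full ToASCII operation.
        return False
    # Strict split on U+002E (.): empty items are kept and no other
    # separator is recognised. The prefix check is case-insensitive here
    # (an assumption), so "XN--" labels also take the slow path.
    return not any(
        label.lower().startswith("xn--") for label in domain.split(".")
    )


def domain_to_ascii_fast_path(domain: str, be_strict: bool = False) -> Optional[str]:
    """Return the ASCII-lowercased domain if the fast path applies.

    Returns None when the caller must run the full ToASCII operation.
    """
    if can_skip_to_ascii(domain, be_strict):
        # The input is known to be ASCII, so str.lower() only touches A-Z.
        return domain.lower()
    return None
```

For example, `domain_to_ascii_fast_path("EXAMPLE.com")` yields `"example.com"`, while `"xn--bcher-kva.example"` and any non-ASCII input yield `None` and would go through ToASCII as before. Erring toward the slow path is always safe, because the full operation produces the correct result in every case.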