diff --git a/url.bs b/url.bs
index fb75bb66..24b45782 100644
--- a/url.bs
+++ b/url.bs
@@ -88,7 +88,7 @@ number.

A validation error indicates a mismatch between input and valid input. User agents, especially conformance checkers, are encouraged to report them somewhere. -

+

A validation error does not mean that the parser terminates. Termination of a parser is always stated explicitly, e.g., through a return statement. @@ -278,7 +278,7 @@ optional boolean spaceAsPlus (default false), run these steps: shortest sequence of ASCII digits representing potentialError in base ten, followed by "%3B", to output. -

This can happen when encoding is not UTF-8. +

This can happen when encoding is not UTF-8.

  • Return output. @@ -699,8 +699,8 @@ inclusive, separated from each other by U+002E (.).

  • U+005B ([), followed by a valid IPv6-address string, followed by U+005D (]). -

    This is not part of the definition of valid host string as it -requires context to be distinguished. +

    This is not part of the definition of valid host string as it requires context +to be distinguished.

    Host parsing

    @@ -729,9 +729,8 @@ runs these steps:

    Let domain be the result of running UTF-8 decode without BOM on the percent-decoding of input. -

    Alternatively UTF-8 decode without BOM or fail can be used, - coupled with an early return for failure, as domain to ASCII fails on - U+FFFD REPLACEMENT CHARACTER. +

    Alternatively UTF-8 decode without BOM or fail can be used, coupled with an + early return for failure, as domain to ASCII fails on U+FFFD (�).

  • Let asciiDomain be the result of running domain to ASCII on domain. @@ -790,9 +789,9 @@ these steps:

  • Let validationError be false. -

    This uses validationError to track validation errors - to avoid reporting them before we are confident we want to parse input as an IPv4 - address as the host parser almost always invokes the IPv4 parser. +

    This uses validationError to track validation errors to avoid + reporting them before we are confident we want to parse input as an IPv4 address as the + host parser almost always invokes the IPv4 parser.

  • Let parts be the result of strictly splitting input on U+002E (.). @@ -1403,10 +1402,9 @@ an ASCII string that can be used for further processing on the resource t blob URL entry that is either null or a blob URL entry. It is initially null. -

    This is used to support caching the object a "blob" URL -refers to as well as its origin. It is important that these are cached as the URL might -be removed from the blob URL store between parsing and fetching, while fetching will still -need to succeed. +

    This is used to support caching the object a "blob" URL refers to as well +as its origin. It is important that these are cached as the URL might be removed from +the blob URL store between parsing and fetching, while fetching will still need to succeed.

    The following table lists how valid URL strings, when parsed, map @@ -1495,8 +1493,8 @@ not a special scheme.

    A URL can be designated as base URL. -

    A base URL is useful for the URL parser when the -input might be a relative-URL string. +

    A base URL is useful for the URL parser when the input might be a +relative-URL string.


    @@ -1611,8 +1609,8 @@ switching on base URL's scheme:

    any optionally followed by U+003F (?) and a URL-query string. -

    A non-null base URL is necessary when -parsing a relative-URL string. +

    A non-null base URL is necessary when parsing a +relative-URL string.
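    As a rough illustration of why the base URL matters, the sketch below resolves a relative-URL string through the URL constructor of the standard's API. The example.com URLs and the variable names are assumptions made up for this example; they do not come from the diff.

```javascript
// Hypothetical base URL and input, chosen only for illustration.
const base = "https://example.com/a/c";

// A relative-URL string resolves against the base URL.
const resolved = new URL("../b?x=1", base);
console.log(resolved.href); // "https://example.com/b?x=1"

// Without a base URL, the same input fails to parse.
let failedWithoutBase = false;
try {
  new URL("../b?x=1");
} catch (e) {
  failedWithoutBase = e instanceof TypeError;
}
console.log(failedWithoutBase); // true
```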

    A scheme-relative-special-URL string must be "//", followed by a valid host string, optionally followed by U+003A (:) and a URL-port string, optionally @@ -1736,8 +1734,8 @@ different document encoding. Using the UTF-8 encoding everywhere solves t


    -

    There is no way to express a username or -password of a URL record within a valid URL string. +

    There is no way to express a username or password of a +URL record within a valid URL string.
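    The note above concerns valid URL strings only. As a minimal sketch, the parser still accepts input that carries credentials, and the API can set them on the record. The host, user name, and password below are hypothetical.

```javascript
// Hypothetical input: not a valid URL string, yet the parser accepts it.
const withCredentials = new URL("https://user:secret@example.com/");
console.log(withCredentials.username); // "user"
console.log(withCredentials.password); // "secret"

// Credentials can also be set on an existing URL through the API.
const viaSetter = new URL("https://example.com/");
viaSetter.username = "alice";
console.log(viaSetter.href); // "https://alice@example.com/"
```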

    URL parsing

    @@ -1747,10 +1745,9 @@ different document encoding. Using the UTF-8 encoding everywhere solves t optional encoding encoding (default UTF-8), and then runs these steps: -

    Non-web-browser implementations only need to implement the -basic URL parser. +

    Non-web-browser implementations only need to implement the basic URL parser. -

    How user input in the web browser's address bar is converted to a +

    How user input in the web browser's address bar is converted to a URL record is out-of-scope of this standard. This standard does include URL rendering requirements as they pertain to trust decisions. @@ -1779,7 +1776,7 @@ optional encoding encoding (default UTF-8), an op state override state override, and then runs these steps: -

    +

    The encoding argument is a legacy concept only relevant for HTML. The url and state override arguments are only for use by various APIs. [[!HTML]] @@ -2457,8 +2454,8 @@ these steps: U+005C (\), append the empty string to url's path. -

    This means that for input /usr/.. the result is - / and not a lack of a path. +

    This means that for input /usr/.. the result is / + and not a lack of a path.

  • Otherwise, if buffer is a single-dot path segment and if neither @@ -2827,8 +2824,8 @@ handled with care to prevent spoofing:

  • Browsers should render a URL's host using domain to Unicode. -

    Note that various characters can be used in homograph spoofing attacks. - Consider detecting confusable characters and warning when they are in use. [[IDNFAQ]] [[UTS39]] +

    Various characters can be used in homograph spoofing attacks. Consider detecting + confusable characters and warning when they are in use. [[IDNFAQ]] [[UTS39]]

  • URLs are particularly prone to confusion between host and path when they contain bidirectional text, so in this case it is particularly advisable to only render a URL's @@ -2841,10 +2838,10 @@ handled with care to prevent spoofing:

  • Browsers should render bidirectional text as if it were in a left-to-right embedding. [[!BIDI]] -

    Unfortunately, as rendered URLs are strings and can appear - anywhere, a specific bidirectional algorithm for rendered URLs would not see wide - adoption. Bidirectional text interacts with the parts of a URL in ways that can cause - the rendering to be different from the model. Users of bidirectional languages can come to expect +

    Unfortunately, as rendered URLs are strings and can appear anywhere, a + specific bidirectional algorithm for rendered URLs would not see wide adoption. + Bidirectional text interacts with the parts of a URL in ways that can cause the + rendering to be different from the model. Users of bidirectional languages can come to expect this, particularly in plain text environments. @@ -2855,21 +2852,19 @@ handled with care to prevent spoofing:

    The application/x-www-form-urlencoded format provides a way to encode name-value pairs. -

    The application/x-www-form-urlencoded format is in many ways -an aberrant monstrosity, the result of many years of implementation accidents and compromises -leading to a set of requirements necessary for interoperability, but in no way representing good -design practices. In particular, readers are cautioned to pay close attention to the twisted details -involving repeated (and in some cases nested) conversions between character encodings and byte -sequences. Unfortunately the format is in widespread use due to the prevalence of HTML forms. -[[HTML]] +

    The application/x-www-form-urlencoded format is in many ways an aberrant +monstrosity, the result of many years of implementation accidents and compromises leading to a set +of requirements necessary for interoperability, but in no way representing good design practices. In +particular, readers are cautioned to pay close attention to the twisted details involving repeated +(and in some cases nested) conversions between character encodings and byte sequences. Unfortunately +the format is in widespread use due to the prevalence of HTML forms. [[HTML]]
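    To make the encoding of name-value pairs concrete, here is a small sketch using URLSearchParams, which serializes and parses this format with UTF-8. The pair names and values are invented for illustration.

```javascript
// Hypothetical name-value pairs; "\u00FC" is the character "ü".
const pairs = new URLSearchParams([["name", "a b"], ["k", "\u00FC"]]);

// Serializing turns the space into "+" and percent-encodes the UTF-8 bytes.
const serialized = pairs.toString();
console.log(serialized); // "name=a+b&k=%C3%BC"

// Parsing reverses both conversions.
const parsed = new URLSearchParams(serialized);
console.log(parsed.get("name")); // "a b"
console.log(parsed.get("k") === "\u00FC"); // true
```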

    application/x-www-form-urlencoded parsing

    -

    A legacy server-oriented implementation might have to support -encodings other than UTF-8 as well as have special logic for tuples of which the -name is `_charset`. Such logic is not described here as only UTF-8 is -conforming. +

    A legacy server-oriented implementation might have to support encodings +other than UTF-8 as well as have special logic for tuples of which the name is +`_charset`. Such logic is not described here as only UTF-8 is conforming.

    The application/x-www-form-urlencoded parser @@ -3035,7 +3030,7 @@ constructor steps are: this. -

    +

    To parse a string into a URL without using a base URL, invoke the {{URL}} constructor with a single argument: @@ -3154,10 +3149,10 @@ url.pathname // "/%F0%9F%8F%B3%EF%B8%8F%E2%80%8D%F0%9F%8C%88" state override. -

    If the given value for the host -setter lacks a port, this's URL's -port will not change. This can be unexpected as host getter -does return a URL-port string so one might have assumed the setter to always "reset" both. +

    If the given value for the host setter lacks a +port, this's URL's port will not +change. This can be unexpected as host getter does return a URL-port string so +one might have assumed the setter to always "reset" both.
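    The behavior described in this note can be sketched as follows. The hosts and port numbers are hypothetical values picked for the example.

```javascript
// Hypothetical URL with an explicit port.
const target = new URL("https://example.com:8080/path");

// Setting host without a port leaves the existing port in place.
target.host = "example.org";
const portAfterHostOnly = target.port;
const hrefAfterHostOnly = target.href;
console.log(portAfterHostOnly); // "8080"
console.log(hrefAfterHostOnly); // "https://example.org:8080/path"

// Setting host with a port updates both host and port.
target.host = "example.net:9090";
console.log(target.href); // "https://example.net:9090/path"
```

    Code that wants the port cleared as well can assign the empty string to the port setter after assigning host.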

    The hostname getter steps are: @@ -3503,9 +3498,9 @@ examples of proper naming. [[!HTML]]

    Acknowledgments

    -

    There have been a lot of people that have helped make URLs -more interoperable over the years and thereby furthered the goals of this standard. Likewise many -people have helped making this standard what it is today. +

    There have been a lot of people that have helped make URLs more interoperable over +the years and thereby furthered the goals of this standard. Likewise many people have helped make +this standard what it is today.

    With that, many thanks to 100の人,