From 84e2ff373841175018136e7e48023fc7fa369c45 Mon Sep 17 00:00:00 2001
From: James M Snell
Date: Wed, 4 Jan 2017 15:41:09 -0800
Subject: [PATCH] doc: add basic documentation for WHATWG URL API

PR-URL: https://github.com/nodejs/node/pull/10620
Reviewed-By: Sam Roberts
Reviewed-By: Joyee Cheung
Reviewed-By: Michal Zasso
Reviewed-By: Timothy Gu
---
 doc/api/url.md | 458 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 458 insertions(+)
 mode change 100644 => 100755 doc/api/url.md

diff --git a/doc/api/url.md b/doc/api/url.md
old mode 100644
new mode 100755
index 40a3440195e69a..50e7433e6110e5
--- a/doc/api/url.md
+++ b/doc/api/url.md
@@ -250,7 +250,465 @@ properties of URL objects:
 For example, the ASCII space character (`' '`) is encoded as `%20`. The ASCII
 forward slash (`/`) character is encoded as `%3C`.
 
+## The WHATWG URL API
+
+> Stability: 1 - Experimental
+
+The `url` module provides an *experimental* implementation of the
+[WHATWG URL Standard][] as an alternative to the existing `url.parse()` API.
+
+```js
+const URL = require('url').URL;
+const myURL = new URL('https://example.org/foo');
+
+console.log(myURL.href);     // https://example.org/foo
+console.log(myURL.protocol); // https:
+console.log(myURL.hostname); // example.org
+console.log(myURL.pathname); // /foo
+```
+
+*Note*: Using the `delete` keyword (e.g. `delete myURL.protocol`,
+`delete myURL.pathname`, etc.) has no effect but will still return `true`.
+
+### Class: URL
+#### Constructor: new URL(input[, base])
+
+* `input` {String} The input URL to parse
+* `base` {String | URL} The base URL to resolve against if the `input` is not
+  absolute.
+
+Creates a new `URL` object by parsing the `input` relative to the `base`. If
+`base` is passed as a string, it will be parsed as if by `new URL(base)`.
+
+```js
+const myURL = new URL('/foo', 'https://example.org/');
+  // https://example.org/foo
+```
+
+A `TypeError` will be thrown if the `input` or `base` are not valid URLs.
+Note that an effort will be made to coerce the given values into strings. For
+instance:
+
+```js
+const myURL = new URL({toString: () => 'https://example.org/'});
+  // https://example.org/
+```
+
+Unicode characters appearing within the hostname of `input` will be
+automatically converted to ASCII using the [Punycode][] algorithm.
+
+```js
+const myURL = new URL('https://你好你好');
+  // https://xn--6qqa088eba/
+```
+
+Additional [examples of parsed URLs][] may be found in the WHATWG URL Standard.
+
+#### url.hash
+
+Gets and sets the fragment portion of the URL.
+
+```js
+const myURL = new URL('https://example.org/foo#bar');
+console.log(myURL.hash);
+  // Prints #bar
+
+myURL.hash = 'baz';
+console.log(myURL.href);
+  // Prints https://example.org/foo#baz
+```
+
+Invalid URL characters included in the value assigned to the `hash` property
+are [percent-encoded](#whatwg-percent-encoding). Note that the selection of
+which characters to percent-encode may vary somewhat from what the
+[`url.parse()`][] and [`url.format()`][] methods would produce.
+
+#### url.host
+
+Gets and sets the host portion of the URL.
+
+```js
+const myURL = new URL('https://example.org:81/foo');
+console.log(myURL.host);
+  // Prints example.org:81
+
+myURL.host = 'example.com:82';
+console.log(myURL.href);
+  // Prints https://example.com:82/foo
+```
+
+Invalid host values assigned to the `host` property are ignored.
+
+#### url.hostname
+
+Gets and sets the hostname portion of the URL. The key difference between
+`url.host` and `url.hostname` is that `url.hostname` does *not* include the
+port.
+
+```js
+const myURL = new URL('https://example.org:81/foo');
+console.log(myURL.hostname);
+  // Prints example.org
+
+myURL.hostname = 'example.com:82';
+console.log(myURL.href);
+  // Prints https://example.com:81/foo
+```
+
+Invalid hostname values assigned to the `hostname` property are ignored.
+
+#### url.href
+
+Gets and sets the serialized URL.
+
+```js
+const myURL = new URL('https://example.org/foo');
+console.log(myURL.href);
+  // Prints https://example.org/foo
+
+myURL.href = 'https://example.com/bar';
+  // myURL.href is now https://example.com/bar
+```
+
+Setting the value of the `href` property to a new value is equivalent to
+creating a new `URL` object using `new URL(value)`. Each of the `URL` object's
+properties will be modified.
+
+If the value assigned to the `href` property is not a valid URL, a `TypeError`
+will be thrown.
+
+#### url.origin
+
+Gets the read-only serialization of the URL's origin. Unicode characters that
+may be contained within the hostname will be encoded as-is without [Punycode][]
+encoding.
+
+```js
+const myURL = new URL('https://example.org/foo/bar?baz');
+console.log(myURL.origin);
+  // Prints https://example.org
+```
+
+```js
+const idnURL = new URL('https://你好你好');
+console.log(idnURL.origin);
+  // Prints https://你好你好
+
+console.log(idnURL.hostname);
+  // Prints xn--6qqa088eba
+```
+
+#### url.password
+
+Gets and sets the password portion of the URL.
+
+```js
+const myURL = new URL('https://abc:xyz@example.com');
+console.log(myURL.password);
+  // Prints xyz
+
+myURL.password = '123';
+console.log(myURL.href);
+  // Prints https://abc:123@example.com/
+```
+
+Invalid URL characters included in the value assigned to the `password` property
+are [percent-encoded](#whatwg-percent-encoding). Note that the selection of
+which characters to percent-encode may vary somewhat from what the
+[`url.parse()`][] and [`url.format()`][] methods would produce.
+
+#### url.pathname
+
+Gets and sets the path portion of the URL.
+
+```js
+const myURL = new URL('https://example.org/abc/xyz?123');
+console.log(myURL.pathname);
+  // Prints /abc/xyz
+
+myURL.pathname = '/abcdef';
+console.log(myURL.href);
+  // Prints https://example.org/abcdef?123
+```
+
+Invalid URL characters included in the value assigned to the `pathname`
+property are [percent-encoded](#whatwg-percent-encoding).
+Note that the selection of which characters to percent-encode may vary
+somewhat from what the [`url.parse()`][] and [`url.format()`][] methods would
+produce.
+
+#### url.port
+
+Gets and sets the port portion of the URL. When getting the port, the value
+is returned as a String.
+
+```js
+const myURL = new URL('https://example.org:8888');
+console.log(myURL.port);
+  // Prints 8888
+
+myURL.port = 1234;
+console.log(myURL.href);
+  // Prints https://example.org:1234/
+```
+
+The port value may be set as either a number or as a String containing a number
+in the range `0` to `65535` (inclusive). Setting the value to the default port
+of the `URL` object's given `protocol` will result in the `port` value becoming
+the empty string (`''`).
+
+Invalid URL port values assigned to the `port` property are ignored.
+
+#### url.protocol
+
+Gets and sets the protocol portion of the URL.
+
+```js
+const myURL = new URL('https://example.org');
+console.log(myURL.protocol);
+  // Prints https:
+
+myURL.protocol = 'ftp';
+console.log(myURL.href);
+  // Prints ftp://example.org/
+```
+
+Invalid URL protocol values assigned to the `protocol` property are ignored.
+
+#### url.search
+
+Gets and sets the serialized query portion of the URL.
+
+```js
+const myURL = new URL('https://example.org/abc?123');
+console.log(myURL.search);
+  // Prints ?123
+
+myURL.search = 'abc=xyz';
+console.log(myURL.href);
+  // Prints https://example.org/abc?abc=xyz
+```
+
+Any invalid URL characters appearing in the value assigned to the `search`
+property will be [percent-encoded](#whatwg-percent-encoding). Note that the
+selection of which characters to percent-encode may vary somewhat from what the
+[`url.parse()`][] and [`url.format()`][] methods would produce.
+
+#### url.searchParams
+
+Gets a [`URLSearchParams`](#url_class_urlsearchparams) object representing the
+query parameters of the URL.
+
+#### url.username
+
+Gets and sets the username portion of the URL.
+
+```js
+const myURL = new URL('https://abc:xyz@example.com');
+console.log(myURL.username);
+  // Prints abc
+
+myURL.username = '123';
+console.log(myURL.href);
+  // Prints https://123:xyz@example.com/
+```
+
+Any invalid URL characters appearing in the value assigned to the `username`
+property will be [percent-encoded](#whatwg-percent-encoding). Note that the
+selection of which characters to percent-encode may vary somewhat from what the
+[`url.parse()`][] and [`url.format()`][] methods would produce.
+
+#### url.toString()
+
+The `toString()` method on the `URL` object returns the serialized URL. The
+value returned is equivalent to that of `url.href`.
+
+### Class: URLSearchParams
+
+The `URLSearchParams` object provides read and write access to the query of a
+`URL`.
+
+```js
+const URL = require('url').URL;
+const myURL = new URL('https://example.org/?abc=123');
+console.log(myURL.searchParams.get('abc'));
+  // Prints 123
+
+myURL.searchParams.append('abc', 'xyz');
+console.log(myURL.href);
+  // Prints https://example.org/?abc=123&abc=xyz
+
+myURL.searchParams.delete('abc');
+myURL.searchParams.set('a', 'b');
+console.log(myURL.href);
+  // Prints https://example.org/?a=b
+```
+
+#### Constructor: new URLSearchParams([init])
+
+* `init` {String} The URL query
+
+#### urlSearchParams.append(name, value)
+
+* `name` {String}
+* `value` {String}
+
+Append a new name-value pair to the query string.
+
+#### urlSearchParams.delete(name)
+
+* `name` {String}
+
+Remove all name-value pairs whose name is `name`.
+
+#### urlSearchParams.entries()
+
+* Returns: {Iterator}
+
+Returns an ES6 Iterator over each of the name-value pairs in the query.
+Each item of the iterator is a JavaScript Array. The first item of the Array
+is the `name`, the second item of the Array is the `value`.
+
+Alias for `urlSearchParams\[\@\@iterator\]()`.
+
+#### urlSearchParams.forEach(fn)
+
+* `fn` {Function} Function invoked for each name-value pair in the query.
+
+Iterates over each name-value pair in the query and invokes the given function.
+
+```js
+const URL = require('url').URL;
+const myURL = new URL('https://example.org/?a=b&c=d');
+myURL.searchParams.forEach((value, name) => {
+  console.log(name, value);
+});
+```
+
+#### urlSearchParams.get(name)
+
+* `name` {String}
+* Returns: {String} or `null` if there is no name-value pair with the given
+  `name`.
+
+Returns the value of the first name-value pair whose name is `name`.
+
+#### urlSearchParams.getAll(name)
+
+* `name` {String}
+* Returns: {Array}
+
+Returns the values of all name-value pairs whose name is `name`.
+
+#### urlSearchParams.has(name)
+
+* `name` {String}
+* Returns: {Boolean}
+
+Returns `true` if there is at least one name-value pair whose name is `name`.
+
+#### urlSearchParams.keys()
+
+* Returns: {Iterator}
+
+Returns an ES6 Iterator over the names of each name-value pair.
+
+#### urlSearchParams.set(name, value)
+
+* `name` {String}
+* `value` {String}
+
+Remove any existing name-value pairs whose name is `name` and append a new
+name-value pair.
+
+#### urlSearchParams.toString()
+
+* Returns: {String}
+
+Returns the search parameters serialized as a URL-encoded string.
+
+#### urlSearchParams.values()
+
+* Returns: {Iterator}
+
+Returns an ES6 Iterator over the values of each name-value pair.
+
+#### urlSearchParams\[\@\@iterator\]()
+
+* Returns: {Iterator}
+
+Returns an ES6 Iterator over each of the name-value pairs in the query string.
+Each item of the iterator is a JavaScript Array. The first item of the Array
+is the `name`, the second item of the Array is the `value`.
+
+Alias for `urlSearchParams.entries()`.
+
+### require('url').domainToAscii(domain)
+
+* `domain` {String}
+* Returns: {String}
+
+Returns the [Punycode][] ASCII serialization of the `domain`.
+
+*Note*: The `require('url').domainToAscii()` method is introduced as part of
+the new `URL` implementation but is not part of the WHATWG URL standard.
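
As an illustrative sketch of the conversion described above, the snippet below
looks the helper up under either spelling, because later Node.js releases
expose it as `domainToASCII` rather than `domainToAscii`. That fallback, the
`toAscii` name, and the sample domains are assumptions made for the example and
are not part of this patch.

```js
const url = require('url');

// Assumption: later Node.js releases spell this helper `domainToASCII`,
// so use whichever name the running version provides.
const toAscii = url.domainToASCII || url.domainToAscii;

// A label containing a non-ASCII character is Punycode-encoded.
const asciiDomain = toAscii('español.com');
console.log(asciiDomain);
  // Prints xn--espaol-zwa.com

// A domain that is already ASCII passes through unchanged.
const plainDomain = toAscii('example.com');
console.log(plainDomain);
  // Prints example.com
```

Only the labels that contain non-ASCII characters are rewritten; the `.com`
label is left as it was.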
+
+### require('url').domainToUnicode(domain)
+
+* `domain` {String}
+* Returns: {String}
+
+Returns the Unicode serialization of the `domain`.
+
+*Note*: The `require('url').domainToUnicode()` API is introduced as part of the
+new `URL` implementation but is not part of the WHATWG URL standard.
+
+<a id="whatwg-percent-encoding"></a>
+### Percent-Encoding in the WHATWG URL Standard
+
+URLs are permitted to contain only a certain range of characters. Any character
+falling outside of that range must be encoded. How such characters are encoded,
+and which characters to encode, depends entirely on where the character is
+located within the structure of the URL. The WHATWG URL Standard uses a more
+selective and fine-grained approach to selecting encoded characters than that
+used by the older [`url.parse()`][] and [`url.format()`][] methods.
+
+The WHATWG algorithm defines three "encoding sets" that describe ranges of
+characters that must be percent-encoded:
+
+* The *simple encode set* includes code points in range U+0000 to U+001F
+  (inclusive) and all code points greater than U+007E.
+
+* The *default encode set* includes the *simple encode set* and code points
+  U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
+
+* The *userinfo encode set* includes the *default encode set* and code points
+  U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and
+  U+007C.
+
+The *simple encode set* is used primarily for URL fragments and certain specific
+conditions for the path. The *userinfo encode set* is used specifically for
+usernames and passwords encoded within the URL. The *default encode set* is used
+for all other cases.
+
+When non-ASCII characters appear within a hostname, the hostname is encoded
+using the [Punycode][] algorithm. Note, however, that a hostname *may* contain
+*both* Punycode encoded and percent-encoded characters.
+For example:
+
+```js
+const URL = require('url').URL;
+const myURL = new URL('https://%CF%80.com/foo');
+console.log(myURL.href);
+  // Prints https://xn--1xa.com/foo
+console.log(myURL.origin);
+  // Prints https://π.com
+```
+
 [`Error`]: errors.html#errors_class_error
 [`querystring`]: querystring.html
 [`TypeError`]: errors.html#errors_class_typeerror
+[WHATWG URL Standard]: https://url.spec.whatwg.org/
+[examples of parsed URLs]: https://url.spec.whatwg.org/#example-url-parsing
+[`url.parse()`]: #url_url_parse_urlstring_parsequerystring_slashesdenotehost
+[`url.format()`]: #url_url_format_urlobject
+[Punycode]: https://tools.ietf.org/html/rfc5891#section-4.4
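
To see how the position of a character decides whether it is percent-encoded,
the following sketch assigns the same string to the path and to the username of
one URL. The sample URL and values are illustrative assumptions. The colon is
left alone in the path but escaped in the username, because U+003A is listed
only in the *userinfo encode set*, while a space is escaped in the path because
U+0020 is in the *default encode set*.

```js
const URL = require('url').URL;
const myURL = new URL('https://example.org/');

// U+003A (:) is not in the default encode set, so the path keeps it.
myURL.pathname = '/a:b';
console.log(myURL.pathname);
  // Prints /a:b

// U+003A (:) is in the userinfo encode set, so the username escapes it.
myURL.username = 'a:b';
console.log(myURL.username);
  // Prints a%3Ab

// U+0020 (space) is in the default encode set, so the path escapes it.
myURL.pathname = '/a b';
console.log(myURL.pathname);
  // Prints /a%20b
```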