(v7.x backport) url: updates to the WHATWG URL parser #12507

Merged 21 commits on Apr 25, 2017
Commits
71d3f94 url: extend URLSearchParams constructor (TimothyGu, Jan 28, 2017)
c40a45f doc: document URLSearchParams constructor (TimothyGu, Jan 28, 2017)
b0fecbe url: enforce valid UTF-8 in WHATWG parser (TimothyGu, Feb 4, 2017)
c3366a5 url: prioritize toString when stringifying (TimothyGu, Mar 8, 2017)
6b2cb6d url: spec-compliant URLSearchParams serializer (TimothyGu, Feb 4, 2017)
7e7fd66 src: remove explicit UTF-8 validity check in url (TimothyGu, Mar 15, 2017)
4a94c2d querystring: move isHexTable to internal (TimothyGu, Mar 15, 2017)
d86f0d7 url: spec-compliant URLSearchParams parser (TimothyGu, Mar 15, 2017)
a2a3d6c url: use a class for WHATWG url[context] (TimothyGu, Mar 22, 2017)
75ef213 url: add ToObject method to native URL class (jasnell, Mar 27, 2017)
5b7b775 src: WHATWG URL C++ parser cleanup (TimothyGu, Mar 16, 2017)
d912e28 url: change path parsing for non-special URLs (watilde, Apr 3, 2017)
dceb12e test: synchronize WPT url test data (watilde, Apr 3, 2017)
43faf56 url: error when domainTo*() is called w/o argument (TimothyGu, Mar 20, 2017)
dafa600 url: avoid instanceof for WHATWG URL (mscdex, Mar 5, 2017)
68cf850 url: trim leading slashes of file URL paths (watilde, Apr 10, 2017)
752097c url: remove javascript URL special case (watilde, Apr 12, 2017)
f484cfd url: disallow invalid IPv4 in IPv6 parser (watilde, Apr 14, 2017)
9288b73 url: clean up WHATWG URL origin generation (TimothyGu, Apr 5, 2017)
8f702ef url: improve WHATWG URL inspection (TimothyGu, Apr 5, 2017)
473bd5e src: clean up WHATWG WG parser (TimothyGu, Apr 6, 2017)
2 changes: 1 addition & 1 deletion benchmark/url/legacy-vs-whatwg-url-searchparams-parse.js
@@ -7,7 +7,7 @@ const inputs = require('../fixtures/url-inputs.js').searchParams;
 const bench = common.createBenchmark(main, {
   type: Object.keys(inputs),
   method: ['legacy', 'whatwg'],
-  n: [1e5]
+  n: [1e6]
 });

 function useLegacy(n, input) {
@@ -7,7 +7,7 @@ const inputs = require('../fixtures/url-inputs.js').searchParams;
 const bench = common.createBenchmark(main, {
   type: Object.keys(inputs),
   method: ['legacy', 'whatwg'],
-  n: [1e5]
+  n: [1e6]
 });

 function useLegacy(n, input, prop) {
2 changes: 1 addition & 1 deletion benchmark/url/url-searchparams-read.js
@@ -5,7 +5,7 @@ const { URLSearchParams } = require('url');
 const bench = common.createBenchmark(main, {
   method: ['get', 'getAll', 'has'],
   param: ['one', 'two', 'three', 'nonexistent'],
-  n: [1e6]
+  n: [2e7]
 });

 const str = 'one=single&two=first&three=first&two=2nd&three=2nd&three=3rd';
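For orientation, the three methods this benchmark exercises behave as follows on the benchmark's own input string. This is a small standalone sketch rather than part of the PR; it assumes a Node.js version in which `URLSearchParams` is exported from `url`, and the variable names are illustrative only.

```js
const { URLSearchParams } = require('url');

// Same query string as the benchmark above.
const str = 'one=single&two=first&three=first&two=2nd&three=2nd&three=3rd';
const params = new URLSearchParams(str);

// get() returns the first value for a name, or null if the name is absent.
const firstTwo = params.get('two');
const missing = params.get('nonexistent');

// getAll() returns every value for a name, in insertion order.
const allThree = params.getAll('three');

// has() reports whether at least one pair with that name exists.
const hasOne = params.has('one');
const hasMissing = params.has('nonexistent');

console.log(firstTwo, missing, allThree, hasOne, hasMissing);
// Prints first null [ 'first', '2nd', '3rd' ] true false
```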
2 changes: 1 addition & 1 deletion benchmark/url/whatwg-url-properties.js
@@ -8,7 +8,7 @@ const bench = common.createBenchmark(main, {
   prop: ['href', 'origin', 'protocol',
          'username', 'password', 'host', 'hostname', 'port',
          'pathname', 'search', 'searchParams', 'hash'],
-  n: [1e4]
+  n: [3e5]
 });

 function setAndGet(n, url, prop, alternative) {
126 changes: 122 additions & 4 deletions doc/api/url.md
@@ -693,15 +693,17 @@ console.log(JSON.stringify(myURLs));
### Class: URLSearchParams

The `URLSearchParams` API provides read and write access to the query of a
-`URL`.
+`URL`. The `URLSearchParams` class can also be used standalone with one of the
+four following constructors.

The WHATWG `URLSearchParams` interface and the [`querystring`][] module have
a similar purpose, but the [`querystring`][] module is more general, as it
allows the customization of delimiter characters (`&` and `=`). This API, by
contrast, is designed purely for URL query strings.

```js
-const URL = require('url').URL;
+const { URL, URLSearchParams } = require('url');

const myURL = new URL('https://example.org/?abc=123');
console.log(myURL.searchParams.get('abc'));
// Prints 123
@@ -714,11 +716,125 @@ myURL.searchParams.delete('abc');
myURL.searchParams.set('a', 'b');
console.log(myURL.href);
// Prints https://example.org/?a=b

const newSearchParams = new URLSearchParams(myURL.searchParams);
// The above is equivalent to
// const newSearchParams = new URLSearchParams(myURL.search);

newSearchParams.append('a', 'c');
console.log(myURL.href);
// Prints https://example.org/?a=b
console.log(newSearchParams.toString());
// Prints a=b&a=c

// newSearchParams.toString() is implicitly called
myURL.search = newSearchParams;
console.log(myURL.href);
// Prints https://example.org/?a=b&a=c
newSearchParams.delete('a');
console.log(myURL.href);
// Prints https://example.org/?a=b&a=c
```
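To illustrate the difference from `querystring` described above, here is a minimal sketch that parses the same input with both APIs. The input string and variable names are made up for illustration; only `querystring.parse(str, sep, eq)` and the `URLSearchParams` string constructor are assumed.

```js
const querystring = require('querystring');
const { URLSearchParams } = require('url');

// querystring accepts custom separator and assignment characters.
const legacy = querystring.parse('a:1;b:2', ';', ':');
console.log(legacy.a, legacy.b);
// Prints 1 2

// URLSearchParams always splits on '&' and '=', so the same input
// is treated as a single name with an empty value.
const params = new URLSearchParams('a:1;b:2');
console.log(params.get('a:1;b:2') === '');
// Prints true
console.log(params.get('a'));
// Prints null
```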

-#### Constructor: new URLSearchParams([init])
+#### Constructor: new URLSearchParams()

Instantiate a new empty `URLSearchParams` object.

#### Constructor: new URLSearchParams(string)

* `string` {string} A query string

Parse the `string` as a query string, and use it to instantiate a new
`URLSearchParams` object. A leading `'?'`, if present, is ignored.

```js
const { URLSearchParams } = require('url');
let params;

params = new URLSearchParams('user=abc&query=xyz');
console.log(params.get('user'));
// Prints 'abc'
console.log(params.toString());
// Prints 'user=abc&query=xyz'

params = new URLSearchParams('?user=abc&query=xyz');
console.log(params.toString());
// Prints 'user=abc&query=xyz'
```

#### Constructor: new URLSearchParams(obj)

* `obj` {Object} An object representing a collection of key-value pairs

-* `init` {String} The URL query
Instantiate a new `URLSearchParams` object with a query hash map. The key and
value of each property of `obj` are always coerced to strings.

*Note*: Unlike the [`querystring`][] module, duplicate keys in the form of
array values are not allowed. Arrays are stringified using
[`array.toString()`][], which simply joins all array elements with commas.

```js
const { URLSearchParams } = require('url');
const params = new URLSearchParams({
  user: 'abc',
  query: ['first', 'second']
});
console.log(params.getAll('query'));
// Prints ['first,second']
console.log(params.toString());
// Prints 'user=abc&query=first%2Csecond'
```
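If repeated keys are wanted from an object whose values may be arrays, one possible workaround is to expand the object into name/value pairs first and pass those to the iterable form of the constructor. The `toPairs()` helper below is hypothetical and not part of the `url` module; it is only a sketch of that idea.

```js
const { URLSearchParams } = require('url');

// Hypothetical helper: expands array values into repeated name/value pairs.
function toPairs(obj) {
  const pairs = [];
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    const values = Array.isArray(value) ? value : [value];
    for (const v of values)
      pairs.push([key, String(v)]);
  }
  return pairs;
}

const params = new URLSearchParams(toPairs({
  user: 'abc',
  query: ['first', 'second']
}));
console.log(params.getAll('query'));
// Prints [ 'first', 'second' ]
console.log(params.toString());
// Prints user=abc&query=first&query=second
```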

#### Constructor: new URLSearchParams(iterable)

* `iterable` {Iterable} An iterable object whose elements are key-value pairs

Instantiate a new `URLSearchParams` object with an iterable map in a way that
is similar to [`Map`][]'s constructor. `iterable` can be an Array or any
iterable object. That means `iterable` can be another `URLSearchParams`, in
which case the constructor will simply create a clone of the provided
`URLSearchParams`. Elements of `iterable` are key-value pairs, and can
themselves be any iterable object.

Duplicate keys are allowed.

```js
const { URLSearchParams } = require('url');
let params;

// Using an array
params = new URLSearchParams([
  ['user', 'abc'],
  ['query', 'first'],
  ['query', 'second']
]);
console.log(params.toString());
// Prints 'user=abc&query=first&query=second'

// Using a Map object
const map = new Map();
map.set('user', 'abc');
map.set('query', 'xyz');
params = new URLSearchParams(map);
console.log(params.toString());
// Prints 'user=abc&query=xyz'

// Using a generator function
function* getQueryPairs() {
  yield ['user', 'abc'];
  yield ['query', 'first'];
  yield ['query', 'second'];
}
params = new URLSearchParams(getQueryPairs());
console.log(params.toString());
// Prints 'user=abc&query=first&query=second'

// Each key-value pair must have exactly two elements
new URLSearchParams([
  ['user', 'abc', 'error']
]);
// Throws TypeError: Each query pair must be a name/value tuple
```
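The cloning case mentioned above is not covered by the example, so here is a brief sketch of it. It assumes only the documented behavior that a `URLSearchParams` object is itself iterable; the variable names are illustrative.

```js
const { URLSearchParams } = require('url');

const original = new URLSearchParams('user=abc&query=xyz');
// URLSearchParams is iterable, so it can be passed to the constructor.
const copy = new URLSearchParams(original);

// Changes to the copy do not affect the original.
copy.append('query', 'second');
copy.delete('user');

console.log(original.toString());
// Prints user=abc&query=xyz
console.log(copy.toString());
// Prints query=xyz&query=second
```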

#### urlSearchParams.append(name, value)

@@ -975,6 +1091,8 @@ console.log(myURL.origin);
[`require('url').format()`]: #url_url_format_url_options
[`url.toString()`]: #url_url_tostring
[Punycode]: https://tools.ietf.org/html/rfc5891#section-4.4
[`Map`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map
[`array.toString()`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/toString
[WHATWG URL]: #url_the_whatwg_url_api
[`new URL()`]: #url_constructor_new_url_input_base
[`url.href`]: #url_url_href
4 changes: 4 additions & 0 deletions lib/internal/bootstrap_node.js
@@ -54,6 +54,10 @@

     _process.setupRawDebug();

+    // Ensure setURLConstructor() is called before the native
+    // URL::ToObject() method is used.
+    NativeModule.require('internal/url');
+
     Object.defineProperty(process, 'argv0', {
       enumerable: true,
       configurable: false,
20 changes: 20 additions & 0 deletions lib/internal/querystring.js
@@ -4,12 +4,32 @@ const hexTable = new Array(256);
 for (var i = 0; i < 256; ++i)
   hexTable[i] = '%' + ((i < 16 ? '0' : '') + i.toString(16)).toUpperCase();

+const isHexTable = [
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0 - 15
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 16 - 31
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 32 - 47
+  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, // 48 - 63
+  0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 64 - 79
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 80 - 95
+  0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 96 - 111
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 112 - 127
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 128 ...
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0  // ... 256
+];
+
 // Instantiating this is faster than explicitly calling `Object.create(null)`
 // to get a "clean" empty object (tested with v8 v4.9).
 function StorageObject() {}
 StorageObject.prototype = Object.create(null);

 module.exports = {
   hexTable,
+  isHexTable,
   StorageObject
 };
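As a reading aid for the `isHexTable` literal, the sketch below builds an equivalent 256-entry table programmatically and uses it to test for a percent-encoded byte. The names `generated` and `isPercentEncodedAt()` are hypothetical, and this is only one plausible use of such a table, not necessarily how the parser in this PR consumes it.

```js
// Build a 256-entry table: 1 for the char codes of 0-9, A-F and a-f.
const generated = new Array(256).fill(0);
for (const ch of '0123456789ABCDEFabcdef')
  generated[ch.charCodeAt(0)] = 1;

// Hypothetical helper: true if `str` has '%' followed by two hex digits
// starting at index `i`.
function isPercentEncodedAt(str, i) {
  return str.charCodeAt(i) === 37 /* % */ &&
         generated[str.charCodeAt(i + 1)] === 1 &&
         generated[str.charCodeAt(i + 2)] === 1;
}

console.log(isPercentEncodedAt('a%2Cb', 1));
// Prints true
console.log(isPercentEncodedAt('a%2Gb', 1));
// Prints false
console.log(isPercentEncodedAt('100%', 3));
// Prints false
```

A table lookup replaces several range comparisons per character with a single array index, which is presumably why the literal form is kept in the source.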