Switch to using one tree per method instead of a map #168

Merged
delvedor merged 2 commits into delvedor:master from the one-tree-per-method branch on Oct 31, 2020

Contributor

@airhorns airhorns commented Oct 27, 2020

This makes pretty printing annoying, but increases performance!

With n trees instead of one tree, each tree is only split for handlers it actually has, so for HTTP verbs like POST or PUT that tend to have fewer routes, the trees are smaller and faster to traverse. For the HTTP GET tree, there are fewer nodes and I think better cache locality as that tree is traversed the most often. Each verb doesn't pay any traversal penalty for the other trees' size. This also results in more instances of more selective version stores, which means traversing them should be faster at the expense of a bit more memory consumption.

This also makes the constraints implementation (see #166) easier, and prevents bugs like #132, and avoids the extra checks we have to do to fix that bug.

This also prevents tree traversal for methods where there are no routes at all, which is a small optimization but kinda nice regardless.
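
To illustrate the idea, here is a minimal sketch of per-method trees. It is not find-my-way's actual implementation: `TinyRouter` is a hypothetical name, and a `Map` of static paths stands in for the real radix tree. It only shows that each method gets its own structure and that a method with no routes is rejected before any traversal happens.

```javascript
// Minimal sketch: one (toy) tree per HTTP method instead of one shared tree.
// The "tree" is a Map of static paths, a stand-in for a real radix tree.
class TinyRouter {
  constructor () {
    this.trees = {} // method -> tree, created lazily on first registration
  }

  on (method, path, handler) {
    if (this.trees[method] === undefined) {
      this.trees[method] = new Map()
    }
    this.trees[method].set(path, handler)
  }

  find (method, path) {
    const tree = this.trees[method]
    // No routes registered for this method: skip traversal entirely.
    if (tree === undefined) return null
    return tree.get(path) || null
  }
}

const router = new TinyRouter()
router.on('GET', '/users', () => 'list users')
router.on('POST', '/users', () => 'create user')

console.log(router.find('GET', '/users')()) // handler from the GET tree
console.log(router.find('DELETE', '/users')) // null, no DELETE tree exists
```

In this sketch the POST tree never grows because of GET routes, which is the property the description above relies on.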

For the pretty printing algorithm, I think a nice pretty print shouldn't be per method and should instead show all routes in the same list, so I added code to merge the separate per-method trees and then pretty print the merged tree! To make it look good I added some "compression" to the tree: a branch that has only one child branch gets collapsed into it, which if you ask me results in prettier output; see the tests.
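
One way to read the "compression" step is sketched below. The node shape (`prefix`, `children`, `isRoute`) and the `compress` and `prettyPrint` helpers are assumptions made for illustration, and the output format is not the one the PR's tests expect; the PR's real algorithm may differ.

```javascript
// Hypothetical node shape: { prefix: string, isRoute: boolean, children: Node[] }

// Collapse chains of single-child nodes so the printed tree is shorter.
function compress (node) {
  let current = node
  // Fold a node into its only child as long as the node is not itself a route.
  while (current.children.length === 1 && !current.isRoute) {
    const child = current.children[0]
    current = {
      prefix: current.prefix + child.prefix,
      isRoute: child.isRoute,
      children: child.children
    }
  }
  return {
    prefix: current.prefix,
    isRoute: current.isRoute,
    children: current.children.map(compress)
  }
}

// Print one node per line, indenting children by two spaces.
function prettyPrint (node, indent = '') {
  let out = indent + node.prefix + (node.isRoute ? ' (route)' : '') + '\n'
  for (const child of node.children) {
    out += prettyPrint(child, indent + '  ')
  }
  return out
}

const tree = {
  prefix: '/',
  isRoute: false,
  children: [{
    prefix: 'api',
    isRoute: false,
    children: [{
      prefix: '/users',
      isRoute: true,
      children: [{ prefix: '/:id', isRoute: true, children: [] }]
    }]
  }]
}

console.log(prettyPrint(compress(tree)))
```

Here the `/` and `api` nodes are folded into `/api/users`, while `/:id` stays a separate line because its parent is a route in its own right.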

| Test | master | one-tree-per-method |
| --- | --- | --- |
| lookup static route | 42,774,309 ops/sec ±0.84% (580 runs sampled) | 41,200,005 ops/sec ±0.98% (591 runs sampled) |
| lookup dynamic route | 3,536,084 ops/sec ±0.70% (587 runs sampled) | 3,553,160 ops/sec ±0.28% (591 runs sampled) |
| lookup dynamic multi-parametric route | 1,842,343 ops/sec ±0.92% (587 runs sampled) | 2,047,064 ops/sec ±0.83% (584 runs sampled) |
| lookup dynamic multi-parametric route with regex | 1,477,768 ops/sec ±0.57% (590 runs sampled) | 1,500,267 ops/sec ±0.64% (590 runs sampled) |
| lookup long static route | 3,350,884 ops/sec ±0.62% (589 runs sampled) | 3,406,235 ops/sec ±0.77% (588 runs sampled) |
| lookup long dynamic route | 2,491,556 ops/sec ±0.63% (585 runs sampled) | 2,338,285 ops/sec ±1.60% (589 runs sampled) |
| lookup static versioned route | 9,241,735 ops/sec ±0.44% (586 runs sampled) | 9,239,314 ops/sec ±0.40% (586 runs sampled) |
| find static route | 36,660,039 ops/sec ±0.76% (581 runs sampled) | 35,230,842 ops/sec ±0.92% (578 runs sampled) |
| find dynamic route | 4,473,753 ops/sec ±0.72% (588 runs sampled) | 4,469,776 ops/sec ±0.33% (590 runs sampled) |
| find dynamic multi-parametric route | 2,202,207 ops/sec ±1.00% (578 runs sampled) | 2,237,214 ops/sec ±1.39% (585 runs sampled) |
| find dynamic multi-parametric route with regex | 1,680,101 ops/sec ±0.76% (579 runs sampled) | 1,533,243 ops/sec ±1.04% (581 runs sampled) |
| find long static route | 4,633,069 ops/sec ±1.04% (588 runs sampled) | 4,585,833 ops/sec ±0.51% (588 runs sampled) |
| find long dynamic route | 3,333,916 ops/sec ±0.76% (586 runs sampled) | 3,491,155 ops/sec ±0.45% (589 runs sampled) |
| find static versioned route | 10,779,325 ops/sec ±0.73% (586 runs sampled) | 10,801,810 ops/sec ±0.89% (580 runs sampled) |
| find long nested dynamic route | 1,379,726 ops/sec ±0.45% (587 runs sampled) | 1,418,610 ops/sec ±0.68% (588 runs sampled) |
| find long nested dynamic route with other method | 1,962,454 ops/sec ±0.97% (587 runs sampled) | 2,499,722 ops/sec ±0.38% (587 runs sampled) |

I split this out of the work I've been doing to complete the request constraints stuff from #166. It hasn't changed routing behaviour at all, just the performance characteristics, so I don't think it needs a major release. The pretty print output has changed somewhat, but I'm not sure if that counts as a breaking change.

Collaborator

@mcollina mcollina left a comment

Good work! I like this change.

I've left a few notes

index.js Outdated
code.push(`this['${m}'] = null`)
}
return new Function(code.join('\n')) // eslint-disable-line
}
Collaborator

Why the use of new Function here?

Contributor Author

I was just copying the code that was already there. This object with one property per HTTP method was used in Node for the handler map before. See c70b2e8. I think it's a little unclear, and since there's only one instance of it per router I will switch it back to being a plain old constructor.

Contributor Author

@airhorns airhorns Oct 28, 2020

I take it back -- this makes a huge difference:


```js
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    var m = http.METHODS[i]
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}

// Object with prototype slots for all the HTTP methods
const MethodMap = buildMethodMap()
```

gives me lookup static route x 44,563,670 ops/sec ±0.88% (584 runs sampled)

whereas a constructor that just loops over the methods gives me lookup static route x 29,380,113 ops/sec ±1.03% (573 runs sampled)

It makes sense: every find touches this object, so the whole thing is very sensitive to its property access speed. You folks probably know better than me, but I am guessing that if V8 can see exactly which properties are set in the constructor, it gets the Shape right from the start, whereas if the properties are assigned dynamically, even in a constructor, it's the same as assigning them one at a time to an object literal, which runs into the nasty Shape chain problem.
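
For reference, the two variants being compared can be put side by side as in the sketch below. `LoopMethodMap` is a hypothetical name for "a constructor that just loops over the methods"; the sketch only shows that both produce objects with identical keys, so any speed difference would have to come from how the engine treats the two constructors. It makes no claim about which is faster.

```javascript
const http = require('http')

// Variant 1: generated constructor whose body is one literal
// `this['GET'] = null` statement per method (the approach quoted above).
function buildMethodMap () {
  const code = []
  for (const m of http.METHODS) {
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line no-new-func
}
const GeneratedMethodMap = buildMethodMap()

// Variant 2 (hypothetical name): ordinary constructor that assigns
// the same keys dynamically in a loop.
function LoopMethodMap () {
  for (const m of http.METHODS) {
    this[m] = null
  }
}

const generated = new GeneratedMethodMap()
const looped = new LoopMethodMap()

// Same keys in the same order; only the way the assignments are written differs.
const sameKeys =
  JSON.stringify(Object.keys(generated)) === JSON.stringify(Object.keys(looped))
console.log(Object.keys(generated).length, sameKeys)
```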

@mcollina
Collaborator

This has a 3% decrease in static route lookup performance; that should be fixed.

@airhorns
Contributor Author

airhorns commented Oct 28, 2020

@mcollina or @delvedor do you have any suggestions for what might be causing that small regression? I tried profiling locally and it's hard to assess because the find method dominates everything (duh) and is so big, so there's no granularity in the profile. Any secret techniques you know about to find where time is spent or anything obvious to you looking at this PR?

@airhorns
Contributor Author

Also, I have gotten pretty widely varying benchmark results when running locally, to the point that I thought the static line was about the same. Do you trust the error bars on those numbers enough that they really do need to match?
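
As a rough sanity check on the error-bar question, the sketch below compares the two "lookup static route" figures from the PR description. The helper names are made up for illustration. Treating "±x%" as a simple interval around the mean is an assumption, and it says nothing about run-to-run variance, which is the concern raised here.

```javascript
// Relative change between two ops/sec figures, in percent.
function percentChange (before, after) {
  return ((after - before) / before) * 100
}

// Interval implied by a "±x%" relative margin of error (assumed interpretation).
function interval ({ opsPerSec, marginPct }) {
  const delta = opsPerSec * (marginPct / 100)
  return [opsPerSec - delta, opsPerSec + delta]
}

function intervalsOverlap (a, b) {
  const [aLow, aHigh] = interval(a)
  const [bLow, bHigh] = interval(b)
  return aLow <= bHigh && bLow <= aHigh
}

// "lookup static route" on master and on the branch, from the PR description.
const master = { opsPerSec: 42774309, marginPct: 0.84 }
const branch = { opsPerSec: 41200005, marginPct: 0.98 }

console.log(percentChange(master.opsPerSec, branch.opsPerSec).toFixed(2) + '%') // -3.68%
console.log(intervalsOverlap(master, branch)) // false
```

Within a single pair of runs the intervals do not overlap, but a second pair of runs on the same machine can shift both numbers by more than the reported margin.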

@mcollina
Collaborator

I think they are the same. I'm going to run them on a dedicated machine.

@airhorns
Contributor Author

I found https://github.com/rhysd/github-action-benchmark the other day, which might be worth setting up. I don't think it would resolve the benchmark noise issue, because of noisy neighbours in the cloud, but it would at least mean contributors can let computers do the computer stuff of running the benchmarks and formatting them all the time, instead of having to rerun locally often and try not to touch their computer while they are running :)
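
Independent of any particular CI tool, the formatting part can be automated with a small parser for the `bench.js` output lines quoted in this thread. This is a sketch under the assumption that every result line has the `name x N ops/sec ±M% (R runs sampled)` shape seen here; `parseBenchLine` and `parseBenchOutput` are hypothetical helpers, not part of find-my-way or of the action linked above.

```javascript
// Matches lines such as:
// lookup static route x 25,724,202 ops/sec ±0.26% (587 runs sampled)
const LINE = /^(.+?) x ([\d,]+) ops\/sec ±([\d.]+)% \((\d+) runs sampled\)$/

function parseBenchLine (line) {
  const match = LINE.exec(line.trim())
  if (match === null) return null // not a result line (prompt, blank line, ...)
  return {
    name: match[1],
    opsPerSec: Number(match[2].replace(/,/g, '')),
    marginPct: Number(match[3]),
    runs: Number(match[4])
  }
}

function parseBenchOutput (text) {
  return text.split('\n').map(parseBenchLine).filter(r => r !== null)
}

const sample = [
  '$ node bench.js',
  'lookup static route x 25,724,202 ops/sec ±0.26% (587 runs sampled)',
  'lookup dynamic route x 1,674,518 ops/sec ±0.31% (577 runs sampled)'
].join('\n')

const results = parseBenchOutput(sample)
console.log(results)
```

Two parsed runs can then be joined by `name` to produce a side-by-side comparison without copying numbers by hand.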

@airhorns airhorns force-pushed the one-tree-per-method branch 4 times, most recently from 334be9f to 2314b77 Compare October 28, 2020 13:21
@mcollina
Collaborator

Unfortunately it slows things down.

```
~/repositories/find-my-way (master)
$ node bench.js
lookup static route x 25,724,202 ops/sec ±0.26% (587 runs sampled)
lookup dynamic route x 1,674,518 ops/sec ±0.31% (577 runs sampled)
lookup dynamic multi-parametric route x 940,688 ops/sec ±0.41% (574 runs sampled)
lookup dynamic multi-parametric route with regex x 701,471 ops/sec ±0.49% (575 runs sampled)
lookup long static route x 1,442,851 ops/sec ±0.56% (571 runs sampled)
lookup long dynamic route x 1,115,337 ops/sec ±0.46% (575 runs sampled)
lookup static versioned route x 3,230,064 ops/sec ±0.47% (573 runs sampled)
find static route x 14,730,058 ops/sec ±0.39% (565 runs sampled)
find dynamic route x 1,875,483 ops/sec ±0.45% (576 runs sampled)
find dynamic multi-parametric route x 1,107,988 ops/sec ±0.46% (549 runs sampled)
find dynamic multi-parametric route with regex x 766,450 ops/sec ±0.51% (573 runs sampled)
find long static route x 1,904,768 ops/sec ±0.61% (567 runs sampled)
find long dynamic route x 1,468,946 ops/sec ±0.41% (575 runs sampled)
find static versioned route x 3,968,470 ops/sec ±0.57% (566 runs sampled)
~/repositories/find-my-way (master)
$ git checkout airhorns-one-tree-per-method
Switched to branch 'airhorns-one-tree-per-method'
~/repositories/find-my-way (airhorns-one-tree-per-method)
$ node -v
v12.18.2
~/repositories/find-my-way (airhorns-one-tree-per-method)
$ node bench.js
lookup static route x 24,634,341 ops/sec ±0.21% (588 runs sampled)
lookup dynamic route x 1,657,463 ops/sec ±0.34% (577 runs sampled)
lookup dynamic multi-parametric route x 939,049 ops/sec ±0.50% (576 runs sampled)
lookup dynamic multi-parametric route with regex x 696,668 ops/sec ±0.50% (575 runs sampled)
lookup long static route x 1,422,443 ops/sec ±0.52% (574 runs sampled)
lookup long dynamic route x 1,113,603 ops/sec ±0.47% (572 runs sampled)
lookup static versioned route x 3,314,353 ops/sec ±0.46% (574 runs sampled)
find static route x 14,735,541 ops/sec ±0.44% (566 runs sampled)
find dynamic route x 1,888,898 ops/sec ±0.44% (576 runs sampled)
find dynamic multi-parametric route x 1,064,308 ops/sec ±0.50% (575 runs sampled)
find dynamic multi-parametric route with regex x 754,310 ops/sec ±0.47% (574 runs sampled)
find long static route x 1,888,400 ops/sec ±0.63% (573 runs sampled)
find long dynamic route x 1,449,513 ops/sec ±0.38% (575 runs sampled)
find static versioned route x 4,013,044 ops/sec ±0.53% (572 runs sampled)
find long nested dynamic route x 579,224 ops/sec ±0.53% (572 runs sampled)
find long nested dynamic route with other method x 1,059,698 ops/sec ±0.57% (570 runs sampled)
```

@mcollina
Collaborator

```
$ node bench
lookup static route x 25,818,969 ops/sec ±0.21% (587 runs sampled)
lookup dynamic route x 1,615,671 ops/sec ±0.39% (576 runs sampled)
lookup dynamic multi-parametric route x 937,708 ops/sec ±0.45% (574 runs sampled)
lookup dynamic multi-parametric route with regex x 807,883 ops/sec ±2.05% (544 runs sampled)
lookup long static route x 1,448,390 ops/sec ±0.53% (574 runs sampled)
lookup long dynamic route x 1,138,918 ops/sec ±0.44% (577 runs sampled)
lookup static versioned route x 3,208,351 ops/sec ±0.41% (573 runs sampled)
find static route x 15,094,211 ops/sec ±0.41% (565 runs sampled)
find dynamic route x 1,867,672 ops/sec ±0.43% (577 runs sampled)
find dynamic multi-parametric route x 1,071,705 ops/sec ±0.50% (575 runs sampled)
find dynamic multi-parametric route with regex x 756,572 ops/sec ±0.47% (575 runs sampled)
find long static route x 1,927,721 ops/sec ±0.67% (569 runs sampled)
find long dynamic route x 1,472,752 ops/sec ±0.42% (575 runs sampled)
find static versioned route x 4,000,588 ops/sec ±0.53% (571 runs sampled)
find long nested dynamic route x 585,418 ops/sec ±0.54% (573 runs sampled)
find long nested dynamic route with other method x 1,079,334 ops/sec ±0.57% (574 runs sampled)
```

You'll need this patch:

```diff
diff --git a/index.js b/index.js
index a4e28c6..f4027de 100644
--- a/index.js
+++ b/index.js
@@ -26,18 +26,6 @@ if (!isRegexSafe(FULL_PATH_REGEXP)) {

 const acceptVersionStrategy = require('./lib/accept-version')

-function buildMethodMap () {
-  const code = []
-  for (var i = 0; i < http.METHODS.length; i++) {
-    var m = http.METHODS[i]
-    code.push(`this['${m}'] = null`)
-  }
-  return new Function(code.join('\n')) // eslint-disable-line
-}
-
-// Object with prototype slots for all the HTTP methods
-const MethodMap = buildMethodMap()
-
 function Router (opts) {
   if (!(this instanceof Router)) {
     return new Router(opts)
@@ -63,7 +51,7 @@ function Router (opts) {
   this.maxParamLength = opts.maxParamLength || 100
   this.allowUnsafeRegex = opts.allowUnsafeRegex || false
   this.versioning = opts.versioning || acceptVersionStrategy
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }

@@ -318,7 +306,7 @@ Router.prototype._insert = function _insert (method, path, kind, params, handler
 }

 Router.prototype.reset = function reset () {
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }
```

@airhorns
Contributor Author

Does it make sense to you why an object literal would be faster? Shouldn't it be slower because of the shape chain created?

@airhorns airhorns force-pushed the one-tree-per-method branch from 2314b77 to ce7f7cc Compare October 28, 2020 14:57
@airhorns airhorns force-pushed the one-tree-per-method branch from ce7f7cc to acf1283 Compare October 28, 2020 15:01
@airhorns
Contributor Author

Ok, with the latest version of this branch I'm up to lookup static route x 43,797,356 ops/sec ±0.98% (589 runs sampled) which is faster than master for me, plus or minus the benchmarking noise! 🥳

Collaborator

@mcollina mcollina left a comment

lgtm

@airhorns
Contributor Author

@delvedor would love to get this in as it makes a bunch of other changes I'd like to propose clearer! Anything I can do to complete the process?

Owner

@delvedor delvedor left a comment

LGTM

@delvedor delvedor changed the title from "Switch to using one tree per method instead of a map of per-method handlers on each node" to "Switch to using one tree per method instead of a map" Oct 31, 2020
@delvedor delvedor merged commit b9337ca into delvedor:master Oct 31, 2020
@delvedor
Owner

Thanks @airhorns!
I'll wait for your other changes before releasing :)
