Switch to using one tree per method instead of a map #168
Conversation
Good work! I like this change.
I've left a few notes.
index.js (Outdated)
```js
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
```
Why the use of `new Function` here?
I was just copying the code that was already there -- this object with one property per HTTP method was used in Node for the handler map before. See c70b2e8. I think it's a little unclear, and since there's only one instance of it per router, I will switch it back to being a plain old constructor.
I take it back -- this makes a huge difference:
```js
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    var m = http.METHODS[i]
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}

// Object with prototype slots for all the HTTP methods
const MethodMap = buildMethodMap()
```
gives me `lookup static route x 44,563,670 ops/sec ±0.88% (584 runs sampled)`, whereas a constructor that just loops over the methods gives me `lookup static route x 29,380,113 ops/sec ±1.03% (573 runs sampled)`.
It makes sense that every `find` touches this object, so the whole thing is very sensitive to its property access speed. You folks probably know better than me, but I am guessing that if V8 can see exactly what properties are set in the constructor, it gets the Shape right from the start, whereas if the properties are assigned dynamically, even in a constructor, it's the same as assigning them one at a time to an object literal, which runs into the nasty Shape chain problem.
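For concreteness, here is a sketch of the two constructor styles side by side. The looping variant is my reconstruction of the "constructor that just loops over the methods" mentioned above, not code from this PR:

```js
const http = require('http')

// Generated constructor, as in the snippet above: the property assignments are
// unrolled into the function body as literal statements, so the full set of
// keys is present in the source V8 compiles.
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    code.push(`this['${http.METHODS[i]}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
const MethodMap = buildMethodMap()

// Assumed reconstruction of the slower variant: the same keys are assigned,
// but dynamically in a loop, so (per the hypothesis above) the engine only
// discovers them one shape transition at a time while instances are built.
function LoopingMethodMap () {
  for (var i = 0; i < http.METHODS.length; i++) {
    this[http.METHODS[i]] = null
  }
}

// Both produce instances with identical own keys; only how the assignments
// appear to the engine differs.
console.log(Object.keys(new MethodMap()).join() === Object.keys(new LoopingMethodMap()).join()) // true
```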
This shows a 3% decrease in static route lookup; it should be solved.
@mcollina or @delvedor do you have any suggestions for what might be causing that small regression? I tried profiling locally and it's hard to assess because the
Also, I have gotten pretty widely varying benchmark results when running locally, to the point that I thought that static line was about the same. Do you trust the error bars on those numbers enough to say they really do need to match?
I think they are the same. I'm going to run them on a dedicated machine.
I found this the other day, which might be worth setting up: https://github.com/rhysd/github-action-benchmark. I don't think it would resolve the benchmark noise issue, because of noisy neighbours in the cloud, but it would at least mean contributors can let computers do the computer stuff of running and formatting the benchmarks all the time, instead of having to rerun them locally often and try not to touch their computer while they run :)
Force-pushed from 334be9f to 2314b77
Unfortunately it slows things down.
You'll need this patch:

```diff
diff --git a/index.js b/index.js
index a4e28c6..f4027de 100644
--- a/index.js
+++ b/index.js
@@ -26,18 +26,6 @@ if (!isRegexSafe(FULL_PATH_REGEXP)) {
 const acceptVersionStrategy = require('./lib/accept-version')
-function buildMethodMap () {
-  const code = []
-  for (var i = 0; i < http.METHODS.length; i++) {
-    var m = http.METHODS[i]
-    code.push(`this['${m}'] = null`)
-  }
-  return new Function(code.join('\n')) // eslint-disable-line
-}
-
-// Object with prototype slots for all the HTTP methods
-const MethodMap = buildMethodMap()
-
 function Router (opts) {
   if (!(this instanceof Router)) {
     return new Router(opts)
@@ -63,7 +51,7 @@ function Router (opts) {
   this.maxParamLength = opts.maxParamLength || 100
   this.allowUnsafeRegex = opts.allowUnsafeRegex || false
  this.versioning = opts.versioning || acceptVersionStrategy
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }
@@ -318,7 +306,7 @@ Router.prototype._insert = function _insert (method, path, kind, params, handler
 }
 Router.prototype.reset = function reset () {
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }
```
Does it make sense to you why an object literal would be faster? Shouldn't it be slower because of the shape chain it creates?
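For what it's worth, a hypothetical micro-benchmark could isolate the property read the two approaches differ on. This is only a sketch using benchmark.js; the setup and names are mine, not the repo's bench.js, and a trivial read like this can be optimized away, so the repo's full `find()` benchmarks remain the real measure:

```js
'use strict'

const http = require('http')
const Benchmark = require('benchmark')

// Generated constructor, as discussed above: all method keys exist on the
// instance from the moment it is constructed.
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    code.push(`this['${http.METHODS[i]}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
const MethodMap = buildMethodMap()

// Plain object literal, as on master: the key is added after creation,
// which triggers a shape transition.
const literal = {}
const mapped = new MethodMap()
literal.GET = mapped.GET = function handler () {}

new Benchmark.Suite()
  .add('object literal property read', function () {
    return literal.GET
  })
  .add('MethodMap instance property read', function () {
    return mapped.GET
  })
  .on('cycle', function (event) {
    console.log(String(event.target))
  })
  .run()
```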
Force-pushed from 2314b77 to ce7f7cc
…ndlers on each node

Benchmarks:

```
kamloop ~/C/find-my-way (master) ➜ npm run bench; git checkout one-tree-per-method; npm run bench

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 42,774,309 ops/sec ±0.84% (580 runs sampled)
lookup dynamic route x 3,536,084 ops/sec ±0.70% (587 runs sampled)
lookup dynamic multi-parametric route x 1,842,343 ops/sec ±0.92% (587 runs sampled)
lookup dynamic multi-parametric route with regex x 1,477,768 ops/sec ±0.57% (590 runs sampled)
lookup long static route x 3,350,884 ops/sec ±0.62% (589 runs sampled)
lookup long dynamic route x 2,491,556 ops/sec ±0.63% (585 runs sampled)
lookup static versioned route x 9,241,735 ops/sec ±0.44% (586 runs sampled)
find static route x 36,660,039 ops/sec ±0.76% (581 runs sampled)
find dynamic route x 4,473,753 ops/sec ±0.72% (588 runs sampled)
find dynamic multi-parametric route x 2,202,207 ops/sec ±1.00% (578 runs sampled)
find dynamic multi-parametric route with regex x 1,680,101 ops/sec ±0.76% (579 runs sampled)
find long static route x 4,633,069 ops/sec ±1.04% (588 runs sampled)
find long dynamic route x 3,333,916 ops/sec ±0.76% (586 runs sampled)
find static versioned route x 10,779,325 ops/sec ±0.73% (586 runs sampled)
find long nested dynamic route x 1,379,726 ops/sec ±0.45% (587 runs sampled)
find long nested dynamic route with other method x 1,962,454 ops/sec ±0.97% (587 runs sampled)

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 41,200,005 ops/sec ±0.98% (591 runs sampled)
lookup dynamic route x 3,553,160 ops/sec ±0.28% (591 runs sampled)
lookup dynamic multi-parametric route x 2,047,064 ops/sec ±0.83% (584 runs sampled)
lookup dynamic multi-parametric route with regex x 1,500,267 ops/sec ±0.64% (590 runs sampled)
lookup long static route x 3,406,235 ops/sec ±0.77% (588 runs sampled)
lookup long dynamic route x 2,338,285 ops/sec ±1.60% (589 runs sampled)
lookup static versioned route x 9,239,314 ops/sec ±0.40% (586 runs sampled)
find static route x 35,230,842 ops/sec ±0.92% (578 runs sampled)
find dynamic route x 4,469,776 ops/sec ±0.33% (590 runs sampled)
find dynamic multi-parametric route x 2,237,214 ops/sec ±1.39% (585 runs sampled)
find dynamic multi-parametric route with regex x 1,533,243 ops/sec ±1.04% (581 runs sampled)
find long static route x 4,585,833 ops/sec ±0.51% (588 runs sampled)
find long dynamic route x 3,491,155 ops/sec ±0.45% (589 runs sampled)
find static versioned route x 10,801,810 ops/sec ±0.89% (580 runs sampled)
find long nested dynamic route x 1,418,610 ops/sec ±0.68% (588 runs sampled)
find long nested dynamic route with other method x 2,499,722 ops/sec ±0.38% (587 runs sampled)
```
Force-pushed from ce7f7cc to acf1283
Ok, with the latest version of this branch I'm up to
lgtm
@delvedor I would love to get this in, as it makes a bunch of other changes I'd like to propose clearer! Anything I can do to complete the process?
LGTM
Thanks @airhorns!
This makes pretty printing annoying, but increases performance!
With n trees instead of one tree, each tree is only split for handlers it actually has, so for HTTP verbs like POST or PUT that tend to have fewer routes, the trees are smaller and faster to traverse. For the HTTP GET tree, there are fewer nodes and I think better cache locality as that tree is traversed the most often. Each verb doesn't pay any traversal penalty for the other trees' size. This also results in more instances of more selective version stores, which means traversing them should be faster at the expense of a bit more memory consumption.
This also makes the constraints implementation (see #166) easier, and prevents bugs like #132, and avoids the extra checks we have to do to fix that bug.
This also prevents tree traversal for methods where there are no routes at all, which is a small optimization but kinda nice regardless.
For the pretty printing algorithm, I think a nice pretty print wouldn't be per method and would instead show all routes in the same list, so I added code to merge the separate node trees and then pretty print the merged tree! To make it look pretty I added some "compression" to the tree, where branches that only have one child branch get compressed down, which if you ask me results in some prettier output; see the tests.
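As a rough illustration of the shape of the change, here is a simplified sketch of the per-method-tree idea. This is not find-my-way's actual internals; `FlatTree` is a trivial stand-in for the real radix-tree nodes, just so the sketch runs end to end:

```js
// Trivial stand-in for a per-method radix tree: a flat map from path to handler.
function FlatTree () {
  this.routes = new Map()
}
FlatTree.prototype.insert = function (path, handler) {
  this.routes.set(path, handler)
}
FlatTree.prototype.search = function (path) {
  return this.routes.get(path) || null
}

function Router () {
  this.trees = {} // one tree per HTTP method, created lazily
}

Router.prototype.on = function (method, path, handler) {
  if (this.trees[method] === undefined) {
    this.trees[method] = new FlatTree()
  }
  // Only this method's tree grows, so a pile of GET routes never makes the
  // POST tree any deeper or slower to traverse.
  this.trees[method].insert(path, handler)
}

Router.prototype.find = function (method, path) {
  const tree = this.trees[method]
  // A method with no registered routes skips traversal entirely.
  if (tree === undefined) return null
  return tree.search(path)
}

// Usage:
const router = new Router()
router.on('GET', '/health', () => 'ok')
console.log(router.find('GET', '/health'))  // logs the handler function
console.log(router.find('POST', '/health')) // logs null, since no POST tree exists
```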
I split this out of the work I've been doing to complete the request constraints stuff from #166. It hasn't changed routing behaviour at all, just the performance characteristics, so I don't think it needs a major release. The pretty print output has changed somewhat, but I'm not sure if that counts as a breaking change.