Switch to using one tree per method instead of a map #168
Conversation
Good work! I like this change.
I've left a few notes.
index.js (Outdated)
```js
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
```
Why the use of `new Function` here?
I was just copying the code that was already there -- this object with one property per HTTP method was used in Node for the handler map before. See c70b2e8. I think it's a little unclear, and since there's only one instance of it per router, I will switch it back to being a plain old constructor.
I take it back -- this makes a huge difference:
```js
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    var m = http.METHODS[i]
    code.push(`this['${m}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}

// Object with prototype slots for all the HTTP methods
const MethodMap = buildMethodMap()
```
gives me `lookup static route x 44,563,670 ops/sec ±0.88% (584 runs sampled)`, whereas a constructor that just loops over the methods gives me `lookup static route x 29,380,113 ops/sec ±1.03% (573 runs sampled)`.
It makes sense that every `find` touches this object, so the whole thing is very sensitive to its property access speed. You folks probably know better than me, but I am guessing that if V8 can see exactly what properties are set in the constructor, it gets the Shape right from the start, whereas if the properties are assigned dynamically, even in a constructor, it's the same as assigning them one at a time to an object literal, which runs into the nasty Shape chain problem.
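For concreteness, here is a sketch of the two constructor styles side by side. The looping variant is my reconstruction of the "constructor that just loops over the methods" mentioned above, not code from this PR:

```js
const http = require('http')

// Generated constructor, as in the snippet above: the property assignments are
// unrolled into the function body as literal statements, so the full set of
// keys is present in the source V8 compiles.
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    code.push(`this['${http.METHODS[i]}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
const MethodMap = buildMethodMap()

// Assumed reconstruction of the slower variant: the same keys are assigned,
// but dynamically in a loop, so (per the hypothesis above) the engine only
// discovers them one shape transition at a time while instances are built.
function LoopingMethodMap () {
  for (var i = 0; i < http.METHODS.length; i++) {
    this[http.METHODS[i]] = null
  }
}

// Both produce instances with identical own keys; only how the assignments
// appear to the engine differs.
console.log(Object.keys(new MethodMap()).join() === Object.keys(new LoopingMethodMap()).join()) // true
```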
This shows a 3% decrease in static route lookup; it should be solved.
@mcollina or @delvedor do you have any suggestions for what might be causing that small regression? I tried profiling locally and it's hard to assess because the
Also, I have gotten pretty widely varying benchmark results when running locally, to the point that I thought that static line was about the same. Do you trust the error bars on those numbers enough to say they really do need to match?
I think they are the same. I'm going to run them on a dedicated machine.
I found this the other day, which might be worth setting up: https://github.com/rhysd/github-action-benchmark. I don't think it would resolve the benchmark noise issue, because of noisy neighbours in the cloud, but it would at least mean contributors can let computers do the computer stuff of running and formatting the benchmarks all the time, instead of having to rerun them locally often and try not to touch their computer while they run :)
Force-pushed from 334be9f to 2314b77
Unfortunately it slows things down.
You'll need this patch:

```diff
diff --git a/index.js b/index.js
index a4e28c6..f4027de 100644
--- a/index.js
+++ b/index.js
@@ -26,18 +26,6 @@ if (!isRegexSafe(FULL_PATH_REGEXP)) {
 const acceptVersionStrategy = require('./lib/accept-version')
-function buildMethodMap () {
-  const code = []
-  for (var i = 0; i < http.METHODS.length; i++) {
-    var m = http.METHODS[i]
-    code.push(`this['${m}'] = null`)
-  }
-  return new Function(code.join('\n')) // eslint-disable-line
-}
-
-// Object with prototype slots for all the HTTP methods
-const MethodMap = buildMethodMap()
-
 function Router (opts) {
   if (!(this instanceof Router)) {
     return new Router(opts)
@@ -63,7 +51,7 @@ function Router (opts) {
   this.maxParamLength = opts.maxParamLength || 100
   this.allowUnsafeRegex = opts.allowUnsafeRegex || false
  this.versioning = opts.versioning || acceptVersionStrategy
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }
@@ -318,7 +306,7 @@ Router.prototype._insert = function _insert (method, path, kind, params, handler
 }
 Router.prototype.reset = function reset () {
-  this.trees = new MethodMap()
+  this.trees = {}
   this.routes = []
 }
```
Does it make sense to you why an object literal would be faster? Shouldn't it be slower because of the shape chain it creates?
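For what it's worth, a hypothetical micro-benchmark could isolate the property read the two approaches differ on. This is only a sketch using benchmark.js; the setup and names are mine, not the repo's bench.js, and a trivial read like this can be optimized away, so the repo's full `find()` benchmarks remain the real measure:

```js
'use strict'

const http = require('http')
const Benchmark = require('benchmark')

// Generated constructor, as discussed above: all method keys exist on the
// instance from the moment it is constructed.
function buildMethodMap () {
  const code = []
  for (var i = 0; i < http.METHODS.length; i++) {
    code.push(`this['${http.METHODS[i]}'] = null`)
  }
  return new Function(code.join('\n')) // eslint-disable-line
}
const MethodMap = buildMethodMap()

// Plain object literal, as on master: the key is added after creation,
// which triggers a shape transition.
const literal = {}
const mapped = new MethodMap()
literal.GET = mapped.GET = function handler () {}

new Benchmark.Suite()
  .add('object literal property read', function () {
    return literal.GET
  })
  .add('MethodMap instance property read', function () {
    return mapped.GET
  })
  .on('cycle', function (event) {
    console.log(String(event.target))
  })
  .run()
```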
Force-pushed from 2314b77 to ce7f7cc
…ndlers on each node

Benchmarks:

```
kamloop ~/C/find-my-way (master) ➜ npm run bench; git checkout one-tree-per-method; npm run bench

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 42,774,309 ops/sec ±0.84% (580 runs sampled)
lookup dynamic route x 3,536,084 ops/sec ±0.70% (587 runs sampled)
lookup dynamic multi-parametric route x 1,842,343 ops/sec ±0.92% (587 runs sampled)
lookup dynamic multi-parametric route with regex x 1,477,768 ops/sec ±0.57% (590 runs sampled)
lookup long static route x 3,350,884 ops/sec ±0.62% (589 runs sampled)
lookup long dynamic route x 2,491,556 ops/sec ±0.63% (585 runs sampled)
lookup static versioned route x 9,241,735 ops/sec ±0.44% (586 runs sampled)
find static route x 36,660,039 ops/sec ±0.76% (581 runs sampled)
find dynamic route x 4,473,753 ops/sec ±0.72% (588 runs sampled)
find dynamic multi-parametric route x 2,202,207 ops/sec ±1.00% (578 runs sampled)
find dynamic multi-parametric route with regex x 1,680,101 ops/sec ±0.76% (579 runs sampled)
find long static route x 4,633,069 ops/sec ±1.04% (588 runs sampled)
find long dynamic route x 3,333,916 ops/sec ±0.76% (586 runs sampled)
find static versioned route x 10,779,325 ops/sec ±0.73% (586 runs sampled)
find long nested dynamic route x 1,379,726 ops/sec ±0.45% (587 runs sampled)
find long nested dynamic route with other method x 1,962,454 ops/sec ±0.97% (587 runs sampled)

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 41,200,005 ops/sec ±0.98% (591 runs sampled)
lookup dynamic route x 3,553,160 ops/sec ±0.28% (591 runs sampled)
lookup dynamic multi-parametric route x 2,047,064 ops/sec ±0.83% (584 runs sampled)
lookup dynamic multi-parametric route with regex x 1,500,267 ops/sec ±0.64% (590 runs sampled)
lookup long static route x 3,406,235 ops/sec ±0.77% (588 runs sampled)
lookup long dynamic route x 2,338,285 ops/sec ±1.60% (589 runs sampled)
lookup static versioned route x 9,239,314 ops/sec ±0.40% (586 runs sampled)
find static route x 35,230,842 ops/sec ±0.92% (578 runs sampled)
find dynamic route x 4,469,776 ops/sec ±0.33% (590 runs sampled)
find dynamic multi-parametric route x 2,237,214 ops/sec ±1.39% (585 runs sampled)
find dynamic multi-parametric route with regex x 1,533,243 ops/sec ±1.04% (581 runs sampled)
find long static route x 4,585,833 ops/sec ±0.51% (588 runs sampled)
find long dynamic route x 3,491,155 ops/sec ±0.45% (589 runs sampled)
find static versioned route x 10,801,810 ops/sec ±0.89% (580 runs sampled)
find long nested dynamic route x 1,418,610 ops/sec ±0.68% (588 runs sampled)
find long nested dynamic route with other method x 2,499,722 ops/sec ±0.38% (587 runs sampled)
```
Force-pushed from ce7f7cc to acf1283
Ok, with the latest version of this branch I'm up to
lgtm
@delvedor I would love to get this in, as it makes a bunch of other changes I'd like to propose clearer! Anything I can do to complete the process?
LGTM
Thanks @airhorns!
This makes pretty printing annoying, but increases performance!
With n trees instead of one tree, each tree is only split for handlers it actually has, so for HTTP verbs like POST or PUT that tend to have fewer routes, the trees are smaller and faster to traverse. For the HTTP GET tree, there are fewer nodes and I think better cache locality as that tree is traversed the most often. Each verb doesn't pay any traversal penalty for the other trees' size. This also results in more instances of more selective version stores, which means traversing them should be faster at the expense of a bit more memory consumption.
This also makes the constraints implementation (see #166) easier, and prevents bugs like #132, and avoids the extra checks we have to do to fix that bug.
This also prevents tree traversal for methods where there are no routes at all, which is a small optimization but kinda nice regardless.
For the pretty printing algorithm, I think a nice pretty print wouldn't be per method and would instead show all routes in the same list, so I added code to merge the separate node trees and then pretty print the merged tree! To make it look pretty I added some "compression" to the tree, where branches that only have one child branch get compressed down, which if you ask me results in some prettier output; see the tests.
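As a rough illustration of the shape of the change, here is a simplified sketch of the per-method-tree idea. This is not find-my-way's actual internals; `FlatTree` is a trivial stand-in for the real radix-tree nodes, just so the sketch runs end to end:

```js
// Trivial stand-in for a per-method radix tree: a flat map from path to handler.
function FlatTree () {
  this.routes = new Map()
}
FlatTree.prototype.insert = function (path, handler) {
  this.routes.set(path, handler)
}
FlatTree.prototype.search = function (path) {
  return this.routes.get(path) || null
}

function Router () {
  this.trees = {} // one tree per HTTP method, created lazily
}

Router.prototype.on = function (method, path, handler) {
  if (this.trees[method] === undefined) {
    this.trees[method] = new FlatTree()
  }
  // Only this method's tree grows, so a pile of GET routes never makes the
  // POST tree any deeper or slower to traverse.
  this.trees[method].insert(path, handler)
}

Router.prototype.find = function (method, path) {
  const tree = this.trees[method]
  // A method with no registered routes skips traversal entirely.
  if (tree === undefined) return null
  return tree.search(path)
}

// Usage:
const router = new Router()
router.on('GET', '/health', () => 'ok')
console.log(router.find('GET', '/health'))  // logs the handler function
console.log(router.find('POST', '/health')) // logs null, since no POST tree exists
```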
I split this out of the work I've been doing to complete the request constraints stuff from #166. It hasn't changed routing behaviour at all, just the performance characteristics, so I don't think it needs a major release. The pretty print output has changed somewhat, but I'm not sure if that counts as a breaking change.