Request based constraints implementation #166

AyoubElk · 2020-10-03T15:51:07Z

This PR adds support for request-based constraints, following the discussions in issues fastify#2498 and find-my-way#165.

Please note that this is still a WIP and tests are failing, but I'd like to start a discussion and get some feedback and direction.

01. API updates:

a. Constructor:

Instead of opts.versioning that allows specifying the versioning strategy, the constructor now takes opts.constrainingStrategies that allows the user to override default strategies with custom ones.

The strategy object maintains the same structure as previously, with a single change of deriveVersion to deriveConstraint to be more generic.

The default strategies supported are version and host.

Right now, host only supports exact matches.
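As a rough illustration of the strategy shape described above, here is a minimal sketch of an exact-match host strategy. `HostStore` and its `get`/`set` methods are assumptions made for this example, not the code in the PR; only the `name`, `storage` and `deriveConstraint` keys come from the description.

```javascript
// Hypothetical exact-match storage; the store used in the PR may differ.
function HostStore () {
  this.hosts = new Map()
}
HostStore.prototype.set = function (host, store) {
  this.hosts.set(host, store)
  return this
}
HostStore.prototype.get = function (host) {
  return this.hosts.get(host) || null
}

// Strategy shape as described above: a name, a storage and deriveConstraint.
const hostStrategy = {
  name: 'host',
  storage: HostStore,
  deriveConstraint: function (req, ctx) {
    return req.headers.host
  }
}

const Storage = hostStrategy.storage
const hostStorage = new Storage()
hostStorage.set('auth.airhorns.dev', { GET: 'authHandler' })

const derivedHost = hostStrategy.deriveConstraint({ headers: { host: 'auth.airhorns.dev' } })
const stored = hostStorage.get(derivedHost)
```

Because the lookup is a plain `Map` access, anything other than an identical host string misses.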

b. Route registration (.on):

When registering a route, the user can provide a constraints property in the opts parameter. This property is a flat object of key-value constraints:

{
	version: '1.0.0',
	host: 'auth.airhorns.dev'
}

c. Routing (.find):

When finding a route, the find method now expects a function constraintsExtractor instead of a constraints object.

In the current codebase, the versioned route storage is only used if deriveVersion returns a non-empty value.

This works well for the version constraint, which has to be explicitly set in the accept-version header. Other constraints behave differently: the host header, for example, always contains a value, which would imply that the constrained storage is always used.

In order to avoid that, the new solution that would support all use cases needs to derive the constraints for each tree node we check, since each node knows which constraints it supports.

The constraintsExtractor method expects a list of constraint keys that should be extracted from the request.
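The idea can be sketched as follows. This is only an illustration under assumptions: the `strategies` object and the body of `getConstraintsExtractor` are written for this example and are not taken from the PR.

```javascript
// Hypothetical strategies keyed by constraint name.
const strategies = {
  version: { deriveConstraint: (req, ctx) => req.headers['accept-version'] },
  host: { deriveConstraint: (req, ctx) => req.headers.host }
}

// Returns a function bound to the request that derives only the requested keys.
function getConstraintsExtractor (req, ctx) {
  return function constraintsExtractor (keys) {
    const derived = {}
    let found = false
    for (let i = 0; i < keys.length; i++) {
      const value = strategies[keys[i]].deriveConstraint(req, ctx)
      if (value) {
        derived[keys[i]] = value
        found = true
      }
    }
    return found ? derived : null
  }
}

const req = { headers: { host: 'auth.airhorns.dev' } }
const extract = getConstraintsExtractor(req, {})
const forHostNode = extract(['host'])
const forVersionNode = extract(['version'])
```

A node that only knows about `host` gets `{ host: 'auth.airhorns.dev' }`, while a node that only knows about `version` gets `null` because the request carries no accept-version header.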

02. Internals:

Internally, these are some of the main changes that were made:

a. accept-constraints:

Function that takes an optional strategies object, allowing the user to override/extend the existing strategies. Returns an object that allows creating instances of the ConstraintsStore bound to the available strategies via the storage property, and getConstraintsExtractor function that returns a constraintsExtractor function with access to the request object.

b. constraints-store:

An instance of ConstraintsStore is created for each node. It takes a list of the strategies it should use, and each strategy has its own storage.

ConstraintsStore exposes 2 methods:

set(constraints, store): Given a constraints object, we loop through its properties and call the appropriate strategy's set() method. We store the constraints object as well, which we'll need in the .get() method:

https://github.com/AyoubElk/find-my-way/blob/edd00c98f5eaaad5fd6f09b7ac645e3d2ab8d9bf/lib/constraints-store.js#L19-L22
get(constraints, method): Given a constraints object, calling the get method will result in a call to the get method of all the supported strategies. Since the strategies store handlers independently, if one of them returns null then we can know that no match that fulfills all constraints is found and we return null, otherwise we return the stored handlers.

https://github.com/AyoubElk/find-my-way/blob/edd00c98f5eaaad5fd6f09b7ac645e3d2ab8d9bf/lib/constraints-store.js#L38
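A simplified sketch of the set/get behaviour described above. `ExactStore` is a stand-in storage invented for this example, and the extra constraints comparison mentioned below is left out on purpose.

```javascript
// Minimal exact-match storage used by every strategy in this sketch (an assumption).
function ExactStore () { this.map = new Map() }
ExactStore.prototype.set = function (key, value) { this.map.set(key, value) }
ExactStore.prototype.get = function (key) { return this.map.get(key) || null }

function ConstraintsStore (names) {
  this.strategies = {}
  for (let i = 0; i < names.length; i++) {
    this.strategies[names[i]] = new ExactStore()
  }
}

// Store the same handlers object under every constraint that was provided.
ConstraintsStore.prototype.set = function (constraints, store) {
  const keys = Object.keys(constraints)
  for (let i = 0; i < keys.length; i++) {
    this.strategies[keys[i]].set(constraints[keys[i]], store)
  }
  return this
}

// Return the handler for `method` only if every constraint finds a stored entry.
ConstraintsStore.prototype.get = function (constraints, method) {
  const keys = Object.keys(constraints)
  let found = null
  for (let i = 0; i < keys.length; i++) {
    found = this.strategies[keys[i]].get(constraints[keys[i]])
    if (found === null || !found[method]) return null
  }
  return found === null ? null : found[method]
}

const store = new ConstraintsStore(['version', 'host'])
store.set({ version: '1.0.0', host: 'auth.airhorns.dev' }, { GET: 'handler' })
const hit = store.get({ version: '1.0.0', host: 'auth.airhorns.dev' }, 'GET')
const miss = store.get({ version: '1.0.0', host: 'www.airhorns.dev' }, 'GET')
```

Here `hit` is the stored handler and `miss` is `null`, since the host strategy has nothing stored for `www.airhorns.dev`.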

We also compare the given constraints object with the one we stored previously, but there seems to be an edge case here that I haven't considered before writing this.

https://github.com/AyoubElk/find-my-way/blob/edd00c98f5eaaad5fd6f09b7ac645e3d2ab8d9bf/lib/constraints-store.js#L39-L40

Consider these routes:
```
router.on('GET', '/users', { constraints: { version: '1.0.0' } }, handler)
router.on('GET', '/users', { constraints: { host: 'auth.airhorns.dev' } }, handler)
```
Should this scenario be possible? If so, and given a request that can match both, which one should be matched?

c. node:

The main changes here are two new methods and one new property:

kConstraints: A set that contains the constraint keys applicable to this node. But given the edge case mentioned above, it would be better to have an extra property that stores an array of constraints and try to match starting from the most to the least restricted routes. What do you think?
getMatchingHandler(constraintsExtractor, method): Extracts the applicable constraints from the req object, if any, and calls getConstraintsHandler(). If no matching handler is found, it calls getHandler() as well, in case a non-constrained route is available.
findMatchingChild(constraintsExtractor, path, method): replaces both findChild() and findVersionChild()

Looking forward to hearing your thoughts!

…the generic name deriveConstraint

…ned handler retrieval

…hild()

mcollina

This is missing tests, so it's a bit hard to know how this would be used. I've left a few comments.

Does this worsen the benchmarks? I think it does from looking at code, but you should be able to optimize it.

cc @delvedor

mcollina · 2020-10-04T09:53:41Z

index.js

-    assert(typeof opts.version === 'string', 'Version should be a string')
+  // constraints validation
+  if (opts.constraints !== undefined) {
+    // TODO: Support more explicit validation?


What do you mean?

I meant should the validation code check the properties inside opts.constraints and make sure there are strategies to handle each one of them?

Yes please, thanks.
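One possible shape for that check, as a sketch only. `validateConstraints` and the `knownStrategies` object are hypothetical names used for illustration.

```javascript
// Hypothetical helper: reject constraint keys that no strategy can handle.
function validateConstraints (constraints, strategies) {
  if (constraints === null || typeof constraints !== 'object') {
    throw new TypeError('Constraints should be an object')
  }
  const keys = Object.keys(constraints)
  for (let i = 0; i < keys.length; i++) {
    if (!Object.prototype.hasOwnProperty.call(strategies, keys[i])) {
      throw new Error('No strategy registered for constraint "' + keys[i] + '"')
    }
  }
}

const knownStrategies = { version: {}, host: {} }

// A known key passes silently.
validateConstraints({ version: '1.0.0' }, knownStrategies)

// An unknown key is rejected at registration time.
let validationError = null
try {
  validateConstraints({ region: 'eu' }, knownStrategies)
} catch (err) {
  validationError = err
}
```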

mcollina · 2020-10-04T09:55:15Z

lib/constraints-store.js

@@ -0,0 +1,47 @@
+'use strict'
+
+const { assert } = require('console')


Why is this from console?

Good catch, it was auto-imported by mistake.

mcollina · 2020-10-04T09:55:27Z

lib/constraints-store.js

+}
+
+ConstraintsStore.prototype.set = function (constraints, store) {
+  // TODO: Should I check for existence of at least one constraint?


Yes, thanks

mcollina · 2020-10-04T09:55:49Z

lib/constraints-store.js

+    throw new TypeError('Constraints should be an object')
+  }
+
+  Object.keys(constraints).forEach(kConstraint => {


Please use a for(;;) loop
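For reference, the requested change could look like the sketch below. `setOnStrategy` is a stand-in invented here so the snippet runs on its own; in the PR the loop body calls the matching strategy's set() method.

```javascript
// Stand-in for a strategy's set() call, used only to make the sketch runnable.
const calls = []
function setOnStrategy (name, value) {
  calls.push(name + '=' + value)
}

const constraints = { version: '1.0.0', host: 'auth.airhorns.dev' }

// Indexed loop instead of Object.keys(constraints).forEach(...)
const keys = Object.keys(constraints)
for (let i = 0; i < keys.length; i++) {
  const kConstraint = keys[i]
  setOnStrategy(kConstraint, constraints[kConstraint])
}
```

The indexed loop does the same work without invoking a callback per key.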

mcollina · 2020-10-04T09:56:54Z

lib/constraints-store.js

+    const storedObject = this.strategies[kConstraint].get(constraints[kConstraint])
+    if (!storedObject || !storedObject.store || !storedObject.store[method]) return null
+    // TODO: Order of properties may result in inequality
+    if (JSON.stringify(constraints) !== JSON.stringify(storedObject.constraints)) return null


Not only does it suffer from the problem you described in the comment, it is also highly inefficient. What is the purpose of this check? Maybe we can do without it?

To illustrate the purpose of that check, let's take these routes as an example:

router.on('GET', '/users', { constraints: { version: '2.0.0', host: 'www.airhorns.dev' } }, handler) // Route 1
router.on('GET', '/users', { constraints: { version: '1.*.0', host: 'auth.airhorns.dev' } }, handler) // Route 2

Using the constraints extracted from a request: { version: '1.0.0', host: 'www.airhorns.dev' }, when we call the get method of the ConstraintsStore, each constraint is checked separately using its defined strategy, so in this case the version constraint will match route 2, while the host constraint will match route 1. Adding this comparison allows us to avoid this edge case.

I agree that it isn't the most efficient solution. In fact, now that I think of it, it's wrong, because the comparison will fail in non-exact match cases (in this example the version constraint has a wildcard, while the request has the exact version).

I'll try to come up with an alternative.

airhorns · 2020-10-08T12:47:47Z

Been talking with @AyoubElk off platform, want to post some answers here so all can see:

Considering these two routes
router.on('GET', '/users', { constraints: { version: '1.0.0' } }, handler)
router.on('GET', '/users', { constraints: { host: 'auth.airhorns.dev' } }, handler2)
Should this scenario be possible? If so, and given a request that can match both, which one should be matched?

For a request that could match both of the constraint sets, I think we should match what fastify does now for route matching precedence, which is just use whichever route is matched first. I forget if it's the first defined route or the last defined route, but, there is an ordering in which the routes were all required and registered and I assume some existing semantics for in what order the routes are tested. If those semantics don't exist already and it's undefined behaviour, then I think we should actually match that too in order to not make a bigger breaking change. I hope it doesn't depend on object key insertion order, though that is stable in node now.

In a similar vein, if the user can declare multiple routes with the same method/path but different constraints for each, I'd assume we'll match the most restrictive route:
In this case, if route 1 (has 1 constraint) and route 3 (has 2 constraints) match the request, we'll use the 3rd:
router.on('GET', '/users', { constraints: { version: '1.0.0' } }, handler)
router.on('GET', '/users', { constraints: { host: 'admin.airhorns.dev' } }, handler2)
router.on('GET', '/users', { constraints: { host: 'auth.airhorns.dev', version: '1.0.0' } }, handler3)
In the previous example however, we have 1 constraint for each one of the routes, so we can't base the matching on the number of constraints. Maybe the order of the routes should then be used and go with the last defined route?

I'd say the same thing: match the first route checked that passes the constraint checks, and then hopefully there's already a well defined order that routes are checked in. I think any other algorithm around most restrictive is going to be unpredictable and therefore surprising because restrictive might mean different things to different people. I think as a developer if I defined two routes in a file, and then added a constraint to the second route, which caused it to start getting checked first and matched first, I'd be surprised.

FWIW this ordered route matching is how the Rails router works, I am a one trick pony when it comes to this kind of thing.

…mprovement

AyoubElk · 2020-10-12T15:04:51Z

Thanks @airhorns for the detailed explanation.

I think we should match what fastify does now for route matching precedence, which is just use whichever route is matched first. I forget if it's the first defined route or the last defined route, but, there is an ordering in which the routes were all required and registered and I assume some existing semantics for in what order the routes are tested.
If those semantics don't exist already and it's undefined behaviour, then I think we should actually match that too in order to not make a bigger breaking change. I hope it doesn't depend on object key insertion order, though that is stable in node now.

Right now, find-my-way matches the routes following this order, which is based solely on the route path.

This allows it to deterministically figure out which node to match, but not which handler inside that node should be returned in case different ones are available.

I'd say the same thing: match the first route checked that passes the constraint checks, and then hopefully there's already a well defined order that routes are checked in. I think any other algorithm around most restrictive is going to be unpredictable and therefore surprising because restrictive might mean different things to different people. I think as a developer if I defined two routes in a file, and then added a constraint to the second route, which caused it to start getting checked first and matched first, I'd be surprised.
FWIW this ordered route matching is how the Rails router works, I am a one trick pony when it comes to this kind of thing.

In the example I provided, we have the same method and path, which will always be matched to the same node. At the node-level, a few things happen:

During route registration: we take note of which constraints exist in the property: kConstraints (This is necessary because constraints like host will always be present in the request, so we need a way to figure out which constraints to use for each node)
During routing: we derive constraints from the request and then for each node we only pick the constraints that are relevant to it.

Thanks to this, matching the first handler that passes the constraints would be simple to implement and deterministic, because the order at which the routes are registered is the same order of the constraints available in the kConstraints property.

If no other solution is suggested, I'll implement it this way.
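A minimal sketch of "first registered handler that passes wins". It assumes exact equality for every constraint (no semver ranges or host wildcards), and `register` / `findFirstMatch` are hypothetical names, not functions from the PR.

```javascript
// Hypothetical per-node list of constrained handlers, kept in registration order.
const constrainedHandlers = []

function register (constraints, handler) {
  constrainedHandlers.push({ constraints, handler })
}

// Return the first registered handler whose constraints all equal the derived values.
function findFirstMatch (derived) {
  for (let i = 0; i < constrainedHandlers.length; i++) {
    const { constraints, handler } = constrainedHandlers[i]
    const keys = Object.keys(constraints)
    let matches = true
    for (let j = 0; j < keys.length; j++) {
      if (derived[keys[j]] !== constraints[keys[j]]) {
        matches = false
        break
      }
    }
    if (matches) return handler
  }
  return null
}

register({ version: '1.0.0' }, 'handler')
register({ host: 'auth.airhorns.dev' }, 'handler2')

// Both routes match this request; the one registered first is returned.
const winner = findFirstMatch({ version: '1.0.0', host: 'auth.airhorns.dev' })
```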

AyoubElk · 2020-10-12T15:18:28Z

Updates

I've made some updates during this weekend to fix some issues I encountered and make some improvements based on benchmark results.

This is an overview and explanation of some of the main changes:

1. Replace `getConstraintsExtractor` with `deriveConstraints`

Previously, lookup called getConstraintsExtractor(req, ctx) and passed the resulting function constraintsExtractor(kConstraints) to the find method.

This is necessary because constraints like host will always be present in the request, and we need a way to allow each node to extract the constraints relevant to it.

However, this also meant that we had to derive the constraints from the request for each node that needs to do so, which adds unnecessary overhead.

Right now, deriveConstraints is called only once, and an object with all the derived constraints is passed down to nodes which can then access only the properties they need from it.
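Roughly, the new flow could look like this sketch. The strategy list and the `pickForNode` helper are assumptions made for the example; only the "derive once, let each node read what it needs" idea comes from the description above.

```javascript
// Derive every constraint once per request (the strategy list is an assumption).
const strategies = [
  { name: 'version', deriveConstraint: (req) => req.headers['accept-version'] },
  { name: 'host', deriveConstraint: (req) => req.headers.host }
]

function deriveConstraints (req, ctx) {
  const derived = {}
  for (let i = 0; i < strategies.length; i++) {
    const value = strategies[i].deriveConstraint(req, ctx)
    if (value) derived[strategies[i].name] = value
  }
  return derived
}

// Each node keeps only the keys it registered constraints for.
function pickForNode (derived, kConstraints) {
  const picked = {}
  let count = 0
  for (const key of kConstraints) {
    if (derived[key] !== undefined) {
      picked[key] = derived[key]
      count++
    }
  }
  return count > 0 ? picked : null
}

const derived = deriveConstraints({
  headers: { host: 'auth.airhorns.dev', 'accept-version': '1.x' }
}, {})
const hostOnlyNode = pickForNode(derived, new Set(['host']))
const unconstrainedNode = pickForNode(derived, new Set())
```

The request is inspected once, and a node without constraints ends up with `null` even though the host header is always present.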

2. Update `constraintsStore` to centralize store storage in a shared map

This was done mainly to solve the issue described in this comment. We needed a way to make sure the different constraints actually stored the same handler.

Instead of stringifying and comparing the constraints object (which is an inefficient operation), I added a shared map property where each store is mapped to a unique id, and that id is passed down to the store of each strategy used.

We can now simply compare those ids.
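The id comparison can be sketched like this. It is a simplified model: `ExactStore` is a made-up storage, and each constraint value maps to a single id, so two routes sharing a value would overwrite each other here. A real store would need to keep several ids per value.

```javascript
// Stand-in per-strategy storage that maps a constraint value to a store id.
function ExactStore () { this.map = new Map() }
ExactStore.prototype.set = function (key, id) { this.map.set(key, id) }
ExactStore.prototype.get = function (key) {
  return this.map.has(key) ? this.map.get(key) : null
}

function ConstraintsStore (names) {
  this.strategies = {}
  this.stores = new Map() // shared map: id -> handlers object
  this.nextId = 0
  for (let i = 0; i < names.length; i++) this.strategies[names[i]] = new ExactStore()
}

ConstraintsStore.prototype.set = function (constraints, store) {
  const id = this.nextId++
  this.stores.set(id, store)
  const keys = Object.keys(constraints)
  for (let i = 0; i < keys.length; i++) {
    this.strategies[keys[i]].set(constraints[keys[i]], id)
  }
}

ConstraintsStore.prototype.get = function (constraints, method) {
  const keys = Object.keys(constraints)
  let id = null
  for (let i = 0; i < keys.length; i++) {
    const current = this.strategies[keys[i]].get(constraints[keys[i]])
    if (current === null) return null
    if (id !== null && current !== id) return null // strategies point at different routes
    id = current
  }
  if (id === null) return null
  return this.stores.get(id)[method] || null
}

const store = new ConstraintsStore(['version', 'host'])
store.set({ version: '2.0.0', host: 'www.airhorns.dev' }, { GET: 'handler1' })
store.set({ version: '1.0.0', host: 'auth.airhorns.dev' }, { GET: 'handler2' })

// version points at route 2, host points at route 1: no single route satisfies both.
const mixed = store.get({ version: '1.0.0', host: 'www.airhorns.dev' }, 'GET')
const exact = store.get({ version: '2.0.0', host: 'www.airhorns.dev' }, 'GET')
```

Comparing two integers replaces the JSON.stringify comparison and does not depend on property order.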

3. Change strategy object format

When running the benchmark, I tracked the main performance degradation down to the deriveConstraints function, which dropped performance by half simply by looping through strategies and calling deriveConstraint for each one of them.

deriveConstraints: function (req, ctx) {
  const derivedConstraints = {}
  let value, hasConstraint = false
  for (var i = 0; i < strategies.length; i++) {
    value = strategies[i].deriveConstraint(req, ctx)
    if (value) {
      hasConstraint = true
      derivedConstraints[strategies[i].name] = value
    }
  }

  return hasConstraint ? derivedConstraints : null
}

Interestingly, the issue happened only when accessing the strategies object/array (comparison n°3 below) using the loop variable, and disappeared when using a literal value. The issue also disappeared when keeping access using the loop variable and removing the call to the deriveConstraint function.

So it seemed to me that the combination of both dynamic property access and function call somehow caused the V8 engine to do some heavy operations that degraded performance.

I had never run into anything like this before, and from some reading I've done it seems that the main reason is V8's hidden classes, which are created and updated as the object changes so that the engine can avoid dynamic property lookups and apply optimizations such as inline caching.

Eventually, I figured out that providing the properties at the prototype level would allow V8 to know the object structure in advance. This worked and brought performance back up considerably (comparison n°4 below).

a. Using prototype:

function acceptVersion() { }
acceptVersion.prototype.name = 'version'
acceptVersion.prototype.storage = SemVerStore
acceptVersion.prototype.deriveConstraint = function (req, ctx) {
  return req.headers['accept-version']
}

b. Using a normal object:

{
  name: 'version',
  storage: SemVerStore,
  deriveConstraint: function (req, ctx) {
    return req.headers['accept-version']
  }
}

The prototype format works well, but it would mean that users have to provide any custom strategy in the new format (which is not as straightforward as a plain object).

I wasn't able to find any other solution that maintains performance levels, but if there is some other way you're aware of to implement/optimize this and keep the previous strategy format, please let me know.

Benchmarks comparison

On my MacBook Pro Retina Mid 2014 (i7, 16GB of RAM), these are the results of running the benchmarks:

1. Master branch

❯ node bench.js   
lookup static route x 39,864,317 ops/sec ±0.54% (586 runs sampled)
lookup dynamic route x 3,055,819 ops/sec ±0.39% (588 runs sampled)
lookup dynamic multi-parametric route x 1,584,679 ops/sec ±0.64% (583 runs sampled)
lookup dynamic multi-parametric route with regex x 1,158,698 ops/sec ±0.96% (584 runs sampled)
lookup long static route x 2,570,186 ops/sec ±0.41% (582 runs sampled)
lookup long dynamic route x 2,024,055 ops/sec ±0.33% (590 runs sampled)
lookup static versioned route x 7,154,573 ops/sec ±0.41% (585 runs sampled)
find static route x 29,247,517 ops/sec ±0.60% (573 runs sampled)
find dynamic route x 3,472,771 ops/sec ±0.41% (586 runs sampled)
find dynamic multi-parametric route x 1,874,547 ops/sec ±0.70% (590 runs sampled)
find dynamic multi-parametric route with regex x 1,407,869 ops/sec ±0.29% (590 runs sampled)
find long static route x 3,787,256 ops/sec ±0.24% (589 runs sampled)
find long dynamic route x 2,759,278 ops/sec ±0.27% (590 runs sampled)
find static versioned route x 8,826,557 ops/sec ±0.39% (590 runs sampled)

2. Request-based-constraints branch

❯ node bench.js
Routes registered successfully...
lookup static route x 34,997,182 ops/sec ±0.37% (587 runs sampled)
lookup dynamic route x 2,862,363 ops/sec ±0.56% (585 runs sampled)
lookup dynamic multi-parametric route x 1,576,788 ops/sec ±0.25% (589 runs sampled)
lookup dynamic multi-parametric route with regex x 1,196,899 ops/sec ±0.31% (589 runs sampled)
lookup long static route x 2,475,820 ops/sec ±0.41% (586 runs sampled)
lookup long dynamic route x 1,906,568 ops/sec ±0.29% (589 runs sampled)
lookup static versioned route x 16,925,526 ops/sec ±0.46% (579 runs sampled)
lookup static constrained (version & host) route x 8,802,996 ops/sec ±0.55% (587 runs sampled)
find static route x 28,607,663 ops/sec ±0.59% (579 runs sampled)
find dynamic route x 3,449,980 ops/sec ±0.31% (588 runs sampled)
find dynamic multi-parametric route x 1,912,809 ops/sec ±0.37% (590 runs sampled)
find dynamic multi-parametric route with regex x 1,339,933 ops/sec ±0.78% (588 runs sampled)
find long static route x 3,428,143 ops/sec ±0.31% (587 runs sampled)
find long dynamic route x 2,635,874 ops/sec ±0.31% (584 runs sampled)
find static versioned route x 4,678,909 ops/sec ±0.32% (588 runs sampled)
find static constrained (version & host) route x 4,003,335 ops/sec ±0.37% (588 runs sampled)

3. Strategies object/array

Code for this can be found in the branch benchmark-strategy-object-vs-array, where I used a strategies object in acceptConstraints instead of an array.

You can notice a significant performance drop in the lines in bold (compared to request-based-constraints branch):

❯ node bench.js
Routes registered successfully...
**lookup static route x 18,470,823 ops/sec ±0.58% (586 runs sampled)**
lookup dynamic route x 2,544,969 ops/sec ±0.86% (585 runs sampled)
lookup dynamic multi-parametric route x 1,487,122 ops/sec ±0.49% (588 runs sampled)
lookup dynamic multi-parametric route with regex x 1,154,611 ops/sec ±0.32% (590 runs sampled)
lookup long static route x 2,320,796 ops/sec ±0.33% (587 runs sampled)
lookup long dynamic route x 1,834,193 ops/sec ±0.53% (582 runs sampled)
**lookup static versioned route x 11,161,286 ops/sec ±0.42% (583 runs sampled)
lookup static constrained (version & host) route x 6,775,507 ops/sec ±0.36% (584 runs sampled)**
find static route x 28,863,575 ops/sec ±0.62% (578 runs sampled)
find dynamic route x 3,445,673 ops/sec ±0.47% (587 runs sampled)
find dynamic multi-parametric route x 1,894,014 ops/sec ±0.35% (588 runs sampled)
find dynamic multi-parametric route with regex x 1,353,584 ops/sec ±0.26% (589 runs sampled)
find long static route x 3,525,118 ops/sec ±0.29% (588 runs sampled)
find long dynamic route x 2,609,847 ops/sec ±1.00% (588 runs sampled)
find static versioned route x 4,626,360 ops/sec ±0.42% (586 runs sampled)
find static constrained (version & host) route x 3,760,848 ops/sec ±0.50% (585 runs sampled)

4. Strategy format object vs prototype

Code for this can be found in the branch benchmark-strategy-object-vs-prototype, where I defined strategies as simple objects instead of using prototype.

You can notice a significant performance drop in the lines in bold (compared to request-based-constraints branch):

❯ node bench.js
Routes registered successfully...
**lookup static route x 19,416,833 ops/sec ±0.30% (589 runs sampled)**
lookup dynamic route x 2,739,224 ops/sec ±0.32% (587 runs sampled)
lookup dynamic multi-parametric route x 1,547,496 ops/sec ±0.42% (590 runs sampled)
lookup dynamic multi-parametric route with regex x 1,195,527 ops/sec ±0.24% (590 runs sampled)
lookup long static route x 2,433,980 ops/sec ±0.17% (590 runs sampled)
lookup long dynamic route x 1,885,178 ops/sec ±0.30% (588 runs sampled)
**lookup static versioned route x 3,975,875 ops/sec ±0.37% (586 runs sampled)
lookup static constrained (version & host) route x 2,537,698 ops/sec ±0.42% (590 runs sampled)**
find static route x 30,144,666 ops/sec ±0.62% (576 runs sampled)
find dynamic route x 3,513,484 ops/sec ±0.35% (584 runs sampled)
find dynamic multi-parametric route x 1,835,354 ops/sec ±0.62% (584 runs sampled)
find dynamic multi-parametric route with regex x 1,304,781 ops/sec ±0.43% (591 runs sampled)
find long static route x 3,536,402 ops/sec ±0.22% (591 runs sampled)
find long dynamic route x 2,697,523 ops/sec ±0.17% (590 runs sampled)
find static versioned route x 4,922,200 ops/sec ±0.19% (590 runs sampled)
find static constrained (version & host) route x 4,030,548 ops/sec ±0.39% (588 runs sampled)

Next steps

This is still a WIP, I'd appreciate any feedback, especially regarding the issue described in the last comment by @airhorns.

Also, it seems that the benchmark returns inconsistent results, depending on which tests we run. I'm not sure if this is the normal behaviour, but I expected that test executions should be independent from each other.

Below are some example outputs when I comment out tests in bench.js for request-based-constraints branch:

❯ node bench.js
Routes registered successfully...
lookup static versioned route x 36,250,898 ops/sec ±0.28% (591 runs sampled)
find static versioned route x 7,019,433 ops/sec ±0.44% (588 runs sampled)

❯ node bench.js
Routes registered successfully...
lookup static versioned route x 34,855,606 ops/sec ±0.48% (585 runs sampled)
find static versioned route x 6,734,253 ops/sec ±1.31% (578 runs sampled)

❯ node bench.js
Routes registered successfully...
lookup static route x 33,630,554 ops/sec ±0.77% (585 runs sampled)
find static route x 50,909,812 ops/sec ±1.26% (564 runs sampled)

❯ node bench.js
Routes registered successfully...
lookup static route x 36,130,011 ops/sec ±0.38% (590 runs sampled)

❯ node bench.js
Routes registered successfully...
find static route x 49,153,773 ops/sec ±2.16% (589 runs sampled)

❯ node bench.js 
Routes registered successfully...
lookup static constrained (version & host) route x 16,030,286 ops/sec ±0.36% (590 runs sampled)

mcollina · 2020-10-13T09:15:07Z

I know the benchmarks could be quite flaky and they'll need some work. However, the performance regression is there. I don't understand if your other branch fixes it or not.

AyoubElk · 2020-10-13T12:05:07Z

I know the benchmarks could be quite flaky and they'll need some work.

Yeah, and interestingly lookup seems to outperform find calls, which seems counter-intuitive since lookup derives constraints and calls find.

However, the performance regression is there. I don't understand if your other branch fixes it or not.

@mcollina In the benchmark section, you can compare the master branch benchmarks with those of the request-based-constraints branch.

This table should make it easier to compare both (the bolded values are those where request-based-constraints branch outperformed the master branch):

| Benchmark | master | request-based-constraints |
| --- | --- | --- |
| lookup static route | 39,864,317 ops/sec ±0.54% (586 runs sampled) | 34,997,182 ops/sec ±0.37% (587 runs sampled) |
| lookup dynamic route | 3,055,819 ops/sec ±0.39% (588 runs sampled) | 2,862,363 ops/sec ±0.56% (585 runs sampled) |
| lookup dynamic multi-parametric route | 1,584,679 ops/sec ±0.64% (583 runs sampled) | 1,576,788 ops/sec ±0.25% (589 runs sampled) |
| lookup dynamic multi-parametric route with regex | 1,158,698 ops/sec ±0.96% (584 runs sampled) | **1,196,899 ops/sec ±0.31% (589 runs sampled)** |
| lookup long static route | 2,570,186 ops/sec ±0.41% (582 runs sampled) | 2,475,820 ops/sec ±0.41% (586 runs sampled) |
| lookup long dynamic route | 2,024,055 ops/sec ±0.33% (590 runs sampled) | 1,906,568 ops/sec ±0.29% (589 runs sampled) |
| lookup static versioned route | 7,154,573 ops/sec ±0.41% (585 runs sampled) | **16,925,526 ops/sec ±0.46% (579 runs sampled)** |
| lookup static constrained (version & host) route | - | 8,802,996 ops/sec ±0.55% (587 runs sampled) |
| find static route | 29,247,517 ops/sec ±0.60% (573 runs sampled) | 28,607,663 ops/sec ±0.59% (579 runs sampled) |
| find dynamic route | 3,472,771 ops/sec ±0.41% (586 runs sampled) | 3,449,980 ops/sec ±0.31% (588 runs sampled) |
| find dynamic multi-parametric route | 1,874,547 ops/sec ±0.70% (590 runs sampled) | **1,912,809 ops/sec ±0.37% (590 runs sampled)** |
| find dynamic multi-parametric route with regex | 1,407,869 ops/sec ±0.29% (590 runs sampled) | 1,339,933 ops/sec ±0.78% (588 runs sampled) |
| find long static route | 3,787,256 ops/sec ±0.24% (589 runs sampled) | 3,428,143 ops/sec ±0.31% (587 runs sampled) |
| find long dynamic route | 2,759,278 ops/sec ±0.27% (590 runs sampled) | 2,635,874 ops/sec ±0.31% (584 runs sampled) |
| find static versioned route | 8,826,557 ops/sec ±0.39% (590 runs sampled) | 4,678,909 ops/sec ±0.32% (588 runs sampled) |
| find static constrained (version & host) route | - | 4,003,335 ops/sec ±0.37% (588 runs sampled) |
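To put the differences in relative terms, a small script like the one below can compute the percentage change per benchmark. Only three rows are included as an example; the ops/sec figures are copied from the comparison above, and `percentChange` is a helper written for this sketch.

```javascript
// Ops/sec figures copied from the master vs request-based-constraints comparison.
const results = [
  { name: 'lookup static route', master: 39864317, branch: 34997182 },
  { name: 'lookup static versioned route', master: 7154573, branch: 16925526 },
  { name: 'find static versioned route', master: 8826557, branch: 4678909 }
]

// Relative change of the branch against master, in percent.
function percentChange (master, branch) {
  return ((branch - master) / master) * 100
}

const changes = results.map((r) => ({
  name: r.name,
  change: percentChange(r.master, r.branch).toFixed(1) + '%'
}))

console.log(changes)
```

Negative values are regressions against master and positive values are improvements.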

I'd like to note that in the current WIP code, the non-constrained routes only have the minimum constraint related code executed for them, which is:

The function deriveConstraints which is obviously necessary
Check at the node level of which constraints apply to determine if the normal or constrained handler store should be checked (Still not done, waiting for feedback on the issue outlined above by @airhorns)

This is the main reason for the slight performance drop in the non-constrained routes.

I understand the importance of performance for fastify/find-my-way and that we're aiming to maintain or even improve the benchmarks, but is there some acceptable regression amount that we can still work with? Or should the new code outperform the current one on all tests?

mcollina · 2020-10-13T12:08:14Z

I would say there is no regression amount that is acceptable in the static & dynamic cases. I'm happy to take some regression on the versioned route path as it's more generic (good work!).

…traints function for faster processing

… when provided for faster processing

…ynamically building deriveConstraints function in case of custom strategies

AyoubElk · 2020-10-18T17:43:03Z

I've made some updates since last time, mainly to improve performance:

Revert to previous strategy object format and add a function to convert it to prototype format
For default strategies (version and host), inline their constraints derivation inside the deriveConstraints function
In case custom strategies are provided, dynamically update the deriveConstraints function to add code for faster processing
Add regex matching for host store
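The first item, converting a plain strategy object to the prototype format, might look roughly like the sketch below. `toPrototypeStrategy` is a hypothetical name and the storage is a placeholder rather than the real SemVerStore. Whether this reproduces the speed-up reported earlier would need to be benchmarked.

```javascript
// Hypothetical converter: copy a plain strategy object onto a function prototype.
function toPrototypeStrategy (strategy) {
  function Strategy () {}
  Strategy.prototype.name = strategy.name
  Strategy.prototype.storage = strategy.storage
  Strategy.prototype.deriveConstraint = strategy.deriveConstraint
  return new Strategy()
}

// User-facing format: a plain object, as before.
const plain = {
  name: 'version',
  storage: function PlaceholderStore () {}, // placeholder, not the real SemVerStore
  deriveConstraint: function (req, ctx) {
    return req.headers['accept-version']
  }
}

const converted = toPrototypeStrategy(plain)
const derivedVersion = converted.deriveConstraint({ headers: { 'accept-version': '1.x' } })
```

Users keep writing plain objects, and the converted instance has no own properties because everything lives on its prototype.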

These are the current benchmarks:

// request-based-constraints branch
❯ node bench
Routes registered successfully...
lookup static route x 40,244,114 ops/sec ±0.43% (585 runs sampled)
lookup dynamic route x 2,908,414 ops/sec ±1.03% (587 runs sampled)
lookup dynamic multi-parametric route x 1,521,684 ops/sec ±0.93% (582 runs sampled)
lookup dynamic multi-parametric route with regex x 1,208,451 ops/sec ±0.39% (582 runs sampled)
lookup long static route x 2,532,517 ops/sec ±0.42% (588 runs sampled)
lookup long dynamic route x 2,013,666 ops/sec ±0.37% (588 runs sampled)
lookup static versioned route x 4,489,601 ops/sec ±0.30% (587 runs sampled)
lookup static constrained (version & host) route x 4,365,146 ops/sec ±0.55% (581 runs sampled)
find static route x 28,441,579 ops/sec ±1.28% (581 runs sampled)
find dynamic route x 3,542,673 ops/sec ±0.26% (589 runs sampled)
find dynamic multi-parametric route x 1,921,871 ops/sec ±0.24% (589 runs sampled)
find dynamic multi-parametric route with regex x 1,368,976 ops/sec ±0.26% (592 runs sampled)
find long static route x 3,508,879 ops/sec ±0.22% (591 runs sampled)
find long dynamic route x 2,692,937 ops/sec ±0.17% (591 runs sampled)
find static versioned route x 5,445,838 ops/sec ±0.21% (589 runs sampled)
find static constrained (version & host) route x 5,439,364 ops/sec ±0.20% (590 runs sampled)

// master branch
❯ node bench.js
lookup static route x 41,257,268 ops/sec ±0.53% (590 runs sampled)
lookup dynamic route x 2,992,159 ops/sec ±0.47% (587 runs sampled)
lookup dynamic multi-parametric route x 1,592,345 ops/sec ±0.88% (578 runs sampled)
lookup dynamic multi-parametric route with regex x 1,154,690 ops/sec ±0.57% (582 runs sampled)
lookup long static route x 2,639,660 ops/sec ±0.40% (585 runs sampled)
lookup long dynamic route x 2,041,826 ops/sec ±0.42% (587 runs sampled)
lookup static versioned route x 6,604,706 ops/sec ±0.59% (581 runs sampled)
find static route x 28,475,395 ops/sec ±1.07% (579 runs sampled)
find dynamic route x 3,429,277 ops/sec ±0.75% (588 runs sampled)
find dynamic multi-parametric route x 1,914,887 ops/sec ±0.33% (589 runs sampled)
find dynamic multi-parametric route with regex x 1,360,918 ops/sec ±0.50% (585 runs sampled)
find long static route x 3,687,065 ops/sec ±0.82% (587 runs sampled)
find long dynamic route x 2,755,498 ops/sec ±0.43% (582 runs sampled)
find static versioned route x 8,435,637 ops/sec ±0.35% (585 runs sampled)

There is still an open issue that needs to be discussed, which is defining a deterministic constrained route matching order:

It was first described in this comment: #166 (comment)

@airhorns made a good suggestion of matching the first route that matches the constraints. However, it seems that it won't work, because the SemVerStore, for example, doesn't return the first route that matches the constraints; it returns the handler with the highest version available (which isn't necessarily the first one defined).

router.on('GET', '/users', { constraints: { version: '1.1.0', host: 'admin.airhorns.dev' } }, handler1)
router.on('GET', '/users', { constraints: { version: '1.2.0', host: '*.airhorns.dev' } }, handler2)

For a request with accept-version='1.x' and host='admin.airhorns.dev':

The semver store will match the highest available version and return handler2
The host store will return handler1

In this case, the constraint store will have 2 different handlers, and will consider that no match was found and return null
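The conflict can be reproduced with a toy model. Both lookups below are simplifications written for this example: the version lookup only understands 'major.x' ranges and compares minor versions, and the host lookup is assumed to return the first registered pattern that matches.

```javascript
// Routes from the example above, in registration order.
const routes = [
  { id: 1, version: '1.1.0', host: 'admin.airhorns.dev' },
  { id: 2, version: '1.2.0', host: '*.airhorns.dev' }
]

// Simplified semver lookup: for '1.x', return the route with the highest 1.* version.
function versionStoreGet (range) {
  const major = range.split('.')[0]
  let best = null
  for (const route of routes) {
    const [routeMajor, minor] = route.version.split('.')
    if (routeMajor !== major) continue
    if (best === null || Number(minor) > Number(best.version.split('.')[1])) best = route
  }
  return best === null ? null : best.id
}

// Simplified host lookup: first registered route whose host pattern matches.
function hostStoreGet (host) {
  for (const route of routes) {
    const matches = route.host.startsWith('*.')
      ? host.endsWith(route.host.slice(1))
      : host === route.host
    if (matches) return route.id
  }
  return null
}

const versionId = versionStoreGet('1.x')
const hostId = hostStoreGet('admin.airhorns.dev')

// The stores disagree, so the combined lookup reports no match.
const result = versionId === hostId ? versionId : null
```

The version lookup picks route 2 and the host lookup picks route 1, so `result` is `null` even though route 2 satisfies both constraints.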

airhorns · 2020-10-19T15:07:49Z

I feel like the max version ordering for the semver store makes sense semantically and I think we shouldn't break it, but that is indeed annoying! Versions have an ordering where you can definitely sort one version as greater than another, but hosts don't, so I still think we need some other predictable way to sort the routes by a host constraint, and I think the most sensible way would be by addition order.

Would it be possible to just special case it? When there's more than one handler, check if there's a version constraint and use the max version if so, and otherwise just use the first one?
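A sketch of that special case, assuming the candidate routes have already passed all constraint checks and are listed in registration order. `pickHandler` and `compareVersions` are hypothetical helpers, and the version comparison only handles plain 'major.minor.patch' strings.

```javascript
// Compare two plain 'major.minor.patch' strings (no ranges or pre-release tags).
function compareVersions (a, b) {
  const pa = a.split('.').map(Number)
  const pb = b.split('.').map(Number)
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i]
  }
  return 0
}

// `candidates` are routes that already passed the constraint checks,
// listed in registration order.
function pickHandler (candidates) {
  if (candidates.length === 0) return null
  let best = null
  for (let i = 0; i < candidates.length; i++) {
    const version = candidates[i].constraints.version
    if (version === undefined) continue
    if (best === null || compareVersions(version, best.constraints.version) > 0) {
      best = candidates[i]
    }
  }
  // No version constraint anywhere: fall back to the first registered route.
  return (best || candidates[0]).handler
}

const byVersion = pickHandler([
  { constraints: { version: '1.1.0', host: 'admin.airhorns.dev' }, handler: 'handler1' },
  { constraints: { version: '1.2.0', host: '*.airhorns.dev' }, handler: 'handler2' }
])
const byOrder = pickHandler([
  { constraints: { host: 'admin.airhorns.dev' }, handler: 'handlerA' },
  { constraints: { host: '*.airhorns.dev' }, handler: 'handlerB' }
])
```

With version constraints present the highest version wins (`handler2`); without them the first registered route wins (`handlerA`).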

airhorns · 2020-10-24T20:14:42Z

@AyoubElk I don't mind taking a stab at the semantics I just mentioned -- do you have any unpushed code though that I should base changes on?

AyoubElk · 2020-10-25T13:04:43Z

> @AyoubElk I don't mind taking a stab at the semantics I just mentioned -- do you have any unpushed code though that I should base changes on?

@airhorns yes, I updated the tests to support the new constraints format (one test is currently failing: a default handler is returned when it shouldn't be, and I'm trying to come up with a fix for it).

Regarding the semantics you described, I think they might work as a special case, but they will probably yield unexpected results when custom strategies are added (the version strategy can be overridden, or the user might want another constraint to take precedence, etc.). I'm trying to figure out a better way to handle this as well.

…ndlers on each node

This makes pretty printing annoying, but increases performance! With n trees instead of one tree, each tree is only split for handlers it actually has, so for HTTP verbs like POST or PUT that tend to have fewer routes, the trees are smaller and faster to traverse. For the HTTP GET tree, there are fewer nodes and I think better cache locality as that tree is traversed the most often. Each verb doesn't pay any traversal penalty for the other trees' size. This also results in more instances of more selective version stores, which means traversing them should be faster at the expense of a bit more memory consumption.

This also makes the constraint implementation (see delvedor#166) easier, and prevents bugs like delvedor#132, and avoids the extra checks we have to do to fix that bug. This also prevents tree traversal for methods where there are no routes at all, which is a small optimization but kinda nice regardless.

For the pretty printing algorithm, I think a nice pretty print wouldn't be per method and would instead show all routes in the same list, so I added code to merge the separate node trees and then pretty print the merged tree! To make it look pretty I added some "compression" to the tree where branches that only had one branch get compressed down, which if you ask me results in some prettier output, see the tests.

Benchmarks:

```
kamloop ~/C/find-my-way (master) ➜ npm run bench; git checkout one-tree-per-method; npm run bench

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 42,774,309 ops/sec ±0.84% (580 runs sampled)
lookup dynamic route x 3,536,084 ops/sec ±0.70% (587 runs sampled)
lookup dynamic multi-parametric route x 1,842,343 ops/sec ±0.92% (587 runs sampled)
lookup dynamic multi-parametric route with regex x 1,477,768 ops/sec ±0.57% (590 runs sampled)
lookup long static route x 3,350,884 ops/sec ±0.62% (589 runs sampled)
lookup long dynamic route x 2,491,556 ops/sec ±0.63% (585 runs sampled)
lookup static versioned route x 9,241,735 ops/sec ±0.44% (586 runs sampled)
find static route x 36,660,039 ops/sec ±0.76% (581 runs sampled)
find dynamic route x 4,473,753 ops/sec ±0.72% (588 runs sampled)
find dynamic multi-parametric route x 2,202,207 ops/sec ±1.00% (578 runs sampled)
find dynamic multi-parametric route with regex x 1,680,101 ops/sec ±0.76% (579 runs sampled)
find long static route x 4,633,069 ops/sec ±1.04% (588 runs sampled)
find long dynamic route x 3,333,916 ops/sec ±0.76% (586 runs sampled)
find static versioned route x 10,779,325 ops/sec ±0.73% (586 runs sampled)
find long nested dynamic route x 1,379,726 ops/sec ±0.45% (587 runs sampled)
find long nested dynamic route with other method x 1,962,454 ops/sec ±0.97% (587 runs sampled)

> find-my-way@3.0.4 bench /Users/airhorns/Code/find-my-way
> node bench.js

lookup static route x 41,200,005 ops/sec ±0.98% (591 runs sampled)
lookup dynamic route x 3,553,160 ops/sec ±0.28% (591 runs sampled)
lookup dynamic multi-parametric route x 2,047,064 ops/sec ±0.83% (584 runs sampled)
lookup dynamic multi-parametric route with regex x 1,500,267 ops/sec ±0.64% (590 runs sampled)
lookup long static route x 3,406,235 ops/sec ±0.77% (588 runs sampled)
lookup long dynamic route x 2,338,285 ops/sec ±1.60% (589 runs sampled)
lookup static versioned route x 9,239,314 ops/sec ±0.40% (586 runs sampled)
find static route x 35,230,842 ops/sec ±0.92% (578 runs sampled)
find dynamic route x 4,469,776 ops/sec ±0.33% (590 runs sampled)
find dynamic multi-parametric route x 2,237,214 ops/sec ±1.39% (585 runs sampled)
find dynamic multi-parametric route with regex x 1,533,243 ops/sec ±1.04% (581 runs sampled)
find long static route x 4,585,833 ops/sec ±0.51% (588 runs sampled)
find long dynamic route x 3,491,155 ops/sec ±0.45% (589 runs sampled)
find static versioned route x 10,801,810 ops/sec ±0.89% (580 runs sampled)
find long nested dynamic route x 1,418,610 ops/sec ±0.68% (588 runs sampled)
find long nested dynamic route with other method x 2,499,722 ops/sec ±0.38% (587 runs sampled)
```
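
As a very rough illustration of the "one tree per method" idea described in that commit message, here is a toy sketch. It is not the find-my-way implementation: `TinyRouter` is a hypothetical name, and a plain `Map` of exact paths stands in for each per-method radix tree.

```javascript
'use strict'

// Toy router keeping one lookup structure per HTTP method.
class TinyRouter {
  constructor () {
    // method -> Map(path -> handler); no prototype, so any method name is a safe key
    this.trees = Object.create(null)
  }

  on (method, path, handler) {
    if (this.trees[method] === undefined) this.trees[method] = new Map()
    this.trees[method].set(path, handler)
  }

  find (method, path) {
    const tree = this.trees[method]
    // No routes at all for this verb: nothing to traverse.
    if (tree === undefined) return null
    const handler = tree.get(path)
    return handler === undefined ? null : handler
  }
}

const router = new TinyRouter()
const listUsers = () => 'list users'
router.on('GET', '/users', listUsers)

console.log(router.find('GET', '/users') === listUsers) // true
console.log(router.find('POST', '/users')) // null
```

The point of the sketch is only that a lookup for one verb never touches the structures built for the other verbs, and that a verb with no routes short-circuits before any traversal.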

AyoubElk added 16 commits October 3, 2020 11:21

move accept version to strategies folder and rename deriveVersion to the generic name deriveConstraint

f19a0db

add accept-host strategy with exact match support

1220b07

add constraints store

5544329

add accept-constraints

ab39b39

use constraintsStorage instead of version storage

b311ec0

add kConstraints property and update reset()

141f42f

update getVersionHandler() to support constraints

25f7b04

add getMatchingHandler() to support both constrained and non-constrained handler retrieval

c6b11d7

fix typo

eede77e

replace findChild() and findVersionChild() with generic findMatchingChild()

9db137b

update setVersionHandler() to support constraints

ac61bde

use acceptConstraints instead of acceptVersionStrategy

9c3d4d8

replace version usage with constraints

fe27b59

use constraintsExtractor

212c3ae

use getMatchingHandler() and findMatchingChild()

8650a7c

use getConstraintsHandler() and setConstraintsHandler()

edd00c9

mcollina reviewed Oct 4, 2020

mcollina added the semver-major label Oct 5, 2020

AyoubElk added 6 commits October 12, 2020 13:27

refactor version & host strategies to use prototype for performance improvement

edde841

replace getConstraintsExtractor with deriveConstraints

537b684

fix assert import

c05c763

update constraintsStore to centralize store storage in a shared map

b9954dc

update bench file to support new constraints format

303f2f3

match first handler that passes constraints check

4aa5990

AyoubElk added 14 commits October 15, 2020 16:21

revert to previous strategy format & remove deriveConstraint functions

1c49cac

add regex matching for host store

6abe279

add strategyObjectToPrototype function

35e9da8

convert stratgies to prototype format

65bfdcf

inline constraint derivation for default strategies inside deriveConstraints function for faster processing

ce44dcb

support conditional constraints derivation for custom strategies only when provided for faster processing

0e04077

remove unused array

8c66309

replace this.getHandler call for faster processing

ce0260d

replace let with var

7d7e0f2

update bench.js file to use null instead of undefined

52a3479

update Node.getMatchingHandler to add hasConstraint boolean

df674d0

update Node constructor options to support kConstraints

f809308

validate custom strategies format

159d941

optimize performance by inlining default constraints derivation and dynamically building deriveConstraints function in case of custom strategies

6f83041

lint code and update tests

bf4a5e1

airhorns mentioned this pull request Oct 27, 2020

Switch to using one tree per method instead of a map #168

Merged

This was referenced Oct 29, 2020

Pretty print handler names #169

Closed

Constrained routes take 2 #170

Merged

delvedor closed this in #170 Feb 9, 2021


		@@ -0,0 +1,47 @@
		'use strict'

		const { assert } = require('console')
