Support server selection priorities / weights #252

armon · 2014-07-18T16:58:03Z

If we support priorities, then we can support cases like a server in a remote region (for backup / quorum purposes). The priorities would disable client routing through such servers unless necessary (i.e. higher-priority servers have failed or are unreachable).
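
A minimal sketch of the priority idea described above, not Consul code. The `Server` type and its `priority` and `reachable` fields are hypothetical names chosen for illustration, and the sketch assumes a higher number means a more preferred server:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    priority: int    # hypothetical field: higher number = more preferred
    reachable: bool  # hypothetical field: result of some liveness check


def select_candidates(servers):
    """Return the reachable servers from the most-preferred tier that has any."""
    reachable = [s for s in servers if s.reachable]
    if not reachable:
        return []
    best = max(s.priority for s in reachable)
    return [s for s in reachable if s.priority == best]


servers = [
    Server("local-1", priority=10, reachable=False),
    Server("local-2", priority=10, reachable=False),
    Server("remote-backup", priority=1, reachable=True),
]
# Both local servers are down, so only the remote backup is returned.
candidates = select_candidates(servers)
```

With this shape, the remote server is never offered to clients while any higher-priority server is still reachable.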

morgante · 2014-08-28T08:05:58Z

It'd also be useful for gradually scaling in new versions of a service.

carlivar · 2014-08-28T17:19:49Z

I could see a use-case for weights as well. We have some different hardware profiles, and I would love to be able to weight nodes accordingly since we know performance for a particular piece of software is a certain percentage worse on certain hardware.

kcd83 · 2014-09-07T06:15:30Z

This could also allow failover for services where round robin isn't preferable or possible, e.g. failing over a load balancer for a service that has stickiness.

highlyunavailable · 2015-03-25T16:20:28Z

#488 is basically this as well, but just inside a service rather than cross-datacenter. +1 for this.

pepov · 2015-04-18T08:26:43Z

Would these priorities be static or dynamic?

sean- · 2015-06-08T13:37:43Z

👍 Being able to send <1% of traffic to a standby as a way of exercising DR paths would be great. It would also be a crude mechanism for handling different capabilities between hardware.

camerondavison · 2016-01-13T02:28:46Z

Can this be done with the new "network coordinates" or "prepared queries"? I was hoping that we could add something like this to accomplish #488.

slackpad · 2016-01-13T22:33:47Z

@a86c6f7964 prepared queries can help across datacenters for sure (using pre-configured fallbacks or network coordinates, or both). Within a datacenter, many HTTP endpoints now support the ?near= argument that lets you find the closest service, but this issue still stands for a more general weighting feature.

camerondavison · 2016-01-13T22:36:28Z

I guess I meant that the code that was added recently would make it easier to add this feature. Sorry for the miscommunication.

sean- · 2016-03-27T07:26:46Z

While not exactly server priorities, many on this issue have referenced automatic DC failover as a reason for wanting server priorities. As @slackpad mentioned earlier, with Prepared Queries this is possible, and has been made easier with Prepared Query Templates. We held a webinar and covered this at around minute 31.

https://www.youtube.com/watch?v=FGbzS6ripXA&feature=youtu.be&t=1690

doublerebel · 2016-03-27T18:20:19Z

👍
I arrived here looking for the solution to #1229. Prepared Queries are really cool and would solve both #1229 and #488 if we had one special tag for "local".

With tag local we could define remote as a prepared query template:

{
  "Name": "remote-",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.suffix}",
    "Tags": ["!local"]
  }
}

This is actually already possible if we merge the local implementation in PR #1231.

Then if we add failover-to-query, we can do priorities:

{
  "Name": "local-first-",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.suffix}",
    "Tags": ["local"],
    "Failover": {
      "Query": "remote-${name.suffix}"
    }
  }
}

With failover-to-query it could even finally failover to another DC if local and remote aren't available.
(Although with anything recursive someone is bound to shoot themselves in the foot with it, we may want to guard against that.)
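
A minimal sketch of how failover-to-query resolution could be guarded against loops. This is an illustration of the proposal, not an existing Consul feature: the `failover_query` key, the `resolve` function, the `max_depth` limit, and the in-memory `catalog` are all hypothetical, and a plain `remote` tag stands in for the `!local` filter to keep the example short:

```python
def resolve(name, queries, lookup, max_depth=8):
    """Follow failover-to-query links until one query returns nodes.

    Stops on a repeated query name (a cycle), an unknown name, or after
    max_depth hops, and returns an empty list in those cases.
    """
    seen = set()
    while name is not None and name not in seen and len(seen) < max_depth:
        seen.add(name)
        query = queries.get(name)
        if query is None:
            break
        nodes = lookup(query)
        if nodes:
            return nodes
        name = query.get("failover_query")
    return []


# Hypothetical in-memory catalog keyed by (service, tag).
catalog = {("web", "local"): [], ("web", "remote"): ["node-b"]}

queries = {
    "local-first-web": {"service": "web", "tag": "local",
                        "failover_query": "remote-web"},
    # Deliberately points back at the first query to form a cycle.
    "remote-web": {"service": "web", "tag": "remote",
                   "failover_query": "local-first-web"},
}


def lookup(query):
    return catalog.get((query["service"], query["tag"]), [])


result = resolve("local-first-web", queries, lookup)
```

Tracking visited query names is enough to stop a simple loop; the depth limit is an extra safety net for very long chains.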

Weights are a bit tougher, because then Consul has to keep some kind of state about how many queries it has served for a certain lookup. I'm using ebay/Fabio (it does weighted routing with Consul tags and services) and there are a lot of other solutions out there for weighted routing.

However, I think adding a single "local" tag and fallback-to-query, together with existing tag and query functionality, would require the fewest code changes yet still allow flexibility in composing complex queries. Tag groups could fall back to any other tag groups and nodes could be prioritized in any order.

EDIT: this fallback functionality is also specifically requested in #1159

sean- · 2016-03-27T21:00:13Z

Why not just look up everything as a query?

Sean Chittenden

cirocosta · 2017-08-29T15:08:50Z

Hey, this got me wondering, when an agent establishes a connection with a server, does it pick the nearest? I couldn't find that info in the docs.

preetapan · 2017-08-29T15:45:17Z

@cirocosta agents send the query to a server, which may forward it to a leader depending on the consistency mode of the query.
"nearness" as per the coordinate subsystem is only used in the catalog/health endpoints if you specify a query param. See https://www.consul.io/docs/internals/coordinates.html for more info.

slackpad · 2017-08-29T16:12:48Z

Adding to what @preetapan said - agents pick a random server and use that for a while, and then periodically choose a new one at a frequency that's dependent on the size of the cluster. This gives users of stale queries the best chance of having their load spread across the cluster. Since many kinds of requests have to be forwarded to the leader internally by the servers, it doesn't give much of an advantage to choose the nearest one.

dbason · 2017-09-04T22:39:02Z

Weighting would also be useful for load-based routing to a service. Say we have a service that has 3 nodes but one of those nodes comes under medium load. We'd like to prefer the other 2, but still leave that node in the service in case the others go down.

dbason · 2017-09-06T21:07:51Z

@sean-
Would love to look everything up as a query but it doesn't quite have the functionality at the moment to meet our use cases. For example:
We have a 3-node service with a robust health check. When a node becomes degraded, our health check puts it into the warning state (we could also do this with health checks based on load). We want to prefer nodes that are green, but if none are available then we want to return the yellow (warning) nodes. We don't want to fail over to another datacenter; we still want to return the local nodes.
Something like what is suggested in https://groups.google.com/forum/#!topic/consul-tool/Rm4P7dSTsY0 would work for that use case.
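
A minimal sketch of the "prefer green, fall back to yellow" behaviour described above. The `preferred_nodes` function and the `(name, status)` input shape are hypothetical; only the status strings `passing`, `warning`, and `critical` match Consul's health check states:

```python
def preferred_nodes(nodes):
    """Return passing nodes if there are any, otherwise warning nodes.

    nodes is a list of (name, status) pairs. Critical nodes are never returned.
    """
    for status in ("passing", "warning"):
        tier = [name for name, node_status in nodes if node_status == status]
        if tier:
            return tier
    return []


healthy_mix = [("a", "warning"), ("b", "passing"), ("c", "critical")]
degraded = [("a", "warning"), ("c", "critical")]

print(preferred_nodes(healthy_mix))  # only the passing node
print(preferred_nodes(degraded))     # falls back to the warning node
```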

There is also the use case mentioned above about introducing a new server in a canary-style fashion, i.e. sending x% of random requests to the new server (or, again in the case of load, reducing the number of clients sent to a node while still serving from it). Consul wouldn't need to track connections to a node in this case; it would just offer a priority field that can be set for a service on a node, and the randomized order from a standard service lookup could then be weighted by that priority.
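
A minimal sketch of a stateless weighted shuffle along those lines. It is an assumption about how such a field could be used, not how Consul orders results; `weighted_order` and the example weights are hypothetical, and weights are assumed to be positive numbers:

```python
import random


def weighted_order(weights, rng=random):
    """Return node names in random order, biased toward higher weights.

    weights maps node name to a positive weight. Nodes are drawn one at a
    time without replacement, so every node still appears exactly once.
    """
    remaining = dict(weights)
    order = []
    while remaining:
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order


# The canary should come first in roughly 1% of lookups.
weights = {"stable-1": 50, "stable-2": 49, "canary": 1}
order = weighted_order(weights, rng=random.Random(0))
```

No per-node connection counts are kept: each lookup is an independent draw, so the traffic split only holds on average across many lookups.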

tmanninger · 2019-06-06T06:50:35Z

We also need this feature. Are there any plans?

hanshasselberg · 2020-05-04T07:35:41Z

Thank you for reporting and helping with this issue/feature request!
This is something that we considered doing in 2014, but we didn't actually go down that route. This feature is supported by Consul these days: https://www.consul.io/docs/connect/l7-traffic-management.html which is why I am closing this issue.

armon added the enhancement label Jul 18, 2014

highlyunavailable mentioned this issue Sep 10, 2015

Proposal: Create a Way to Find Local Service with DNS Discovery #1229

Closed

ross mentioned this issue Mar 17, 2017

Prepared queries that include results from multiple DCs #2803

Open

slackpad mentioned this issue May 1, 2017

Allow service specific values for health checks and service discovery prioritization #418

Closed

slackpad added the theme/service-metadata label (Anything related to management/tracking of service metadata) May 2, 2017

slackpad added this to the Unplanned milestone Jan 5, 2018

schristoff removed this from the Unplanned milestone Nov 12, 2019

schristoff added the old-issue label Nov 12, 2019

jsosulska removed the close-old-issue-🤖 label Apr 15, 2020

hanshasselberg closed this as completed May 4, 2020
