Support server selection priorities / weights #252

Closed
armon opened this issue Jul 18, 2014 · 19 comments
Labels: theme/service-metadata (anything related to management/tracking of service metadata), type/enhancement (proposed improvement or new feature)

Comments

@armon
Member

armon commented Jul 18, 2014

If we support priorities, then we can support cases like a server in a remote region (for backup / quorum purposes). The priorities would disable client routing through those servers unless necessary (i.e. higher-priority servers have failed or are unreachable).
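As a rough illustration of the behaviour described above, here is a minimal Python sketch of tiered selection. The `Server` type, its `priority` field (lower number means more preferred) and the `reachable` flag are all hypothetical and are not part of Consul.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    priority: int    # hypothetical field: lower number = more preferred
    reachable: bool  # stand-in for whatever health/reachability signal exists


def select_servers(servers):
    """Return the reachable servers from the most-preferred tier that has any."""
    reachable = [s for s in servers if s.reachable]
    if not reachable:
        return []
    best = min(s.priority for s in reachable)
    return [s for s in reachable if s.priority == best]


servers = [
    Server("dc1-a", priority=0, reachable=True),
    Server("dc1-b", priority=0, reachable=True),
    Server("remote-backup", priority=1, reachable=True),
]
```

With this model, clients are only routed to `remote-backup` once every priority-0 server is unreachable.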

@morgante

morgante commented Aug 28, 2014

It'd also be useful for gradually scaling in new versions of a service.

@carlivar

I could see a use-case for weights as well. We have some different hardware profiles, and I would love to be able to weight nodes accordingly since we know performance for a particular piece of software is a certain percentage worse on certain hardware.

@kcd83

kcd83 commented Sep 7, 2014

This could also allow failover for services where round robin isn't preferable or possible, e.g. failing over a load balancer for a service that has stickiness.

@highlyunavailable
Contributor

#488 is basically this as well, but just inside a service rather than cross-datacenter. +1 for this.

@pepov

pepov commented Apr 18, 2015

Would these priorities be static or dynamic?

@sean-
Contributor

sean- commented Jun 8, 2015

👍 Being able to send <1% of traffic to a standby as a way of exercising DR paths would be great. It would also be a crude mechanism for handling different capabilities between hardware.

@camerondavison
Contributor

Can this be done with the new "network coordinates" or the "prepared queries"? I was hoping that we could add something like this to accomplish #488.

@slackpad
Contributor

@a86c6f7964 prepared queries can help across datacenters for sure (using pre-configured fallbacks or network coordinates, or both). Within a datacenter, many HTTP endpoints now support the ?near= argument that lets you find the closest service, but this issue still stands for a more general weighting feature.

@camerondavison
Contributor

I guess I meant that the code that was added recently would make it easier to add this feature. Sorry for the miscommunication.

@sean-
Contributor

sean- commented Mar 27, 2016

While not exactly server priorities, many on this issue have referenced automatic DC failover as a reason for wanting server priorities. As @slackpad mentioned earlier, with Prepared Queries this is possible, and has been made easier with Prepared Query Templates. We held a webinar and covered this at around minute 31.

https://www.youtube.com/watch?v=FGbzS6ripXA&feature=youtu.be&t=1690

@doublerebel

👍
I arrived here looking for the solution to #1229. Prepared Queries are really cool and would solve both #1229 and #488 if we had one special tag for "local".

With tag local we could define remote as a prepared query template:

{
  "Name": "remote-",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.suffix}",
    "Tags": ["!local"]
  }
}

This is actually already possible if we merge the local implementation in PR #1231.

Then if we add failover-to-query, we can do priorities:

{
  "Name": "local-first-",
  "Template": {
    "Type": "name_prefix_match"
  },
  "Service": {
    "Service": "${name.suffix}",
    "Tags": ["local"],
    "Failover": {
      "Query": "remote-${name.suffix}"
    }
  }
}

With failover-to-query it could even finally fail over to another DC if neither local nor remote is available.
(Although with anything recursive someone is bound to shoot themselves in the foot with it, we may want to guard against that.)
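One way to guard against recursive failover is to track which queries have already been visited while following the chain. The sketch below is a simplified model, not Consul code: the `queries` and `healthy` dictionaries, the `"failover"` key and the `resolve` function are hypothetical, and each query is reduced to a single tag.

```python
def resolve(start, queries, healthy):
    """Follow failover-to-query links until a query yields healthy nodes.

    queries: query name -> {"tag": str, "failover": str or None}
    healthy: tag -> list of healthy node names
    Raises ValueError if the failover chain loops back on itself.
    """
    seen = set()
    name = start
    while name is not None:
        if name in seen:
            raise ValueError(f"failover cycle detected at {name!r}")
        seen.add(name)
        query = queries[name]
        nodes = healthy.get(query["tag"], [])
        if nodes:
            return nodes
        name = query.get("failover")
    return []  # chain exhausted without finding any healthy node


queries = {
    "local-first-web": {"tag": "local", "failover": "remote-web"},
    "remote-web": {"tag": "remote", "failover": None},
}
```

A real implementation might prefer to reject cyclic definitions when a query is created rather than at lookup time; that choice is left open here.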

Weights are a bit tougher, because then Consul has to keep some kind of state about how many queries it has answered for a certain lookup. I'm using ebay/Fabio (it does weighted routing with Consul tags and services) and there are a lot of other solutions out there for weighted routing.

However, I think adding a single "local" tag and fallback-to-query, together with existing tag and query functionality, would require the fewest code changes yet still allow flexibility in composing complex queries. Tag groups could fall back to any other tag groups and nodes could be prioritized in any order.

EDIT: this fallback functionality is also specifically requested in #1159

@sean-
Contributor

sean- commented Mar 27, 2016

Why not just look up everything as a query?

Sean Chittenden

@cirocosta

Hey, this got me wondering, when an agent establishes a connection with a server, does it pick the nearest? I couldn't find that info in the docs.

@preetapan
Contributor

@cirocosta agents send the query to a server, which may forward it to a leader depending on the consistency mode of the query.
"nearness" as per the coordinate subsystem is only used in the catalog/health endpoints if you specify a query param. See https://www.consul.io/docs/internals/coordinates.html for more info.

@slackpad
Contributor

Adding to what @preetapan said - agents pick a random server and use that for a while, and then periodically choose a new one at a frequency that's dependent on the size of the cluster. This gives users of stale queries the best chance of having their load spread across the cluster. Since many kinds of requests have to be forwarded to the leader internally by the servers, it doesn't give much of an advantage to choose the nearest one.

@dbason

dbason commented Sep 4, 2017

Weighting would also be useful for load-based routing to a service. Say we have a service that has 3 nodes, but one of those nodes comes under medium load. We'd like to prefer the other 2, but still leave that node in the service in case the others go down.

@dbason

dbason commented Sep 6, 2017

@sean-
Would love to look everything up as a query but it doesn't quite have the functionality at the moment to meet our use cases. For example:
We have a 3 node service with a robust health check. When a node becomes degraded our healthcheck puts it into warning state (we could also do this with healthchecks based on load). We want to prefer nodes that are green but if there are none available then we want to return the yellow (warning) nodes. We don't want to failover datacenters, we still want to return the local nodes.
Something like what is suggested in https://groups.google.com/forum/#!topic/consul-tool/Rm4P7dSTsY0 would work for that use case.
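The green-then-yellow preference described above could look roughly like the following sketch. The `pick_nodes` function and the `(name, status)` input shape are hypothetical; the status strings follow Consul's check states (`passing`, `warning`, `critical`).

```python
def pick_nodes(nodes):
    """Prefer passing nodes; fall back to warning nodes; never return critical ones.

    nodes: list of (name, status) tuples for a single service in the local DC.
    """
    for wanted in ("passing", "warning"):
        matched = [name for name, status in nodes if status == wanted]
        if matched:
            return matched
    return []


nodes = [("web-1", "passing"), ("web-2", "warning"), ("web-3", "critical")]
```

Because the input only ever contains local nodes, there is no cross-datacenter failover in this model: the result is either the green nodes, the yellow nodes, or nothing.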

There is also the use case mentioned above about introducing a new server in a canary-style fashion, i.e. sending x% of random requests to the new server (or, in the load case, reducing the number of clients sent to a node while still serving it). Consul wouldn't need to track connections to a node in this case. It would just need to offer a priority field that can be set for a service on a node, and the randomized order from a standard service lookup could then be weighted by that priority.
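A stateless way to weight a randomized order is a weighted random permutation: give each node the key `u ** (1 / weight)` with `u` drawn uniformly from [0, 1), then sort by key in descending order (the Efraimidis-Spirakis scheme). The sketch below assumes a hypothetical per-node weight and is not how Consul orders results.

```python
import random


def weighted_order(weights, rng=random):
    """Return node names in random order, biased so heavier nodes tend to come first.

    weights: node name -> positive weight (hypothetical per-service priority field).
    Nodes with a weight of zero or less are dropped from the result.
    The chance that a node comes first is its weight divided by the total weight.
    """
    keyed = [
        (rng.random() ** (1.0 / w), node)
        for node, w in weights.items()
        if w > 0
    ]
    keyed.sort(reverse=True)  # largest key first
    return [node for _, node in keyed]


weights = {"web-1": 50, "web-2": 49, "canary": 1}
order = weighted_order(weights)
```

No per-lookup state is needed: each lookup draws fresh random keys, so with these weights a client that always takes the first entry reaches `canary` in roughly 1% of lookups.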

slackpad added this to the Unplanned milestone Jan 5, 2018

@tmanninger

We also need this feature. Are there any plans?

schristoff removed this from the Unplanned milestone Nov 12, 2019

@hanshasselberg
Member

Thank you for reporting and helping with this issue/feature request!
This is something that we considered doing in 2014, but we didn't actually go down that route. This feature is supported by Consul these days: https://www.consul.io/docs/connect/l7-traffic-management.html which is why I am closing this issue.
