
Specify backend servers' weight via annotation for kubernetes #3112

Merged

Conversation

yue9944882
Contributor

@yue9944882 yue9944882 commented Apr 1, 2018

What does this PR do?

Fixes #2729. See also the previous discussions.

Provides a new ingress annotation, ingress.kubernetes.io/backend-weights, which specifies a YAML-encoded, percentage-based weight distribution. With this annotation, we can do canary releases by dynamically adjusting the weights of ingress backends.

Since the weight of types.Server is currently an integer, I created a simple allocator to spread the server weights as evenly as possible.
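For illustration, an Ingress using the annotation might look like the following. This is a hypothetical sketch: the service names, percentages, and exact value syntax are illustrative, assuming the YAML-encoded mapping described above.

```yaml
# Hypothetical example of the proposed annotation; names and
# percentages are illustrative, not from the PR itself.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: canary-example
  annotations:
    ingress.kubernetes.io/backend-weights: |
      service-stable: 95%
      service-canary: 5%
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: service-stable
          servicePort: 80
      - path: /
        backend:
          serviceName: service-canary
          servicePort: 80
```

Promoting the canary would then just mean editing the two percentages in the annotation.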

Motivation

Introduce weight-based canary release to kubernetes provider with minimal change.

More

  • Added/updated tests
  • Added/updated documentation

Additional Notes

@yue9944882
Contributor Author

yue9944882 commented Apr 1, 2018

@timoreimann In the previous discussion, you proposed a percentage-based value. Do you mean a string in a form like "12%", or implicitly an integer value ranging from 0 to 100? For the latter, do we need to constrain the weights under the same ingress.Spec.Rule.Path to sum to 100 to make sure it is a "percentage"?

By the way, what if we want a finer granularity, like 1/1000?

The current implementation in this PR is just a copy of Skipper's implementation.

@yue9944882 yue9944882 force-pushed the feature/kubernetes-weight-annotation branch from 3c3ba18 to e95601e on April 1, 2018 16:41
@yue9944882 yue9944882 changed the title feat(kubernetes): specify backend servers' weight via annotation Specify backend servers' weight via annotation for kubernetes Apr 1, 2018
@yue9944882 yue9944882 force-pushed the feature/kubernetes-weight-annotation branch from e95601e to 0b81317 on April 1, 2018 16:51
@trondhindenes

@yue9944882 I think it should match Traefik's "regular" weight logic, where weight is just an arbitrary number and doesn't have to sum up to 100 in order to be valid.

@yue9944882
Contributor Author

@trondhindenes +1 on that. But the problem is how to spread the weights without breaking the proportions given in the annotation when they are integers.

My current logic calculates a remainder and re-assigns it to the servers one unit at a time. Alternatively, we could scale the integers by 1000, i.e. treat each weight as a milli-value, similar to what resource.Quantity does in Kubernetes, to make the division of total weight / instance count easier.

// spread the total weight as evenly as possible using integers
type weightPoolAllocator struct {
	flooredWeight  int
	weightRemainer int
}

typo. will fix it after @timoreimann reviews.
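The floor-plus-remainder scheme described above can be sketched as follows. This is a minimal illustration, not the PR's actual code; the function name and shape are my own.

```go
package main

import "fmt"

// allocateWeights is a hypothetical sketch of the allocator discussed
// above: spread a total integer weight over n servers as evenly as
// possible. Every server gets the floored share; the first total%n
// servers get one extra unit so nothing is lost to integer division.
func allocateWeights(total, n int) []int {
	floored := total / n   // base weight for every server
	remainder := total % n // leftover, handed out one unit at a time
	weights := make([]int, n)
	for i := range weights {
		weights[i] = floored
		if i < remainder {
			weights[i]++
		}
	}
	return weights
}

func main() {
	fmt.Println(allocateWeights(10, 3)) // [4 3 3]
}
```

The weights always sum back to the original total, which is the property the remainder re-assignment is there to guarantee.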

@timoreimann
Contributor

timoreimann commented Apr 9, 2018

@yue9944882 I suppose the answer to your question depends on what we consider to be the most useful UI to specify weights. My personal opinion is that I'd rarely want to specify them literally -- I'd rather prefer to note down percentage values and rely on the implementation to translate those into integer-based, relative weights while taking the number of endpoints into account accordingly. While I could do the math myself, it's not trivial and seems to require knowledge about how many endpoints I have.

To clarify my argument, let's look at a service that fronts 20 endpoints. If I assign a weight of 10 to that service, it means that only half of those endpoints will actually receive traffic, right? That'd be quite wasteful. With only literal weights to specify, I'd have to know the weight-to-endpoints relationship a priori and scale my weights accordingly so that all endpoints receive some traffic, all while aiming for evenly distributed traffic overall. IMHO, the algorithm should take care of this, which it should be able to if we just specify percentages.

Does that make sense?
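The translation being argued for here, percentages in, per-endpoint integer weights out, could be sketched like this. Names and the scale factor are assumptions of mine, not Traefik's API; it only illustrates taking the endpoint count into account.

```go
package main

import "fmt"

// perEndpointWeights is a hypothetical sketch: given one percentage per
// service and the number of endpoints behind each service, compute an
// integer weight for every endpoint so that the per-service traffic
// shares match the percentages while all endpoints receive traffic.
func perEndpointWeights(percentages, endpoints []int) [][]int {
	const scale = 1000 // milli-precision, as discussed in this thread
	out := make([][]int, len(percentages))
	for i, p := range percentages {
		n := endpoints[i]
		w := p * scale / n // weight for each endpoint of service i
		svc := make([]int, n)
		for j := range svc {
			svc[j] = w
		}
		out[i] = svc
	}
	return out
}

func main() {
	// 10% to a canary with 2 endpoints, 90% to main with 20 endpoints:
	// each canary endpoint gets 5000, each main endpoint gets 4500,
	// so the service totals stay at a 10:90 ratio.
	fmt.Println(perEndpointWeights([]int{10, 90}, []int{2, 20}))
}
```

In the 20-endpoint example from the comment above, this is exactly the math a user would otherwise have to do by hand.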

@yue9944882 yue9944882 force-pushed the feature/kubernetes-weight-annotation branch from 0b81317 to e93bbf6 on April 10, 2018 16:57
@yue9944882
Contributor Author

yue9944882 commented Apr 10, 2018

@timoreimann PTAL again

To clarify my argument, let's look at a service that fronts 20 endpoints. If I assign a weight of 10 to that service, it means that only half of those endpoints will actually receive traffic, right?

I just changed the implementation. Now we regard every weight as a milli-value from the start, e.g. 10 -> 10 * 1000; then we can do the division with a maximum precision loss of 10^-3. WDYT?

@yue9944882 yue9944882 force-pushed the feature/kubernetes-weight-annotation branch 2 times, most recently from e844cfc to e15ea47 on April 11, 2018 07:53
@timoreimann
Contributor

@yue9944882 thanks, I think that should address the scaling problem.

Before discussing implementation details, I'd like to raise again the question of whether pure integer weights are the right interface, as opposed to percentages. My guess is that most users would think in terms of percentage values anyway and only translate them into integers because Traefik currently requires it. If that's the case, shouldn't we decide to accept percentages exclusively? Or is there a legitimate use case for specifying integers?

@timoreimann
Contributor

@trondhindenes appreciate any feedback you may have on the question I raised above.

@yue9944882
Contributor Author

@timoreimann Umm..🧐 I'd +1 if we could use percentage weights. This may help if other providers are doing similar weight-specifying things. My only small concern is whether percentage weights would cause some overhead or a potential "float-precision-overflow" problem.

@timoreimann
Contributor

timoreimann commented Apr 12, 2018

@yue9944882 overhead should be negligible, I'd hope... AFAICS, it should just be another small computational step to turn floats into integers. For instance, if a user specified 10% and 90%, we'd just have to multiply by a large enough factor to yield integers big enough that we don't run into the scaling problem.

By "float precision overflow", I suppose you refer to the sum of all percentages not yielding 100%? That's something we'd have to check and either signal an error or cap the overflow somehow (within reason). My hope is that floats in Go do better than in JS, but let's verify this before going all-in on percentages.

@timoreimann
Contributor

Had a chat with @marco-jantke; he came up with another useful feature for the percentage interface: it'd be convenient if you could omit the specification for the last fraction and let Traefik compute it. For instance, if I were to specify 5% on one of two services, then Traefik should conclude that 95% needs to go to the other service.

This would allow users to promote canary release stages easily just by updating the percentage of the new deployment/service gradually (e.g., 1%, 5%, 25%, 50%, 100%), as opposed to having to update both deployments/services (e.g., 1%/99%, 5%/95%, etc.).
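The omitted-fraction convenience could be sketched like this, assuming exactly one service may leave its percentage unspecified (the sentinel and function name are mine, not the PR's).

```go
package main

import "fmt"

// fillOmitted is a hypothetical sketch of the convenience described
// above: if exactly one entry omits its percentage (marked here with a
// negative sentinel), assign it whatever remains up to 100.
func fillOmitted(ps []float64) []float64 {
	sum, omitted := 0.0, -1
	for i, p := range ps {
		if p < 0 {
			omitted = i
			continue
		}
		sum += p
	}
	if omitted >= 0 {
		ps[omitted] = 100 - sum
	}
	return ps
}

func main() {
	// 1% canary, 1% baseline, main traffic omitted -> filled to 98%.
	fmt.Println(fillOmitted([]float64{1, 1, -1})) // [1 1 98]
}
```

With this, promoting a canary only requires editing the canary's own percentage; the main service's share follows automatically.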

@trondhindenes

@timoreimann that makes sense. Still, I guess I'd prefer consistency with how Traefik works "outside" of Kubernetes, where the weight is just an arbitrary number (30 + 30 + 100 is completely okay).

That said: I'm fine either way - as long as things are documented I don't think it matters too much. I totally see the point in allowing auto-calculation of the "remainder weight" if using percentages - that's functionality that (afaik) doesn't exist in Traefik today.

@yue9944882
Contributor Author

yue9944882 commented Apr 17, 2018

overhead should be negligible, I'd hope...

Yeah, since there is little overhead in these provider parts, the overhead here should be negligible.

It'd be convenient if you could omit the specification for the last fraction and let Traefik compute it.

@timoreimann Agreed that it would be more convenient. But does it mean that we could only specify two services per ingress rule with this feature? IDK if there are use cases where we need to specify 3+. E.g., we may do "canary analysis", in which we split the traffic 98% (main traffic), 1% (canary traffic), 1% (baseline traffic) and do a comparative analysis over them.

@timoreimann
Contributor

timoreimann commented Apr 17, 2018

@yue9944882 the idea is that you can only omit the last fraction. So in your example, you'd likely want to specify 1% for the canary, 1% for the baseline, and omit the 98% for the main traffic as a convenience. If you had two canaries each with baselines running plus the main traffic, you could specify 4 out of 5 fractions, etc. Specifying all fractions should still be possible -- omitting is just a convenience thing.

Does that make sense?

@yue9944882
Contributor Author

@timoreimann Sure, it does make perfect sense :)

@timoreimann
Contributor

@yue9944882 great :-)

Would you like to update the PR so that it uses a percentage-based interface, or shall I? Either is fine with me, especially since my team wants this feature in our version of Traefik as well.

@timoreimann
Contributor

Updated the PR description to reflect the final implementation stage we have reached.
