Initial attempt at 429 Retry-After header support #5285
Hi @travis-minke-sap. Thanks for your PR. I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Codecov Report
@@ Coverage Diff @@
## main #5285 +/- ##
==========================================
+ Coverage 82.78% 83.83% +1.05%
==========================================
Files 197 243 +46
Lines 6093 6955 +862
==========================================
+ Hits 5044 5831 +787
- Misses 724 782 +58
- Partials 325 342 +17
Continue to review full report at Codecov.
|
I think producing an opt-in header is one of those features that I would categorize under Experimental features, because it is still a behavioural change. WDYT @travis-minke-sap? |
+1 to @slinkydeveloper's comment. This seems like a reasonable feature, but it also changes behaviour, and we are trying to tread lightly on changes to core until we can get this 1.0 out the door. |
Hey @slinkydeveloper and @newscott - thanks for the feedback! I wouldn't necessarily characterize this change as a "new feature", as it is an incremental step towards addressing a deficiency in adhering to the CloudEvents Specification. The implementation attempts to add the ability to respect the I do acknowledge, though, that the underlying implementation is changing, and there is a possibility for unintended bugs / regressions to be introduced. If the risk of such a possibility is too great for the current timeframe (pre-V1) that is understandable. Would you be open to this approach (and possibly exposing a flag in the DeliverySpec) if this was postponed until after the V1 release? If you feel strongly that this needs to go through the "experimental-features" track, regardless of timeframes, then I have the following questions / concerns...
If the infrastructure mentioned above was already in-place and available, it might not be too large an effort to go through the experimental-feature process, but if I have to build all that out just to have the privilege of being the first use case... It's a hard sell ; ) I do feel bad not offering to take that on, as it would be a win for the project, but I just don't have the time available (internal management, deadlines, etc.) I'm also not 100% certain that we will need this capability internally yet, and was trying to provide a low-touch effort at improving things. Maybe waiting till post-V1 is a possibility? I assume that post-V1 we're not going to require every new change to inherently be an "experimental feature" and that issues like this (opt-in adherence to the CloudEvent Spec) can just be added via the normal process? /hold |
So first of all, here we're talking about the Webhook spec, which is an "optional" spec. I feel this particular issue is part of the larger topic whether we want to support this spec or not. There is an open issue, opened a while ago, about it #3092.
The process is still being debated, feel free to chime in, but we already have part of the WG leads agreeing. The infra is being worked here: #5214, so you won't need to do anything more than opening an issue with the plan, send a mail, and do little changes to this PR.
I don't think we need to wait post-V1. As soon as we start with experimental features, I'll definitely be happy to endorse this effort through the experimental features process, and hopefully the larger one of supporting cloudevents webhook spec. |
Ah, great point about the Webhook Spec being optional - I lost sight of that in the interim from initiating the Discussion and resurrecting this effort. Also, I was completely unaware of those prior efforts - no one ever mentioned them or added to the Discussion (I'll add a link). This makes me think I might have over-simplified the approach?
Great to hear that the experimental-features framework is in progress!
OK, sounds good. Once the experimental-features stuff is ready I can re-evaluate our need for supporting the Retry-After header and start again down that path. Thanks for your patience and for providing the background/status in order to level set me ; ) |
Closing for now - will re-assess later ; ) |
Re-opening to keep it on the radar ; ) |
@travis-minke-sap FYI the experimental feature process is now merged https://github.com/knative/eventing/blob/main/docs/experimental-features.md |
/ok-to-test |
We might need this for v1.0, depending on whether the spec precisely defines how a sender must behave when receiving a 429. Adding this PR to the v1 project so we don't forget about it. We can always remove it if the spec stays vague. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: travis-minke-sap The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Depending on if/how this is added to the V1 spec... we might want to consider removing the "configurability" or opt-in nature of the PR? Just something to consider as when it was originally written it was not part of the spec and I was trying to "tread lightly" ; ) |
var retryAfterDuration time.Duration

if resp != nil && resp.StatusCode == nethttp.StatusTooManyRequests && resp.Header != nil {
Should we also honor the 503? https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/503
Noticed it while looking here: https://github.com/hashicorp/go-retryablehttp/blob/master/client.go#L474
?
whatever we decide, make sure to add a comment to the spec as well.
Should we also honor the 503? https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/503
Yeah I agree, good catch - thanks!
Yeah, I agree with making this the default! Perhaps deprecate/retire the other Retry func? |
@travis-minke-sap: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@travis-minke-sap still working on this? |
I am planning on coming back to it in the near-ish future. Probably a rewrite without the "opt-in" plumbing. Feel free to take it if you're interested in running with it ; ). Or are you just checking to see whether it can be closed? |
just checking :-) |
cc @embano1 |
HEADS UP... This PR is old/stale and I will be creating a new/separate PR in the near future with a similar approach based on enhancements to the DeliverySpec wrapped in experimental-feature protection - stay tuned ; ) |
great, Travis!
|
@travis-minke-sap can we also add support for 503 retry-after? Knative Service replies with a 503 when the queue is overloaded. AFAIK, it currently does not set |
This is the plan we agreed on in the last call
|
Yes, Matthias had previously requested this and I have included 503 in the implementation I'll be creating a new Issue/PR for soon (waiting on the reconciler-test changes so I can finish the e2e - it's almost ready ; ) Also... Yes - what Pier stated is how it is implemented. Here's a sneak preview of the new
|
@travis-minke-sap Can we close this, since we have #5813 ? |
Yep - was on my list to do so - was just waiting out of paranoia to see how the reviews for #5813 might go ; ). I'll close it now though. |
The common logic for sending CloudEvents (see kncloudevents/message_sender.go) does not support the Retry-After header for 429 StatusCodes as specified in the CloudEvent Spec.
See Discussion #5011 for full details.
🎁 Creating RetryConfig from DeliverySpec now supports the ability to indicate a preference for respecting the 429 Retry-After header in message sending responses, as specified in the CloudEvents specification.
Proposed Changes
Release Notes