Consider Supporting 429 Throttling #78
This is challenging to support given the current implementation. We need the response in order to do any advanced 429 handling, since we need the Retry-After header; the response is also required for handling other statuses above 400 in the current implementation. We could potentially try to extract the status out of the error, although that seems pretty undesirable given that we should just have it in the response. Returning an error also causes the response to be filtered out in many other places (including, it appears, core libraries), so it is not a matter of simply returning the response as well. In the current implementation, this means that this line [1] and this line [2] are never reached, because the error code is >400 and the decode middleware applies. The former is not unit tested, and the latter has the same behavior as a nil response body, which is why its test passes.

I would like to understand the need for the error decoding middleware -- @bmoylan @jamesross3 can you please describe what exactly we need it for? I would like to remove it and force consumers to do this sort of operation themselves, rather than embedding it in the client.
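For context, this is roughly the kind of handling that requires the response object rather than just an error. It is a standalone sketch using only the Go standard library; backoffFor429 is a made-up name, and none of this is code from this repo:

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// backoffFor429 inspects a 429 response and reports how long to wait before
// retrying. It needs the *http.Response itself, not just an error, because
// the Retry-After header lives on the response.
func backoffFor429(resp *http.Response, fallback time.Duration) (time.Duration, bool) {
	if resp == nil || resp.StatusCode != http.StatusTooManyRequests {
		return 0, false
	}
	if v := resp.Header.Get("Retry-After"); v != "" {
		// Retry-After may be a number of seconds...
		if secs, err := strconv.Atoi(v); err == nil {
			return time.Duration(secs) * time.Second, true
		}
		// ...or an HTTP date.
		if t, err := http.ParseTime(v); err == nil {
			return time.Until(t), true
		}
	}
	return fallback, true
}

func main() {
	resp := &http.Response{
		StatusCode: http.StatusTooManyRequests,
		Header:     http.Header{"Retry-After": []string{"2"}},
	}
	wait, ok := backoffFor429(resp, time.Second)
	fmt.Println(wait, ok) // 2s true
}
```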
I'm not fully familiar with this, but given that specifying the decoding middleware is a pretty central part of the current design, I think removing it would be a pretty big change and could impact a lot of current clients. Also, although we don't have to be beholden to it, what is the behavior of the Java Conjure clients with regard to 429 handling? It would at least be helpful to know for reference. Adding logic that "handles" these kinds of errors can often lead to behavior that is hard to reason about down the line, so I also want to be mindful of that.
Right, but the current behavior gives clients near-arbitrary control over request handling. This API is unreasonably large, and while it is painful to shrink the surface area of an API, I think it is something we should certainly do in this case, because otherwise we are signing ourselves up for more issues like this in the future, only ones that are harder to debug.
The behavior used by c-j-r is the one Andy described: "back off and try the next node", rather than "try the next node, and back off if the current node has previously failed". If the current code behaved as intended, this would be a 5-line change.
These error codes are pretty standard across environments, and while I agree we don't want to handle a large number of cases here, being able to define different failure modes is useful. I actually think the ability of clients to define their own middleware is much more likely to lead to the scenario you described, as it reduces consistency across clients, increases entropy, and can lead to staleness down the line.
A less invasive option is to move the retry logic into middleware and place it a layer below the error decode middleware. I am going to investigate the feasibility of this option, but even if it is tractable, I still strongly believe we need to remove the ability to specify custom error handling. The issue with this approach is that it will (more subtly) break the existing middleware API, as we'll effectively be hiding some error codes from the middleware.
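To make that layering concrete, here is a rough standalone sketch using plain http.RoundTripper wrappers; the type names (throttleRetryTransport, errorDecodingTransport) are illustrative and are not the httpclient middleware API:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// throttleRetryTransport retries 429s with a pause; because it sits below the
// error-decoding layer, it still sees the raw *http.Response before it is
// converted into an error.
type throttleRetryTransport struct {
	next       http.RoundTripper
	maxRetries int
}

func (t *throttleRetryTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	for attempt := 0; ; attempt++ {
		resp, err := t.next.RoundTrip(req)
		if err != nil || resp.StatusCode != http.StatusTooManyRequests || attempt >= t.maxRetries {
			return resp, err
		}
		resp.Body.Close()
		// A real implementation would honor Retry-After, add jitter, and
		// replay the request body via req.GetBody before retrying.
		time.Sleep(time.Duration(attempt+1) * 100 * time.Millisecond)
	}
}

// errorDecodingTransport converts >=400 responses into errors, standing in for
// the decode middleware; it sits above the retry layer.
type errorDecodingTransport struct {
	next http.RoundTripper
}

func (t *errorDecodingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.next.RoundTrip(req)
	if err == nil && resp.StatusCode >= 400 {
		resp.Body.Close()
		return nil, fmt.Errorf("server returned status %d", resp.StatusCode)
	}
	return resp, err
}

func main() {
	// Ordering matters: decoding wraps retry, so retry runs closer to the wire
	// and still sees 429 responses before they become errors.
	client := &http.Client{
		Transport: &errorDecodingTransport{
			next: &throttleRetryTransport{next: http.DefaultTransport, maxRetries: 3},
		},
	}
	_ = client
}
```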
Fixes #78. Extracts the status code from the returned error and acts on that. This limits more complex behavior, such as respecting the Retry-After header.
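For reference, acting on a status code pulled out of an error might look roughly like this; statusCoder and throttledError are hypothetical names used for illustration, since the sketch assumes the decoded error exposes its status code, and neither is the repo's actual error type:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusCoder is a hypothetical interface: the sketch assumes the error
// produced by the decoding middleware exposes its status code somehow.
type statusCoder interface {
	StatusCode() int
}

// throttledError is a stand-in error type for the demo below.
type throttledError struct{ code int }

func (e *throttledError) Error() string   { return fmt.Sprintf("status %d", e.code) }
func (e *throttledError) StatusCode() int { return e.code }

// isThrottled reports whether an error carries a 429 status.
func isThrottled(err error) bool {
	var sc statusCoder
	return errors.As(err, &sc) && sc.StatusCode() == http.StatusTooManyRequests
}

func main() {
	err := fmt.Errorf("request failed: %w", &throttledError{code: 429})
	fmt.Println(isThrottled(err)) // true
}
```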
What happened?
Currently, when a client receives a 429, it just continues to the next URI, which can result in more requests ultimately being made when the server is trying to indicate that it would like you to slow down. The code around handling this response code indicates that a service mesh should handle this throttling... but it would be nice to have some basic backoff in the client here.
conjure-go-runtime/conjure-go-client/httpclient/client.go
Lines 104 to 107 in a9e7ef7
What did you want to happen?
Upon receiving a 429, the client backs off before retrying the same node internally. The same max retries behavior can be preserved, but this would give the server a little bit of time to recover/catch up before continuing to hit it with requests.
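As an illustration only, that flow might look roughly like the following, with attemptFunc and doWithThrottleBackoff as made-up names standing in for a single request against the current node and the surrounding retry loop:

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// attemptFunc is a placeholder for a single request against the current node.
type attemptFunc func(ctx context.Context) (*http.Response, error)

// doWithThrottleBackoff preserves the existing max-retries behavior, but on a
// 429 it waits before retrying the same node instead of immediately moving on
// to the next URI.
func doWithThrottleBackoff(ctx context.Context, attempt attemptFunc, maxRetries int) (*http.Response, error) {
	wait := 100 * time.Millisecond
	for i := 0; ; i++ {
		resp, err := attempt(ctx)
		if err != nil || resp.StatusCode != http.StatusTooManyRequests || i >= maxRetries {
			return resp, err
		}
		resp.Body.Close()
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(wait):
		}
		wait *= 2 // exponential backoff; a real version would add jitter and a cap
	}
}

func main() {
	calls := 0
	attempt := func(ctx context.Context) (*http.Response, error) {
		calls++
		status := http.StatusTooManyRequests
		if calls >= 3 {
			status = http.StatusOK
		}
		return &http.Response{StatusCode: status, Body: http.NoBody}, nil
	}
	resp, err := doWithThrottleBackoff(context.Background(), attempt, 5)
	fmt.Println(resp.StatusCode, err, calls) // 200 <nil> 3
}
```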