Improve handling of cancelled requests when publishing WebClient metrics #18444
Comments
I'm not really sure about this. We're recording metrics on client operations and we're adding tags using data available. In this case, it seems that the application code is cancelling the operation with a I've got a fix for this issue, but:
From a metrics point of view, fixing this problem could show many errors with a timer right at the timeout value. If the timeout is really a feature, then dashboards will be overwhelmed by errors, which would create false positives or hide actual errors. If the timeout is considered an error, then it would help developers tune their timeout value. Maybe using a specific
I was thinking of a specific tag. As you state, Regarding the new operator, can you provide any link to follow up/understand the context? Also, I'm not sure why the new operator is needed for this particular scenario; isn't the
So you think the following tags
I'm using
See this commit for why we're using the Reactor context: bdd95f0
Sorry, on the comment above I meant
I think so, it offers enough information to filter them. Though I don't see the
Before adding any new
Sure, there's no rush from my side for this scenario, and I can hack a workaround on my side if needed. Could you, in the meantime, clarify this part?
@bclozel what's the current implementation behaviour when we are consuming the body as a Flux and we limit it with something like The way I solved it was simply flagging somehow on the What do you think?
@rubasace this feature is not about recording timeouts, but recording cancelled requests. This can happen for various operators like From a pure The workaround you're describing might work, but we'd still get "false negatives" if the client receives the response headers but no body, or only a partial body. I think it's still a good compromise and I'll adapt these changes to only record cancellations if we did not receive any response at all.
I also just bumped into this issue recently. Mapping cancelled signals for client requests to
Currently, MetricsWebClientFilterFunction handles signals via the doOnEach() operator. When a timeout cancels the pipeline, no signal is emitted, so no metric is recorded.
It seems to me that this can make the WebClient metrics misleading, as they will only ever show durations below the timeout applied to the pipeline.
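To illustrate why nothing is recorded, here is a minimal sketch that uses only the JDK's java.util.concurrent.Flow interfaces rather than Reactor or the actual Spring Boot filter. It models a recorder that hooks the terminal signals (roughly what a doOnEach()-style hook observes) and shows that a cancellation travels through Subscription.cancel() instead, so it is only seen if the cancel path is instrumented too (in Reactor that would presumably be something like doOnCancel()). The names Recorder, instrument, never and outcomesAfterCancel are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class CancelSignalSketch {

    /** Hypothetical stand-in for a metrics recorder: it only collects outcome tags. */
    static final class Recorder {
        final List<String> outcomes = new ArrayList<>();
        void record(String outcome) { outcomes.add(outcome); }
    }

    /** A publisher that never emits, like a request whose response never arrives. */
    static Flow.Publisher<String> never() {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            @Override public void request(long n) { }
            @Override public void cancel() { }
        });
    }

    /**
     * Wraps a publisher so that terminal signals are recorded. Cancellation is
     * recorded only when recordCancel is true, because it is not a signal sent
     * to the subscriber: it is a call made on the Subscription.
     */
    static Flow.Publisher<String> instrument(
            Flow.Publisher<String> source, Recorder recorder, boolean recordCancel) {
        return downstream -> source.subscribe(new Flow.Subscriber<String>() {
            @Override public void onSubscribe(Flow.Subscription upstream) {
                downstream.onSubscribe(new Flow.Subscription() {
                    @Override public void request(long n) { upstream.request(n); }
                    @Override public void cancel() {
                        if (recordCancel) {
                            recorder.record("CANCELLED");
                        }
                        upstream.cancel();
                    }
                });
            }
            @Override public void onNext(String item) { downstream.onNext(item); }
            @Override public void onError(Throwable t) {
                recorder.record("ERROR");
                downstream.onError(t);
            }
            @Override public void onComplete() {
                recorder.record("SUCCESS");
                downstream.onComplete();
            }
        });
    }

    /** Subscribes, then cancels straight away, standing in for a timeout giving up. */
    static List<String> outcomesAfterCancel(boolean recordCancel) {
        Recorder recorder = new Recorder();
        instrument(never(), recorder, recordCancel).subscribe(new Flow.Subscriber<String>() {
            @Override public void onSubscribe(Flow.Subscription subscription) {
                subscription.request(1);
                subscription.cancel();
            }
            @Override public void onNext(String item) { }
            @Override public void onError(Throwable t) { }
            @Override public void onComplete() { }
        });
        return recorder.outcomes;
    }

    public static void main(String[] args) {
        System.out.println(outcomesAfterCancel(false)); // signal hooks only
        System.out.println(outcomesAfterCancel(true));  // cancel path instrumented too
    }
}
```

With only the signal hooks in place the recorder stays empty after a cancellation, which matches the gap described above; instrumenting the cancel path yields a single CANCELLED entry. Whether such an entry should be tagged as an error or as a distinct outcome is the open question discussed in the comments.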