Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: correctly reflect closed connections in route metrics #3378

Merged
merged 4 commits into from
Nov 22, 2024

Conversation

zaharidichev
Copy link
Member

This PR fixes a problem that was introduced in the following commit: 5ac1352

The problem is that if a particular route is configured to fail a connection, the poll_ready call on the inner service will return the error and the service will never get called. Therefore in order to capture that in metrics, we need to increment the metrics counter if we encounter errors while calling poll_ready on the inner service.

Signed-off-by: Zahari Dichev zaharidichev@gmail.com

@zaharidichev zaharidichev requested a review from a team as a code owner November 21, 2024 15:48
@cratelyn
Copy link
Collaborator

this isn't a comment on this particular change so much as the architecture of this middleware, which predates this PR, but...

a "closed" counter is usually, in my experience, a good place to lean on RAII guards that increment a closed counter when dropped. that could be stored as an Option<Deferred> value that will increment the "unexpected" counter by default when dropped, to work around tricky edge cases like this.

that'd be a more significant change though, so i'll just note the idea here for posterity.

@olix0r
Copy link
Member

olix0r commented Nov 21, 2024

The problem is that if a particular route is configured to fail a connection, the poll_ready call on the inner service will return the error and the service will never get called.

The poll contract isn't really tied to the connection contract: calling poll_ready doesn't indicate that connection is being attempted, strictly speaking. Services may be polled eagerly or more than once. It would be fragile to encode this here. Also, callers may cache error values returned from poll_ready, so this may not be called as expected.

I think a preferable solution is to modify the stack to replace the push_filter with a service that runs the filter in the request-response path...

…etrics

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
@zaharidichev zaharidichev force-pushed the zd/reflect-err-in-poll-ready branch from 2b35e5f to 68293e1 Compare November 22, 2024 10:20
… route metrics"

This reverts commit 2b35e5f.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
@zaharidichev zaharidichev force-pushed the zd/reflect-err-in-poll-ready branch from 68293e1 to 9c3dc5a Compare November 22, 2024 10:21
@zaharidichev
Copy link
Member Author

zaharidichev commented Nov 22, 2024

@olix0r Addressed feedback in 9c3dc5a

I think there is a way to make this middleware more generic so it can be reused across tls and tcp stacks but did not want to overcomplicate

Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can push a commit that addresses this.

linkerd/app/outbound/src/opaq/logical/route/filters.rs Outdated Show resolved Hide resolved
@olix0r olix0r enabled auto-merge (squash) November 22, 2024 17:48
@olix0r olix0r merged commit 7b78212 into main Nov 22, 2024
15 checks passed
@olix0r olix0r deleted the zd/reflect-err-in-poll-ready branch November 22, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants