Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HTTP metrics to distinguish client cancelled requests in 503 responses #3710

Closed
bartekn opened this issue Jun 18, 2021 · 2 comments · Fixed by #4098
Closed

Improve HTTP metrics to distinguish client cancelled requests in 503 responses #3710

bartekn opened this issue Jun 18, 2021 · 2 comments · Fixed by #4098
Assignees

Comments

@bartekn
Copy link
Contributor

bartekn commented Jun 18, 2021

What version are you using?

2.4.1

What did you do?

If client disconnects the horizon_http_requests_duration_seconds_count metric records this with status=503 label. It's possible that it can affect alerting. We should be able to distinguish client initiated disconnects from actual 503 responses sent because of ctx being cancelled.

What did you expect to see?

Maybe we should add another boolean label to horizon_http_requests_duration_seconds_count to be able to tell which 503 were a result of client disconnects.

@sreuland
Copy link
Contributor

@bartekn , which metrics data model option is more preferable to capture client disconnect, since there is no standard HTTP code for client disconnects:

  1. set new metrics label 'client_disconnected=true|false' and set existing label 'status=204'
    or
  2. use '499', a non-standard HTTP Status Code proposed by Nginx which represents 'client disconnected', so metrics would just have different 'status=499' label when client disconnected.

@bartekn
Copy link
Contributor Author

bartekn commented Nov 23, 2021

I think option 2 is great, much clearer and no new label dimension. I wasn't aware 499 existed, looks like nginx custom code but should be fine in our context.

sreuland added a commit to sreuland/go that referenced this issue Nov 24, 2021
sreuland added a commit to sreuland/go that referenced this issue Nov 25, 2021
sreuland added a commit to sreuland/go that referenced this issue Nov 25, 2021
sreuland added a commit to sreuland/go that referenced this issue Nov 25, 2021
sreuland added a commit to sreuland/go that referenced this issue Nov 25, 2021
sreuland added a commit to sreuland/go that referenced this issue Nov 29, 2021
…de db layer distinction of timeout vs cancel
sreuland added a commit to sreuland/go that referenced this issue Nov 29, 2021
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
…de db layer distinction of timeout vs cancel
sreuland added a commit to sreuland/go that referenced this issue Aug 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants