Enhanced metrics for DynamoDB backend #21588

Closed
thomashargrove opened this issue Jul 5, 2023 · 1 comment

@thomashargrove
Contributor

Is your feature request related to a problem? Please describe.

We recently had a multi-hour outage on a Vault cluster using the DynamoDB backend, and the investigation was difficult because the logs and metrics gave us very little to go on. Vault was reporting a RATE(dynamodb_get_total_count) close to zero, while the AWS console showed ~25k qps of DynamoDB read requests. core_in_flight_requests spiked to 200k, and we saw large numbers of goroutines and high memory usage. We suspect the DynamoDB backend was returning errors and the AWS client was retrying indefinitely (we did not have AWS_DYNAMODB_MAX_RETRIES set, but that is fixed now). If so, we most likely hit the limit of the DynamoDB PermitPool, which would leave every other request waiting on that lock. As far as we can tell, though, nothing in the logs or metrics would indicate that the PermitPool limit had been reached.

Describe the solution you'd like

Metrics covering the DynamoDB PermitPool would be very useful. With gauge metrics for Active Permits, Pool Size, and Permits Waiting, we could quickly determine whether the DynamoDB layer is what is causing requests to back up.
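To make the request concrete, here is a minimal sketch of a counting semaphore that exposes the three values named above. It is an illustration only, not Vault's PermitPool: the names `InstrumentedPool`, `PoolStats`, and `Stats` are hypothetical, and it assumes Go 1.19 or later for `atomic.Int64`.

```go
package main

import "sync/atomic"

// InstrumentedPool is a hypothetical counting semaphore that tracks the
// three values requested in this issue. It is not Vault's PermitPool.
type InstrumentedPool struct {
	sem     chan struct{} // buffered channel; its capacity is the pool size
	waiting atomic.Int64  // callers inside Acquire that do not hold a permit yet
}

// NewInstrumentedPool creates a pool with the given number of permits.
// Sizes below 1 are raised to 1 so that Acquire can always make progress.
func NewInstrumentedPool(size int) *InstrumentedPool {
	if size < 1 {
		size = 1
	}
	return &InstrumentedPool{sem: make(chan struct{}, size)}
}

// Acquire blocks until a permit is available.
func (p *InstrumentedPool) Acquire() {
	p.waiting.Add(1)
	p.sem <- struct{}{} // blocks while all permits are held
	p.waiting.Add(-1)
}

// Release returns a permit to the pool.
func (p *InstrumentedPool) Release() {
	<-p.sem
}

// PoolStats is a point-in-time snapshot suitable for publishing as gauges.
type PoolStats struct {
	Active  int // permits currently held
	Size    int // total permits in the pool
	Waiting int // callers that have entered Acquire but have no permit yet
}

// Stats reads the current counters without blocking.
func (p *InstrumentedPool) Stats() PoolStats {
	return PoolStats{
		Active:  len(p.sem),
		Size:    cap(p.sem),
		Waiting: int(p.waiting.Load()),
	}
}
```

A background loop could call `Stats()` on a fixed interval and publish each field as a gauge. In a situation like the outage described above, Active pinned at Size together with a growing Waiting count would point at the storage layer.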

Describe alternatives you've considered

Explain any additional use-cases

Additional context

@VioletHynes
Contributor

Closing, as this has been implemented by #21742. Thanks for the issue and the PR!

3 participants