add telemetry to vault agent #8649

Closed
45cali opened this issue Mar 31, 2020 · 6 comments · Fixed by #13675
Labels
agent community-sentiment Tracking high-profile issues from the community core/telemetry enhancement

Comments

@45cali

45cali commented Mar 31, 2020

Is your feature request related to a problem? Please describe.
We need a way to monitor Vault Agent. Currently, if it fails to renew a lease, authenticate with Vault, or retrieve a token, we don't know about it.

Describe the solution you'd like
Add the telemetry stanza to Vault Agent and add metrics for:

  • auto-auth failures
  • lease renewal failures
  • how many times a token has been renewed
  • HTTP response codes from Vault's API

It would also be nice if there was a Prometheus endpoint available to scrape these metrics.

@TamasNeumer

This would also be important for us.

@andrejvanderzee
Contributor

We are also very interested.

@BorisDr

BorisDr commented Jul 2, 2021

Really needed and important feature.

@jamesgoodhouse

jamesgoodhouse commented Jul 21, 2021

This makes the Vault agent a risky proposition to run in production systems, especially in Kubernetes. Without the ability to collect metrics and telemetry it's hard to determine when things are failing. This is crucial if it's to be relied upon to update secrets or renew leases. Last thing we would want is for lease renewals to silently fail and then start having cascading failures across the entire system.

I'm going to try and find a little time to see what it'd take to add a prometheus endpoint and open up a PR.

@owenhaynes

Yeah, this would be helpful: a k8s token issuer name change went undetected because we did not have any telemetry about agent auth failures.

heatherezell added the community-sentiment label Nov 29, 2021
remilapeyre added a commit to remilapeyre/vault that referenced this issue Jan 16, 2022
This patch adds a new /agent/v1/metrics that will return metrics on the
running Vault agent. Configuration is done using the same telemetry
stanza as the Vault server. For now default runtime metrics are
returned with a few additional ones specific to the agent:
  - `vault.agent.auth.failure` and `vault.agent.auth.success` to monitor
  the correct behavior of the auto auth mechanism
  - `vault.agent.proxy.success`, `vault.agent.proxy.client_error` and
  `vault.agent.proxy.error` to check the connection with the Vault server
  - `vault.agent.cache.hit` and `vault.agent.cache.miss` to monitor the
  cache

Closes hashicorp#8649
tvoran added a commit that referenced this issue Feb 18, 2022
This patch adds a new /agent/v1/metrics that will return metrics on the
running Vault agent. Configuration is done using the same telemetry
stanza as the Vault server. For now default runtime metrics are
returned with a few additional ones specific to the agent:
  - `vault.agent.auth.failure` and `vault.agent.auth.success` to monitor
  the correct behavior of the auto auth mechanism
  - `vault.agent.proxy.success`, `vault.agent.proxy.client_error` and
  `vault.agent.proxy.error` to check the connection with the Vault server
  - `vault.agent.cache.hit` and `vault.agent.cache.miss` to monitor the
  cache

Closes #8649

Co-authored-by: Theron Voran <tvoran@users.noreply.github.com>
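
For anyone wiring this into monitoring, a script could consume the metrics output roughly as follows. This is a minimal sketch, not the agent's own code: it assumes the endpoint is set up to return Prometheus text format and that metric names appear with dots replaced by underscores (for example `vault.agent.auth.failure` as `vault_agent_auth_failure`). The sample payload, its values and label, and the helper names `parse_metrics` and `summarize` are hypothetical.

```python
"""Sketch: read Vault Agent counters from Prometheus-format text."""
import re

# Hypothetical payload. The underscore metric names are an assumption
# about how the dotted names from the commit message are exposed.
SAMPLE = """\
# HELP vault_agent_auth_failure hypothetical sample line
# TYPE vault_agent_auth_failure counter
vault_agent_auth_failure 2
vault_agent_auth_success 40
vault_agent_cache_hit 30
vault_agent_cache_miss 10
vault_agent_proxy_error{host="example"} 1
"""

# Metric name, optional {labels}, then the sample value.
# Simplification: label values containing "}" are not handled.
_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Return {metric_name: value summed over all label sets}."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match is None:
            continue
        name, raw = match.groups()
        try:
            value = float(raw)
        except ValueError:
            continue  # ignore lines whose value is not a number
        totals[name] = totals.get(name, 0.0) + value
    return totals


def summarize(totals):
    """Derive two simple health signals from the parsed counters."""
    failures = totals.get("vault_agent_auth_failure", 0.0)
    hits = totals.get("vault_agent_cache_hit", 0.0)
    misses = totals.get("vault_agent_cache_miss", 0.0)
    lookups = hits + misses
    return {
        "auth_failing": failures > 0,
        "cache_hit_ratio": hits / lookups if lookups else None,
    }


if __name__ == "__main__":
    print(summarize(parse_metrics(SAMPLE)))
```

Since these are cumulative counters, `auth_failing` here only says that at least one failure happened since the agent started. A real alert would more likely compare two scrapes or use a rate in the monitoring system.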
@fewknow

fewknow commented Feb 6, 2025

We could really use this today at citizen bank.
