Add additional sections to telemetry.mdx #8986

Open
mkcp opened this issue Oct 19, 2020 · 3 comments
Labels
theme/telemetry Anything related to telemetry or observability type/docs Documentation needs to be created/updated/clarified

Comments

@mkcp
Contributor

mkcp commented Oct 19, 2020

Feature Description

Our telemetry docs do an effective job of identifying the key metrics operators should track and suggesting alerting thresholds. After that, however, we provide a single table listing every remaining metric, with no grouping. The metrics in this table are second- or third-order: they are most likely to be used for debugging, once a problem has been identified and needs to be narrowed down. They are specific enough to give operators insight into how Consul is being used.

To improve this, we should categorize the remaining metrics by the workloads they're used in. The workload is often captured by the second token in the metric name, e.g. ACLs: consul.acl.etc.
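
As a rough illustration of grouping by the second token, here is a minimal Python sketch. The function name `group_by_workload` and the sample metric names are assumptions made for the example, not an authoritative list of Consul metrics, and names without a second token fall into a hypothetical "(ungrouped)" bucket.

```python
from collections import defaultdict


def group_by_workload(metric_names):
    """Group dotted metric names by their second token (e.g. 'acl', 'raft')."""
    groups = defaultdict(list)
    for name in metric_names:
        parts = name.split(".")
        # Names with no second token go into a catch-all bucket (an assumption).
        key = parts[1] if len(parts) > 1 else "(ungrouped)"
        groups[key].append(name)
    # Sort both the group keys and the names inside each group.
    return {key: sorted(names) for key, names in sorted(groups.items())}


# Illustrative sample only; not taken from the telemetry docs.
sample = [
    "consul.raft.apply",
    "consul.acl.apply",
    "consul.kvs.apply",
    "consul.raft.commitTime",
    "consul.acl.resolveToken",
]

grouped = group_by_workload(sample)
for workload, names in grouped.items():
    print(workload, names)
```

Each resulting group could then become its own docs section with guidance specific to that workload.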

Use Case(s)

We should group metrics by workload where applicable so that we can provide guidelines on grokking them, alerting on them, debugging with them, and getting insights from them.

@mkcp mkcp added type/docs Documentation needs to be created/updated/clarified theme/telemetry Anything related to telemetry or observability labels Oct 19, 2020
@github-actions github-actions bot added theme/acls ACL and token generation theme/packaging Packaging and distributing Consul theme/ui Anything related to the UI labels Oct 19, 2020
@mkcp mkcp removed theme/acls ACL and token generation theme/packaging Packaging and distributing Consul theme/ui Anything related to the UI labels Oct 19, 2020
@mkcp
Contributor Author

mkcp commented Nov 20, 2020

Blocked by #9197

@mkcp
Contributor Author

mkcp commented Nov 23, 2020

It would be neat to sort all of the metric names alphabetically and trim much of the redundant language in the help strings, mainly "This", which we repeat over and over.
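
A minimal Python sketch of that cleanup, assuming the redundant "This" appears at the start of each help string. The helper names `trim_help` and `tidy` and the sample entries are hypothetical, not taken from the docs source.

```python
import re

# Matches a leading "This " (any case); an assumption about where the
# redundant word appears in the help strings.
_LEADING_THIS = re.compile(r"^\s*This\s+", re.IGNORECASE)


def trim_help(text):
    """Drop a leading 'This' and re-capitalize the first remaining word."""
    trimmed = _LEADING_THIS.sub("", text, count=1)
    return trimmed[:1].upper() + trimmed[1:]


def tidy(metrics):
    """Return (name, help) pairs sorted alphabetically by metric name."""
    return sorted((name, trim_help(help_text)) for name, help_text in metrics.items())


# Illustrative entries only.
sample_help = {
    "consul.raft.apply": "This counts the number of Raft transactions.",
    "consul.acl.apply": "This measures the time to update the ACL store.",
}

for name, help_text in tidy(sample_help):
    print(f"{name}: {help_text}")
```

A string such as "This is ..." would come out as "Is ...", so the result still needs a manual read-through rather than a blind rewrite.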

hashicorp-ci pushed a commit that referenced this issue Mar 22, 2021
* Fixes #2379-Improve interval explanation in the telemetry doc

* Fixes #4734-Update consul memory metrics

* Fixes #4836-Removed node.deregistration as that isn't in state.go

* Fixes #8986 partially-Trim redundant language

* Fixes #9087-Adds helpful details to telemetry on autopilot

* Fixes #9274-Addresses NaN output in autopilot
@jsosulska
Contributor

Reopening this issue to add additional context. The original ask is still valid:

  • Group metrics logically, (re)move the massive table
  • Provide "Why Important" for each grouping
  • Provide "What to Alert On" for each grouping
  • Recommended Debugging & Insights

@jsosulska jsosulska reopened this Mar 22, 2021
dizzyup pushed a commit that referenced this issue Apr 21, 2021
* Fixes #2379-Improve interval explanation in the telemetry doc

* Fixes #4734-Update consul memory metrics

* Fixes #4836-Removed node.deregistration as that isn't in state.go

* Fixes #8986 partially-Trim redundant language

* Fixes #9087-Adds helpful details to telemetry on autopilot

* Fixes #9274-Addresses NaN output in autopilot