Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: aggregated metric volume queries #14412

Closed
wants to merge 23 commits into from
Closed

Conversation

trevorwhitney
Copy link
Collaborator

@trevorwhitney trevorwhitney commented Oct 7, 2024

What this PR does / why we need it:

This PR introduces the ?aggregatedMetrics=true query parameter to the /volume endpoint which allows one to get stream volume from aggregated metrics rather than the index.

Explore Logs uses the /volume endpoint to find the top services by volume for the services landing page. This is slow for big services. With the latest release of Explore Logs, we introduced the ability to get this volume using aggregated metric streams, which is much faster. In that first release, however, the app is getting volumes for the aggregated metric streams rather than the actual service. This PR adds aggregated metric support directly in the the Loki /volume endpoint. This allows the frontend to send the same stream selector whether it is using aggregated metrics or not (and therefore not need to know about the __aggregated_metric__ label).

Which issue(s) this PR fixes:
Fixes #14349

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@trevorwhitney trevorwhitney requested a review from a team as a code owner October 7, 2024 19:47
@trevorwhitney
Copy link
Collaborator Author

It may be interesting to change the underlying instant query to a shardable topk query to enforce the requested limit once #14243 is merged

"foo",
)
require.NoError(t, err)
fruitMatcher, err := labels.NewMatcher(labels.MatchNotEqual, "fruit", "apple")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out labels.MustNewMatcher(), it'll help make these tests less verbose

filters := []string{}
serviceNamePresent := false

for i, matcher := range a.matchers {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is len(matchers) ever 0? If not, it might be good to pull the if i == 0 clause out for clarity

Limit uint32
TargetLabels []string
AggregateBy string
AggregatedMetrics bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I tripped over the naming here. It might be better to use a name like UseAggregatedMetrics or FromAggregatedMetrics

// idxVolumeHandler gets volume from metadata in the index
idxVolumeHandler := indexTw.Wrap(next)
if !r.AggregatedMetrics || !cfg.AggregatedMetrics {
return idxVolumeHandler.Do(ctx, req)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it intended to silently not use aggregated metrics when the user explicitly sets "aggregatedMetrics=true" in the request, but does not have patterns enabled?
or should we throw a bad request instead?

trevorwhitney and others added 18 commits October 8, 2024 13:18
…14413)

This change allows to observe the distribution of how many tasks are dequeued at once over time.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
At the moment, errors are silently ignored when asynchronous blocks downloading fails.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
**Statefulsets for ingester and index-gateway**
The change here is to make the `spec.updateStrategy.type` configurable, and to default to `RollingUpdate`.

**Enterprise tokengen**
For this change, if `enterprise` and `tokengen` are enabled, and `rbac.namespaced` is true, the chart will render a `Role` and `RoleBinding`. If `rbac.namespaced` is false, it will render a `ClusterRole` and `ClusterRoleBinding`.
)

This new field reports how many blocks have been processed in total in the multiplexed request.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Co-authored-by: Jack Baldry <jack.baldry@grafana.com>
Signed-off-by: Vladyslav Diachenko <vlad.diachenko@grafana.com>
Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
@github-actions github-actions bot added sig/operator area/helm type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories labels Oct 10, 2024
Copy link
Contributor

Kubernetes Manifest Diff Summary

Scenario: default-single-binary-values (Added: 0, Modified: 0, Removed: 0)

Summary:

  • Added: 0

  • Modified: 0

  • Removed: 0

Added Files

No added files

Modified Files

No modified files

Removed Files

No removed files

Scenario: default-values (Added: 0, Modified: 0, Removed: 0)

Summary:

  • Added: 0

  • Modified: 0

  • Removed: 0

Added Files

No added files

Modified Files

No modified files

Removed Files

No removed files

Scenario: ingress-values (Added: 0, Modified: 0, Removed: 0)

Summary:

  • Added: 0

  • Modified: 0

  • Removed: 0

Added Files

No added files

Modified Files

No modified files

Removed Files

No removed files

Scenario: legacy-monitoring-values (Added: 0, Modified: 0, Removed: 0)

Summary:

  • Added: 0

  • Modified: 0

  • Removed: 0

Added Files

No added files

Modified Files

No modified files

Removed Files

No removed files

Scenario: simple-scalable-aws-kube-irsa-values (Added: 0, Modified: 0, Removed: 0)

Summary:

  • Added: 0

  • Modified: 0

  • Removed: 0

Added Files

No added files

Modified Files

No modified files

Removed Files

No removed files

@trevorwhitney
Copy link
Collaborator Author

at this time, I don't think this is worth merging. over large time ranges (like 7d), this is significantly slower than getting the same data from the index. granted it's more accurate, but accuracy is not what we need for Explore Logs. there may be use in bringing this back for cost attribution or Logs Volume Explorer, but I think it's premature to merge this for that purpose at this time.

@chaudum chaudum deleted the volume-agg-metrics branch December 20, 2024 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/helm sig/operator size/XXL type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregated metric support in /volume