Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: Cost Attribution Proposal 2 #9733

Closed
wants to merge 42 commits into from
Closed

Conversation

ying-jeanne
Copy link
Contributor

@ying-jeanne ying-jeanne commented Oct 24, 2024

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from 1d711e5 to e7bf88c Compare October 24, 2024 11:52

// we need the time stamp, since active series could have entered active stripe long time ago, and already evicted
// from the observed map but still in the active Stripe
func (t *Tracker) DecrementActiveSeries(lbs labels.Labels, value int64, ts time.Time) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we might still want ts, when we decrement a active series counter that are "active" we may want to keep the timestamp updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we want to update the timestamp of something that didn't have activity?

@ying-jeanne ying-jeanne requested a review from colega October 24, 2024 11:56
@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from fa91187 to 00d8c00 Compare October 24, 2024 13:38
@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from 00d8c00 to 4e75934 Compare October 24, 2024 13:39
@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from 5b093ab to 0456d94 Compare October 24, 2024 14:09
cmd/mimir/help-all.txt.tmpl Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/activeseries/active_series.go Outdated Show resolved Hide resolved
pkg/ingester/ingester.go Outdated Show resolved Hide resolved
pkg/ingester/user_tsdb.go Outdated Show resolved Hide resolved
pkg/mimir/mimir.go Outdated Show resolved Hide resolved
pkg/mimir/mimir.go Outdated Show resolved Hide resolved
pkg/util/validation/limits.go Outdated Show resolved Hide resolved
@colega
Copy link
Contributor

colega commented Oct 25, 2024

I reviewed mostly the existing code changes, skipping the costattribution package implementation for now.

Please ping me back when this is addressed.

Meanwhile, while I didn't look into Tracker implementation, since it's a hot path, I wonder if we could replace the prometheus.Gauge implementation by a custom map. Of course we'll need to benchmark this.

@ying-jeanne ying-jeanne requested a review from colega October 25, 2024 17:37
"go.uber.org/atomic"
)

type Tracker interface {
Copy link
Contributor

@colega colega Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need an interface here, it just makes things more complex and slow. Now we can't compare two trackers by just comparing the pointers, and we expose too many internal details on exported method just to make that comparison.

There are only two implementations of Tracker: the one that does things and the one that is disabled, we can make the disabled one be the nil pointer of TrackerImpl (please rename that back to Tracker).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the interface, and it checks for nil pointer in the receiver instead. commit 7f0b372

Comment on lines 129 to 134
func (c *ActiveSeries) ConfigDiffers(ctCfg asmodel.CustomTrackersConfig, caCfg costattribution.Tracker) bool {
if ctCfg.String() != c.CurrentConfig().String() {
return true
}
return !areTrackersEqual(caCfg, c.CurrentCostAttributionTracker())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have ConfigDiffers now, we don't need CurrentConfig or CurrentCostAttributionTracker methods, and we don't need to take mutex twice for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ying-jeanne and others added 3 commits November 7, 2024 13:23
… them multiple times (#9914) (#9930)

* Don't hold labels from store-gateways in two forms

* Don't retain labels longer than needed

* Don't convert mimirpb.LabelAdaptors to labels.Labels multiple times

* Add changelog entry

(cherry picked from commit d2367de)

Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
@ying-jeanne
Copy link
Contributor Author

ying-jeanne commented Nov 19, 2024

Based on the benchmark results, the successfully pushed case shows a significant increase in allocations, which is expected due to tracking counters per cost label value, previously there is only one counter for received sample tracking. In discarded cases, differences are minimal since counters per reason were already tracked, with a notable percentage of memory consumption. The additional allocations primarily stem from tracking cost attribution-specific data.

BenchmarkDistributor_Push/all_samples_successfully_pushed-8      10           2432680 ns/op          212628 B/op         98 allocs/op
BenchmarkDistributor_Push/all_samples_successfully_pushed_with_cost_attribution_enabled-8    10           4021089 ns/op          287027 B/op        3110 allocs/op
BenchmarkDistributor_Push/too_many_labels_limit_reached-8          10          12637491 ns/op         1218316 B/op        5069 allocs/op
BenchmarkDistributor_Push/too_many_labels_limit_reached_with_cost_attribution_enabled-8         10           8159270 ns/op         1351149 B/op        9080 allocs/op
BenchmarkDistributor_Push/all_samples_go_to_metric_relabel_configs-8         10           4029890 ns/op         1414630 B/op       7099 allocs/op
BenchmarkDistributor_Push/max_label_value_length_limit_reached-8           10           7136839 ns/op         1234532 B/op        6069 allocs/op
BenchmarkDistributor_Push/timestamp_too_new-8          10           2085542 ns/op          345832 B/op        4069 allocs/op
BenchmarkDistributor_Push/ingestion_rate_limit_reached-8           10           1061883 ns/op           21572 B/op          59 allocs/op
BenchmarkDistributor_Push/max_label_name_length_limit_reached-8        10           4094097 ns/op         1150353 B/op        5067 allocs/op

grafanabot and others added 2 commits November 20, 2024 10:52
Signed-off-by: Nikos Angelopoulos <nikolaos.angelopoulos@grafana.com>
(cherry picked from commit 0caede4)

Co-authored-by: Nikos Angelopoulos <nikolaos.angelopoulos@grafana.com>
@ying-jeanne
Copy link
Contributor Author

The result of ingester push benchmark, the impact is less compare to distributor.

BenchmarkIngesterPush/ingester_push_succeeded-8                       10             49926 ns/op            9944 B/op         37 allocs/op
BenchmarkIngesterPush/ingester_push_succeeded_with_cost_attribution_enabled-8                 10             52132 ns/op           10514 B/op         51 allocs/op

ying-jeanne and others added 4 commits November 20, 2024 20:33
…9998) (#10003)

* MQE: fix issue where subqueries could return series with no points

* Add changelog entry

* Add more test cases

(cherry picked from commit 1afa9f6)

# Conflicts:
#	CHANGELOG.md
@armandgrillet armandgrillet changed the title POC: Cost Attribution Proposal 1.2 POC: Cost Attribution Proposal 2 Nov 26, 2024
@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from 9b8ae76 to e0525e1 Compare November 27, 2024 20:41
@ying-jeanne ying-jeanne force-pushed the poc-cost-attribution-2 branch from 70c6099 to 6f1baa7 Compare December 2, 2024 17:36
@@ -1828,7 +1829,7 @@ func mkLabels(n int, extra ...string) []mimirpb.LabelAdapter {
ret[i+1] = mimirpb.LabelAdapter{Name: fmt.Sprintf("name_%d", i), Value: fmt.Sprintf("value_%d", i)}
}
for i := 0; i < len(extra); i += 2 {
ret[i+n+1] = mimirpb.LabelAdapter{Name: extra[i], Value: extra[i+1]}
ret[i/2+n+1] = mimirpb.LabelAdapter{Name: extra[i], Value: extra[i+1]}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a bug that I discovered when update test.

@ying-jeanne ying-jeanne mentioned this pull request Dec 17, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants