Add max-chunks-bytes-per-query limiter #4216

treid314 · 2021-05-24T21:31:59Z

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

What this PR does:
This PR adds a new -querier.max-chunk-bytes-per-query limit that caps the number of bytes a single query can use for storing chunks.
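For readers skimming the thread, here is a minimal sketch of how a per-query byte limiter of this kind can work. It is not the code from this PR: the chunkBytesLimiter type, its field names, the error text, and the "0 disables the limit" behaviour are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// chunkBytesLimiter is a hypothetical, simplified stand-in for a per-query
// limiter. It is NOT the Cortex implementation.
type chunkBytesLimiter struct {
	used int64 // bytes of chunks accounted so far, updated atomically
	max  int64 // assumed convention: 0 disables the limit
}

// AddChunkBytes records n more bytes and returns an error once the running
// total exceeds the configured maximum.
func (l *chunkBytesLimiter) AddChunkBytes(n int) error {
	if l.max == 0 {
		return nil
	}
	if total := atomic.AddInt64(&l.used, int64(n)); total > l.max {
		// Illustrative message only.
		return fmt.Errorf("query exceeded the chunk bytes limit (limit: %d bytes)", l.max)
	}
	return nil
}

func main() {
	l := &chunkBytesLimiter{max: 100}
	fmt.Println(l.AddChunkBytes(60)) // <nil>
	fmt.Println(l.AddChunkBytes(60)) // non-nil error, because 120 > 100
}
```

The counter is shared by everything that fetches chunks for the same query, which is why the sketch uses an atomic add rather than a plain increment.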

Which issue(s) this PR fixes:
Fixes #3669

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

treid314 · 2021-05-25T01:03:57Z

pkg/util/limiter/query_limiter.go

+	chunkBytesCount *atomic.Int32
+
+	maxSeriesPerQuery     int
+	maxChunkBytesPerQuery int


This limits us to 2GB (2^31 - 1 bytes) per query. Is it worth making this an unsigned int, which allows about 4GB (2^32 bytes) per query, or a 64-bit number?

int64 please. 4GB is not that much. We may have use cases setting higher limits.

On 64-bit systems, int is 64-bit, so this is fine. Note that Cortex officially doesn't support 32-bit systems.

I would be explicit like we do everywhere else.

Should we also pass in an int64 at the config/limit.go level? Or is leaving NewQueryLimiter(int, int) and casting the maxChunkBytes value to an int64 ok?

I would be explicit like we do everywhere else.

I don't think we're explicit "everywhere else". I think it would make sense to use int here simply because we cannot fit more than the maximum int value into memory anyway (this applies to both 32-bit and 64-bit platforms).

To your question, Tyler: if you go the int64 route, you will need to "extend" that everywhere to avoid losing precision somewhere (i.e., in NewQueryLimiter too).

Ok. Let's not block on this and keep int.
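To make the concern about a 32-bit counter concrete, the sketch below shows what happens when a signed 32-bit total passes 2^31 - 1: Go wraps the value to a negative number, so a "total > max" check silently stops firing. The helper names exceeds32 and exceeds64 are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// exceeds32 mimics a limit check done with 32-bit arithmetic.
// In Go, signed integer overflow wraps around, so total+n can become negative.
func exceeds32(total, n, max int32) bool { return total+n > max }

// exceeds64 performs the same check with 64-bit arithmetic.
func exceeds64(total, n, max int64) bool { return total+n > max }

func main() {
	// Start at the largest int32 value (about 2 GiB) and add one more byte.
	fmt.Println(exceeds32(math.MaxInt32, 1, math.MaxInt32)) // false: the sum wrapped to a negative value
	fmt.Println(exceeds64(math.MaxInt32, 1, math.MaxInt32)) // true
}
```

With a 64-bit counter the check behaves as expected at these sizes, which matches the point made above that int is 64 bits wide on 64-bit platforms.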

pracucci

Good job! I left a few comments, but the overall logic LGTM 👏

pracucci

Thanks a lot, Tyler, for addressing my feedback! I think the PR logic is good to go. I just have a few last comments on tests that I would be glad to see addressed before merging. Thanks! 🚀

pstibrany

Nice work! I've left a few nit comments (mention the ruler in the changelog/help, remove duplicate mentions of blocks storage).

pstibrany · 2021-05-27T13:27:54Z

pkg/util/limiter/query_limiter.go

+		return nil
+	}
+	if ql.chunkBytesCount.Add(int64(chunkSizeInBytes)) > int64(ql.maxChunkBytesPerQuery) {
+		return validation.LimitError(fmt.Sprintf(ErrMaxChunkBytesHit, ql.maxChunkBytesPerQuery))


Nit: Same comment as in AddSeries: there is no need to return validation.LimitError from here. A simple return fmt.Errorf(ErrMaxChunkBytesHit, ql.maxChunkBytesPerQuery) would remove the dependency on the validation package. The calling code (querier package) can add this wrapping when needed.

Let me jump in on this. It's in the TODO list, but I suggested doing it in a follow-up PR to keep the changes easier to review.

pracucci

Thanks for addressing my feedback! One final nit and we go! 🚀 🌔

pracucci · 2021-05-27T15:29:53Z

pkg/distributor/ha_tracker_test.go

@@ -660,7 +661,8 @@ func TestHATracker_MetricsCleanup(t *testing.T) {
 func TestCheckReplicaCleanup(t *testing.T) {
 	replica := "r1"
 	cluster := "c1"
-	user := "user"
+	userName := "user"


[nit] userID.

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

* Add per-user query metrics for series and bytes returned

  Add stats included in query responses from the querier and distributor for measuring the number of series and bytes included in successful queries. These stats are emitted per-user as summaries from the query frontends.

  These stats are picked to add visibility into the same resources limited as part of #4179 and #4216.

  Fixes #4259

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Formatting fix

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Fix changelog to match actual changes

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Typo

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review changes, rename things for clarity

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Apply suggestions from code review

  Co-authored-by: Marco Pracucci <marco@pracucci.com>
  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

* Code review changes, remove superfluous summaries

  Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>

Add max-chunks-bytes-per-query limiter

8e544d8

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

pull-request-size bot added the size/M label May 24, 2021

Tyler Reid added 2 commits May 24, 2021 17:40

Fix for distributor test

e7fd8d6

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

Fix spacing and add pr number to changelog

8011feb

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

treid314 commented May 25, 2021

pracucci reviewed May 25, 2021

Add unit test for ingester and address code review comments

3120569

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

pull-request-size bot added size/L and removed size/M labels May 25, 2021

treid314 marked this pull request as ready for review May 25, 2021 23:17

pracucci reviewed May 26, 2021

Tyler Reid added 2 commits May 26, 2021 12:35

Fix chunk bytes unit test and code review comments

99c9fbc

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

Fix linter error

d823cf8

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

pracucci reviewed May 27, 2021

pstibrany approved these changes May 27, 2021

Tyler Reid added 2 commits May 27, 2021 09:30

Move context to a global to per test for distributor tests. Address other code review comments.

df08789

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

Update config docs

b89016c

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

pracucci approved these changes May 27, 2021

Change username to userId

0da69c8

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

treid314 force-pushed the bytes-limter branch from 56fae26 to 0da69c8 Compare May 27, 2021 15:39

Fix linter error

7956a53

Signed-off-by: Tyler Reid <tyler.reid@grafana.com>

pracucci enabled auto-merge (squash) May 27, 2021 15:45

pracucci merged commit 2ba3fdd into cortexproject:master May 27, 2021

treid314 mentioned this pull request Jun 1, 2021

Replace validation error with error in query limiter #4240

Merged

56quarters mentioned this pull request Jul 6, 2021

Add per-user query metrics for series and bytes returned #4343

Merged
