Add max-chunks-bytes-per-query limiter #4216
Signed-off-by: Tyler Reid <tyler.reid@grafana.com>
chunkBytesCount *atomic.Int32

maxSeriesPerQuery     int
maxChunkBytesPerQuery int
This limits us to 2GB (2^31 - 1 bytes) per query. Is it worth making this an unsigned int, which allows about 4GB (2^32 bytes) per query, or a 64-bit number?
int64 please. 4GB is not that much. We may have use cases setting higher limits.
On 64-bit systems, int is 64-bit, so this is fine. Note that Cortex officially doesn't support 32-bit systems.
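To make the sizes in this thread concrete, here is a minimal Go sketch that prints the largest byte count each candidate counter type can hold, plus the width of int on the machine running it. The function name counterLimits is made up for illustration and is not part of the PR.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// counterLimits returns the largest value an int32, uint32, and int64
// counter can hold. The name is hypothetical, used only for this sketch.
func counterLimits() (int32Max, uint32Max, int64Max uint64) {
	return math.MaxInt32, math.MaxUint32, math.MaxInt64
}

func main() {
	i32, u32, i64 := counterLimits()
	// Adding 1 and shifting right by 30 expresses each ceiling in GiB.
	fmt.Printf("int32:  %d bytes (%d GiB)\n", i32, (i32+1)>>30)
	fmt.Printf("uint32: %d bytes (%d GiB)\n", u32, (u32+1)>>30)
	fmt.Printf("int64:  %d bytes (%d GiB)\n", i64, (i64+1)>>30)
	// strconv.IntSize is 32 or 64 depending on the target platform.
	fmt.Printf("int is %d bits on this platform\n", strconv.IntSize)
}
```

On a 64-bit build the last line reports 64 bits, which is the basis for the argument that a plain int is wide enough here.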
I would be explicit like we do everywhere else.
Should we also pass in an int64 at the config/limit.go level? Or is leaving NewQueryLimiter(int, int) and casting the maxChunkBytes value to an int64 ok?
> I would be explicit like we do everywhere else.
I don't think we're explicit "everywhere else". I think it would make sense to use int here simply because we cannot fit more than the maximum int value of bytes into memory anyway (this applies to both 32-bit and 64-bit platforms).
To your question, Tyler: if you go the int64 route, you will need to "extend" that everywhere to avoid losing precision somewhere (i.e. in NewQueryLimiter too).
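As an illustration of why the wider type has to be carried through every layer, the sketch below shows what Go does when a 64-bit limit passes through a narrower type along the way. The helper name narrow is hypothetical and does not appear in the PR.

```go
package main

import "fmt"

// narrow converts a 64-bit limit to 32 bits, standing in for any layer
// that stores the value in a narrower type. Go truncates silently here;
// there is no runtime error.
func narrow(limit int64) int32 {
	return int32(limit)
}

func main() {
	oneMiB := int64(1) << 20
	threeGiB := int64(3) << 30
	fmt.Println(oneMiB, narrow(oneMiB))     // small values survive the round trip
	fmt.Println(threeGiB, narrow(threeGiB)) // 3 GiB does not fit in an int32
}
```

A limit that silently turns negative would reject every query, so the conversion is only safe if each function signature on the path uses the same width.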
Ok. Let's not block on this and keep int.
Good job! I left a few comments, but the overall logic LGTM 👏
Thanks a lot, Tyler, for addressing my feedback! I think the PR logic is good to go. I just have a few last comments on tests that I would be glad to see addressed before merging. Thanks! 🚀
Nice work! I've left a few nit comments (mention the ruler in the changelog/help, remove duplicate mentions of blocks storage).
	return nil
}
if ql.chunkBytesCount.Add(int64(chunkSizeInBytes)) > int64(ql.maxChunkBytesPerQuery) {
	return validation.LimitError(fmt.Sprintf(ErrMaxChunkBytesHit, ql.maxChunkBytesPerQuery))
Nit: same comment as in AddSeries. There is no need to return validation.LimitError from here. A simple return fmt.Errorf(ErrMaxChunkBytesHit, ql.maxChunkBytesPerQuery) would remove the dependency on the validation package. The calling code (querier package) can add this wrapping when needed.
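A minimal sketch of the shape this suggestion points to: the limiter returns a plain error and the caller decides how to wrap it. This is not the PR's code. The type and message below are stand-ins, the standard library's sync/atomic (atomic.Int64, Go 1.19 or later) is used instead of whichever atomic package the PR imports, and treating a limit of 0 as "disabled" is an assumption.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// errMaxChunkBytesHit is a hypothetical message format; the text of the
// PR's ErrMaxChunkBytesHit constant is not shown in this thread.
const errMaxChunkBytesHit = "the query hit the max chunk bytes limit (limit: %d bytes)"

// queryLimiter is an illustrative stand-in for the PR's limiter type.
type queryLimiter struct {
	chunkBytesCount       atomic.Int64
	maxChunkBytesPerQuery int
}

// AddChunkBytes adds to the running total and returns a plain error once
// the total exceeds the limit. A limit of 0 disables the check (assumed).
func (ql *queryLimiter) AddChunkBytes(chunkSizeInBytes int) error {
	if ql.maxChunkBytesPerQuery == 0 {
		return nil
	}
	if ql.chunkBytesCount.Add(int64(chunkSizeInBytes)) > int64(ql.maxChunkBytesPerQuery) {
		return fmt.Errorf(errMaxChunkBytesHit, ql.maxChunkBytesPerQuery)
	}
	return nil
}

// limitError is a hypothetical caller-side wrapper playing the role that
// validation.LimitError plays in the querier package.
type limitError string

func (e limitError) Error() string { return string(e) }

func main() {
	ql := &queryLimiter{maxChunkBytesPerQuery: 100}
	fmt.Println(ql.AddChunkBytes(60)) // under the limit, so no error
	if err := ql.AddChunkBytes(60); err != nil {
		// The caller, not the limiter, chooses the error type.
		fmt.Println(limitError(err.Error()))
	}
}
```

Keeping the wrapping in the caller means the limiter package depends only on the standard library in this sketch.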
Let me jump in on this. It's on the TODO list, but I suggested doing it in a follow-up PR to keep the changes easier to review.
…ther code review comments. Signed-off-by: Tyler Reid <tyler.reid@grafana.com>
Thanks for addressing my feedback! One final nit and we go! 🚀 🌔
pkg/distributor/ha_tracker_test.go
Outdated
@@ -660,7 +661,8 @@ func TestHATracker_MetricsCleanup(t *testing.T) {
func TestCheckReplicaCleanup(t *testing.T) {
	replica := "r1"
	cluster := "c1"
	user := "user"
	userName := "user"
[nit] userID.
* Add per-user query metrics for series and bytes returned

  Add stats included in query responses from the querier and distributor for measuring the number of series and bytes included in successful queries. These stats are emitted per-user as summaries from the query frontends. These stats are picked to add visibility into the same resources limited as part of #4179 and #4216.

  Fixes #4259
* Formatting fix
* Fix changelog to match actual changes
* Typo
* Code review changes, rename things for clarity
* Apply suggestions from code review
* Code review changes, remove superfluous summaries

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Tyler Reid tyler.reid@grafana.com
What this PR does:
This PR adds a new -querier.max-chunk-bytes-per-query limit that caps the number of bytes a single query can use for storing chunks.
Which issue(s) this PR fixes:
Fixes #3669
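
For readers who want to see how a limit like this might be exposed, here is a minimal sketch using Go's standard flag package. Only the flag name comes from the description above. The limits struct, its field name, the help text, and the default of 0 are assumptions made for illustration, not the PR's actual configuration code.

```go
package main

import (
	"flag"
	"fmt"
)

// limits is a hypothetical config struct; the field name is an assumption.
type limits struct {
	MaxChunkBytesPerQuery int
}

// registerFlags binds the field to the flag name from the PR description.
// The default of 0 and the help text are assumptions.
func (l *limits) registerFlags(fs *flag.FlagSet) {
	fs.IntVar(&l.MaxChunkBytesPerQuery, "querier.max-chunk-bytes-per-query", 0,
		"Maximum number of chunk bytes a single query may use. 0 disables the limit (assumed).")
}

// parseLimits parses the given arguments into a limits value.
func parseLimits(args []string) (limits, error) {
	var l limits
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	l.registerFlags(fs)
	err := fs.Parse(args)
	return l, err
}

func main() {
	// 1073741824 bytes is 1 GiB.
	l, err := parseLimits([]string{"-querier.max-chunk-bytes-per-query=1073741824"})
	fmt.Println(l.MaxChunkBytesPerQuery, err)
}
```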
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]