-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query Frontend: dynamic horizontal query sharding #5658
Query Frontend: dynamic horizontal query sharding #5658
Conversation
afabd95
to
bc57bc6
Compare
941ae32
to
9eddb3d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, I think we just need to polish validation a bit better
41171f2
to
f4c6fb8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few more documentation nits, everything else looks good 👍
@@ -242,14 +245,25 @@ type LabelsConfig struct { | |||
// Validate a fully initialized config. | |||
func (cfg *Config) Validate() error { | |||
if cfg.QueryRangeConfig.ResultsCacheConfig != nil { | |||
if cfg.QueryRangeConfig.SplitQueriesByInterval <= 0 { | |||
return errors.New("split queries interval should be greater than 0 when caching is enabled") | |||
if cfg.QueryRangeConfig.SplitQueriesByInterval <= 0 && !cfg.isDynamicSplitSet() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change might be out of scope for this PR, but I wonder why we even have this constraint. @yeya24 any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was inherited from Cortex (only by looking at git history). 17343c16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we don't allow negative split intervals.
bd6417e
to
9484d0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome feature, lgtm 👍
@yeya24 any other comments?
CHANGELOG.md
Outdated
@@ -17,6 +17,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re | |||
### Added | |||
* [#5654](https://github.com/thanos-io/thanos/pull/5654) Query: add `--grpc-compression` flag that controls the compression used in gRPC client. With the flag it is now possible to compress the traffic between Query and StoreAPI nodes - you get lower network usage in exchange for a bit higher CPU/RAM usage. | |||
- [#5650](https://github.com/thanos-io/thanos/pull/5650) Query Frontend: Add sharded queries metrics. | |||
- [#5658](https://github.com/thanos-io/thanos/pull/5658) Query Frontend: Introduce new parameters to implement more dynamic horizontal query splitting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mention the new flags name here?
cmd/thanos/query_frontend.go
Outdated
"One should also set query-range.split-min-horizontal-shards to a value greater than 1 to enable splitting."). | ||
Default("0").DurationVar(&cfg.QueryRangeConfig.MinQuerySplitInterval) | ||
|
||
cmd.Flag("query-range.max-split-interval", "Split query range bellow this interval in query-range.horizontal-shards. Queries with a range longer than this value will be split in multiple requests of this length."). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type below
pkg/queryfrontend/config.go
Outdated
if cfg.QueryRangeConfig.SplitQueriesByInterval <= 0 { | ||
return errors.New("split queries interval should be greater than 0 when caching is enabled") | ||
if cfg.QueryRangeConfig.SplitQueriesByInterval <= 0 && !cfg.isDynamicSplitSet() { | ||
return errors.New("split queries or split threshold interval should be greater than 0 when caching is enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove 1 space?
pkg/queryfrontend/config.go
Outdated
|
||
if cfg.isDynamicSplitSet() { | ||
err := cfg.validateDynamicSplitParams() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use if err := xxx; err != nil {}
pattern
pkg/queryfrontend/roundtrip_test.go
Outdated
handlerFunc: promqlResults, | ||
codec: queryRangeCodec, | ||
splitInterval: 1 * time.Hour, | ||
expected: 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case is the same as the same at line 279?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thank you!
pkg/queryfrontend/roundtrip.go
Outdated
} | ||
|
||
queryInterval := time.Duration(r.GetEnd()-r.GetStart()) * time.Millisecond | ||
// if the query is multiple of max interval, we use the max interval to split. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's capitalize the comment.
pkg/queryfrontend/roundtrip.go
Outdated
return config.MaxQuerySplitInterval | ||
} | ||
|
||
// if the query duration is less than max interval, we split it equally in HorizontalShards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
pkg/queryfrontend/roundtrip.go
Outdated
} | ||
|
||
// if the query duration is less than max interval, we split it equally in HorizontalShards. | ||
return time.Duration(queryInterval.Milliseconds()/config.HorizontalShards) * time.Millisecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where MinQuerySplitInterval
is used here. We need to also ensure the query interval is larger than MaxQuerySplitInterval
IIUC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MinQuerySplitInterval
is a drop-in replacement for SplitQueriesByInterval
, saying just when we start splitting queries. We use it in newQueryRangeTripperware
(line 182) to decide whether to send split requests or to send the original request.
Regarding the MaxQuerySplitInterval
, we are checking that on line 247 to see if the query interval is at least twice the size of the MaxQuerySplitInterval
, if so we use that as our split interval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MinQuerySplitInterval is a drop-in replacement for SplitQueriesByInterval, saying just when we start splitting queries. We use it in newQueryRangeTripperware (line 182) to decide whether to send split requests or to send the original request.
Where do we ensure that the query range is larger than the MinQuerySplitInterval
to be split horizontally? For this code, I can see that even a small range query will be sharded by HorizontalShards
times.
Maybe I misunderstand the initial design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you are right. I added new test case and an if statement.
9484d0b
to
3ebf35b
Compare
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
3ebf35b
to
72f13d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary
This is a possible solution for the issue #5617.
Closes #5617.
Changes
Verification
The modification was only on the Query Range middleware, I introduced more test cases to test the new behavior.
I also added more validation for new parameters and tests to cover it.