-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
query-scheduler: fix query distribution in SSD mode #9471
query-scheduler: fix query distribution in SSD mode #9471
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks for fixing this! I think we're just missing one small check for legacy read mode, and then LGTM!
pkg/scheduler/lifecycle.go
Outdated
) | ||
|
||
func (rm *RingManager) OnRingInstanceRegister(_ *ring.BasicLifecycler, ringDesc ring.Desc, instanceExists bool, instanceID string, instanceDesc ring.InstanceDesc) (ring.InstanceState, ring.Tokens) { | ||
// When we initialize the index gateway instance in the ring we want to start from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment should say scheduler?
pkg/loki/modules.go
Outdated
t.Cfg.QueryScheduler.SchedulerRing.ListenPort = t.Cfg.Server.GRPCListenPort | ||
|
||
managerMode := scheduler.RingManagerModeReader | ||
if t.Cfg.isModuleEnabled(QueryScheduler) || t.Cfg.isModuleEnabled(Backend) || t.Cfg.isModuleEnabled(All) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are missing a check for legacy read mode here. in legacy read mode (before backend
was introduced), the scheduler was part of the read target, so we should be a member
when t.Cfg.LegacyReadTarget && t.Cfg.isModuleEnabled(Read)
pkg/scheduler/ringmanager.go
Outdated
|
||
// instantiate ring for both mode modes. | ||
ringCfg := rm.cfg.SchedulerRing.ToRingConfig(ringReplicationFactor) | ||
rm.Ring, err = ring.NewWithStoreClientAndStrategy(ringCfg, ringNameForServer, ringKey, ringStore, ring.NewIgnoreUnhealthyInstancesReplicationStrategy(), prometheus.WrapRegistererWithPrefix("cortex_", registerer), rm.log) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we break this long function call up over multiple lines?
rm.subservicesWatcher = services.NewFailureWatcher() | ||
rm.subservicesWatcher.WatchManager(rm.subservices) | ||
|
||
rm.Service = services.NewIdleService(func(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the lack of understanding, but I'm not quite following how this IdleService works, and how it is able to read the ring without adding itself to it? Is it because it only has a Ring
service and not a RingLifecycler
service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running RingManager
in reader mode, we only use the ring client for reading the ring. We do not have to register the service in the ring to be able to read the ring. RingLifecycler
is required only when we want to register tokens in the ring.
…-scheduler replicas (#9477) **What this PR does / why we need it**: Currently, we have a bug in our code when running Loki in SSD mode and using the ring for query-scheduler discovery. It causes queries to not be distributed to all the available read pods. I have explained the issue in detail in [the PR which fixes the code](#9471). Since this bug causes a major query performance impact and code release might take time, in this PR we are doing a new helm release which fixes the issue by using the k8s service for discovering `query-scheduler` replicas. **Which issue(s) this PR fixes**: Fixes #9195
…-scheduler replicas (grafana#9477) **What this PR does / why we need it**: Currently, we have a bug in our code when running Loki in SSD mode and using the ring for query-scheduler discovery. It causes queries to not be distributed to all the available read pods. I have explained the issue in detail in [the PR which fixes the code](grafana#9471). Since this bug causes a major query performance impact and code release might take time, in this PR we are doing a new helm release which fixes the issue by using the k8s service for discovering `query-scheduler` replicas. **Which issue(s) this PR fixes**: Fixes grafana#9195
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Hello @sandeepsukhani!
Please, if the current pull request addresses a bug fix, label it with the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-9471-to-release-2.8.x origin/release-2.8.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x 0a5e149ea540d9b034ff7023ca6f95ce09805080
# Push it to GitHub
git push --set-upstream origin backport-9471-to-release-2.8.x
git switch main
# Remove the local backport branch
git branch -D backport-9471-to-release-2.8.x Then, create a pull request where the |
**What this PR does / why we need it**: When we run the `query-scheduler` in `ring` mode, `queriers` and `query-frontend` discover the available `query-scheduler` instances using the ring. However, we have a problem when `query-schedulers` are not running in the same process as queriers and query-frontend since [we try to get the ring client interface from the scheduler instance](https://github.com/grafana/loki/blob/abd6131bba18db7f3575241c5e6dc4eed879fbc0/pkg/loki/modules.go#L358). This causes queries not to be spread across all the available queriers when running in SSD mode because [we point querier workers to query frontend when there is no ring client and scheduler address configured](https://github.com/grafana/loki/blob/b05f4fced305800b32641ae84e3bed5f1794fa7d/pkg/querier/worker_service.go#L115). I have fixed this issue by adding a new hidden target to initialize the ring client in `reader`/`member` mode based on which service is initializing it. `reader` mode will be used by `queriers` and `query-frontend` for discovering `query-scheduler` instances from the ring. `member` mode will be used by `query-schedulers` for registering themselves in the ring. I have also made a couple of changes not directly related to the issue but it fixes some problems: * [reset metric registry for each integration test](grafana@18c4fe5) - Previously we were reusing the same registry for all the tests and just [ignored the attempts to register same metrics](https://github.com/grafana/loki/blob/01f0ded7fcb57e3a7b26ffc1e8e3abf04a403825/integration/cluster/cluster.go#L113). This causes the registry to have metrics registered only from the first test so any updates from subsequent tests won't reflect in the metrics. metrics was the only reliable way for me to verify that `query-schedulers` were connected to `queriers` and `query-frontend` when running in ring mode in the integration test that I added to test my changes. This should also help with other tests where earlier it was hard to reliably check the metrics. * [load config from cli as well before applying dynamic config](grafana@f9e2448) - Previously we were applying dynamic config considering just the config from config file. This results in unexpected config changes, for example, [this config change](https://github.com/grafana/loki/blob/4148dd2c51cb827ec3889298508b95ec7731e7fd/integration/loki_micro_services_test.go#L66) was getting ignored and [dynamic config tuning was unexpectedly turning on ring mode](https://github.com/grafana/loki/blob/52cd0a39b8266564352c61ab9b845ab597008770/pkg/loki/config_wrapper.go#L94) in the config. It is better to do any config tuning based on both file and cli args configs. **Which issue(s) this PR fixes**: Fixes grafana#9195 (cherry picked from commit 0a5e149)
…-scheduler replicas (grafana#9477) **What this PR does / why we need it**: Currently, we have a bug in our code when running Loki in SSD mode and using the ring for query-scheduler discovery. It causes queries to not be distributed to all the available read pods. I have explained the issue in detail in [the PR which fixes the code](grafana#9471). Since this bug causes a major query performance impact and code release might take time, in this PR we are doing a new helm release which fixes the issue by using the k8s service for discovering `query-scheduler` replicas. **Which issue(s) this PR fixes**: Fixes grafana#9195
What this PR does / why we need it:
When we run the
query-scheduler
inring
mode,queriers
andquery-frontend
discover the availablequery-scheduler
instances using the ring. However, we have a problem whenquery-schedulers
are not running in the same process as queriers and query-frontend since we try to get the ring client interface from the scheduler instance.This causes queries not to be spread across all the available queriers when running in SSD mode because we point querier workers to query frontend when there is no ring client and scheduler address configured.
I have fixed this issue by adding a new hidden target to initialize the ring client in
reader
/member
mode based on which service is initializing it.reader
mode will be used byqueriers
andquery-frontend
for discoveringquery-scheduler
instances from the ring.member
mode will be used byquery-schedulers
for registering themselves in the ring.I have also made a couple of changes not directly related to the issue but it fixes some problems:
query-schedulers
were connected toqueriers
andquery-frontend
when running in ring mode in the integration test that I added to test my changes. This should also help with other tests where earlier it was hard to reliably check the metrics.Which issue(s) this PR fixes:
Fixes #9195
Special notes for your reviewer:
I have copied most of the ring manager code from indexgateway RingManager. I will open a follow-up PR to refactor and share the code between the two since most of the code is the same.
Checklist
CHANGELOG.md
updated