-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Per-tenant replication (sharding) factor for index gateway #9574
Conversation
0cec265
to
3e500d9
Compare
This has a fundamental issue of linking the size of a tenant to the size of a cluster. For instance, if tenant Ideally, we'd link the size of a tenant to the resources it needs, so something like 10QPS=1 instance, 100QPS=10 instances, etc (not a function of the cell's size). That way, we can scale the size of a tenant independently from the size of the cluster the tenant is part of. |
Pulls changes from grafana/dskit#304 Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
0de26f7
to
7187350
Compare
Since the replication factor is not only used on the client side to determine request targets, but also on the server side to determine whether a tenant belongs to the instance to pre-fetch data, we will need to communicate the QPS (or other metric) between clients and servers. The flow could look like this:
Alternatively, instead of the client tracking the rate for each tenant, the server instances could track it. Then the server instances need to report to the leader and the clients only fetch the aggregated data. |
Phlare and Mimir do shuffle sharding based on user and block on their store gateways. There is a per-tenant setting for the shard size. This is similar to the implementation in this PR, but with an absolute number, rather than a factor. |
Superseded by #9710 |
What this PR does / why we need it:
This PR introduces a per-tenant sharding (replication) factor for the index gateway.
At the moment we use a high replication factor in the index gateway ring to be able to distribute load from large tenants across a larger subset of index gateways. However, since the replication factor is a global setting in the gateway ring, also small tenants have a high replication factor, leading to a lot of data being pre-fetched at the start of the index gateways.
This PR adds a per-tenant runtime setting to specify a "sharding" factor between 0 and 1 that defines the per-tenant RF in the range of ring RF to max instances.
Example:
Using a factor instead of a fixed replication factor makes it easier to scale index gateways horizontally, because it does not require to adjust the setting when more instances are added to the ring.
Special notes for your reviewer:
This PR vendors grafana/dskit#304 which has to be merged first.
Checklist
CONTRIBUTING.md
guide (required)CHANGELOG.md
updateddocs/sources/upgrading/_index.md
production/helm/loki/Chart.yaml
and updateproduction/helm/loki/CHANGELOG.md
andproduction/helm/loki/README.md
. Example PR