Added per-tenant in-process sharding support to compactor #2599
Conversation
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Amazing PR!
Unfortunately this causes problems with deduplication on the read path, because the new external labels end up being treated as additional series labels. I'm working on a solution.
The solution I've adopted (commit) is pretty simple and follows what Thanos also does. The idea is to remove any external label used to identify replicas/shards directly when iterating the bucket.
A quick explanation of the problem and solution. Before this PR we had only the user ID as an external label, which is constant across all blocks of a user, so it doesn't matter at which "point in time" you remove that external label; we just have to remove it. However, the new external labels (ingester ID and shard ID) are variable: for the same user we have blocks with different external labels, so the same series can appear in blocks carrying different shard/ingester labels.
The solution (which is also what Thanos does) is to remove these unwanted external labels at query time, directly in the metadata fetcher.
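The metadata-fetcher fix described above can be sketched roughly as follows. This is an illustrative sketch only: the label names and function names are assumptions, not the actual Cortex code.

```go
package main

import "fmt"

// Labels we never want to surface at query time. The exact label names
// here are hypothetical; the real ones depend on the Cortex block
// metadata conventions.
var labelsToRemove = []string{"__ingester_id__", "__shard_id__"}

// stripExternalLabels returns a copy of a block's external label set
// with the replica/shard labels removed, so the querier never treats
// them as additional series labels during deduplication.
func stripExternalLabels(extLabels map[string]string) map[string]string {
	out := make(map[string]string, len(extLabels))
	for name, value := range extLabels {
		out[name] = value
	}
	for _, name := range labelsToRemove {
		delete(out, name)
	}
	return out
}

func main() {
	meta := map[string]string{
		"__org_id__":      "user-1",
		"__ingester_id__": "ingester-5",
		"__shard_id__":    "2_of_4",
	}
	fmt.Println(stripExternalLabels(meta))
}
```

Doing this in the metadata fetcher (rather than in the querier) means every consumer of the bucket metadata sees a consistent label set.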
…ect#2599)

* Added per-tenant in-process sharding support to compactor
* Added concurrency config option
* Fixed filter
* Updated CHANGELOG
* Improved distribution test
* Fixed external labels removal at query time
* Fixed removal of external labels when querying back blocks from storage
* Added unit test
* Fixed linter

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Alex Le <leqiyue@amazon.com>
NOTE: Rolled back by #2628
What this PR does:
We're hitting some vertical scalability limits on the compactor. We have a large user with 30+M active series, and compacting their 2h blocks takes more than 2h. The TSDB compactor uses a single CPU core (there's no parallelisation), so we can't really scale up vertically unless we shard blocks.
In this PR I'm introducing per-tenant in-process sharding support to the compactor, leveraging the fact that the Thanos compactor can parallelise compaction of different block groups.
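Conceptually, this kind of sharding boils down to deterministically assigning each series (or block) to one of N shards via hashing, so that blocks in different shards form distinct compaction groups that can be compacted concurrently. A minimal sketch under those assumptions (function names and the shard label format are illustrative, not the actual implementation):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardOf deterministically maps a key (e.g. a series' label set,
// serialized to a string) to one of numShards shards.
func shardOf(key string, numShards uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % numShards
}

// shardIDLabel formats the shard as an external label value. Blocks
// carrying different shard labels end up in different compaction
// groups, which the compactor can process in parallel.
func shardIDLabel(key string, numShards uint32) string {
	return fmt.Sprintf("%d_of_%d", shardOf(key, numShards)+1, numShards)
}

func main() {
	for _, series := range []string{`{__name__="up",job="a"}`, `{__name__="up",job="b"}`} {
		fmt.Println(series, "->", shardIDLabel(series, 4))
	}
}
```

The key property is determinism: the same series always hashes to the same shard, so repeated compactions keep a series within one shard's blocks.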
The way it works is quite simple:
Guaranteed properties:
Out of the scope of this PR:
Which issue(s) this PR fixes:
N/A
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]