Add proposal for parallel compaction by time interval #4272

roystchiang · 2021-06-09T19:00:04Z

What this PR does:

Add proposal for parallel compaction by time interval

Which issue(s) this PR fixes:
Proposal to work on a fix for #3753

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

bboreham

Good submission; I had some questions to clarify

bboreham · 2021-06-14T16:59:29Z

docs/proposals/parallel-compaction.md

I didn't follow this. What sort of corruption are you thinking about?
Is it related to this? thanos-io/thanos#4046

I'm thinking of #3569

I suppose it's not exactly a corruption, but something that currently prevents the compactor from compacting.

I'm not sure I fully understand the proposed approach here. You can still compact all non-corrupted blocks together and just leave the corrupted block out of the compaction.

I did not want to introduce a different behavior from what is currently happening. I wanted the proposal to be scoped to introducing horizontal scaling, and we can work on skipping bad blocks at a later stage.

That idea seems fine. The amount of detail in the example seems overkill to make the point.

bboreham · 2021-06-14T17:09:12Z

docs/proposals/parallel-compaction.md

That Thanos PR seems to be stalled; looks like Bartek called for a redesign.
The Google doc is just laying out the goals; it just says "TBD" for what would actually be done.

Do you think it still makes sense to reference it given that it is not really actively being developed?

Definitely, always good to see where others have been before.
I guess I mostly wanted to clarify that this isn't something we can build on directly.

pracucci

Thanks for working on this proposal! The overall design makes sense to me. I left a few comments.

pracucci · 2021-06-17T15:09:53Z

docs/proposals/parallel-compaction.md

I'm not sure I fully understand the proposed approach here. You can still compact all non-corrupted blocks together and just leave the corrupted block out of the compaction.

bboreham

I have some further comments, but they can be resolved later; happy to accept this proposal as it stands to unblock work.

bboreham · 2021-06-30T11:16:10Z

docs/proposals/parallel-compaction.md

That idea seems fine. The amount of detail in the example seems overkill to make the point.

bboreham · 2021-06-30T11:16:51Z

docs/proposals/parallel-compaction.md

Definitely, always good to see where others have been before.
I guess I mostly wanted to clarify that this isn't something we can build on directly.

bboreham · 2021-06-30T11:18:29Z

docs/proposals/parallel-compaction.md

+---
+
+## Introduction
+As a part of pushing Cortex’s scaling capability at AWS, we have done performance testing with Cortex and found the compactor to be one of the main limiting factors for a higher active timeseries limit per tenant. The documentation [Compactor](https://cortexmetrics.io/docs/blocks-storage/compactor/#how-compaction-works) describes the responsibilities of a compactor, and this proposal focuses on the limitations of the current compactor architecture. In the current architecture, the compactor has simple sharding, meaning that a single tenant is sharded to a single compactor. In addition, a compactor handles compaction groups of a single tenant iteratively, meaning that blocks belonging to non-overlapping times are not compacted in parallel.


"compaction groups" is not defined before use, either in this document or at the referenced link.
Possibly it could go in https://github.com/cortexproject/cortex/blob/master/docs/guides/glossary.md

bboreham · 2021-06-30T11:19:55Z

docs/proposals/parallel-compaction.md

+As a part of pushing Cortex’s scaling capability at AWS, we have done performance testing with Cortex and found the compactor to be one of the main limiting factors for a higher active timeseries limit per tenant. The documentation [Compactor](https://cortexmetrics.io/docs/blocks-storage/compactor/#how-compaction-works) describes the responsibilities of a compactor, and this proposal focuses on the limitations of the current compactor architecture. In the current architecture, the compactor has simple sharding, meaning that a single tenant is sharded to a single compactor. In addition, a compactor handles compaction groups of a single tenant iteratively, meaning that blocks belonging to non-overlapping times are not compacted in parallel.
+
+### Problem and Requirements
+Currently, a compactor is able to compact up to 20M timeseries within 2 hours for a level-2 compaction, including the time to download blocks, compact, and upload the newly compacted block. We would like to increase the timeseries limit per tenant, and compaction is one of the limiting factors. In addition, we would like to achieve the following:


I cannot find a definition of "level-2" here or in Compactor docs either.

pracucci

LGTM

pracucci · 2021-07-01T08:27:34Z

I'm going to merge the proposal to move forward, but I would be glad if you could address Bryan's comments in a follow-up PR. Thanks!

bboreham · 2021-08-19T16:38:57Z

@roystchiang gentle reminder I made some points about explaining terms, and I couldn't see any follow-up.

roystchiang · 2021-08-30T04:13:39Z

> @roystchiang gentle reminder I made some points about explaining terms, and I couldn't see any follow-up.

I will submit a follow-up PR this week. Thanks for the reminder.

roystchiang force-pushed the compactor-proposal branch from d94b281 to b386d7b on June 9, 2021 19:50

bboreham reviewed Jun 14, 2021

pracucci reviewed Jun 17, 2021

roystchiang added 2 commits June 24, 2021 10:11

Add proposal for parallel compaction by time interval

c8497f0

Signed-off-by: Roy Chiang <roychi@amazon.com>

address comments regarding compaction sharding

b04d90f

Signed-off-by: Roy Chiang <roychi@amazon.com>

roystchiang force-pushed the compactor-proposal branch from 13855cd to b04d90f on June 24, 2021 17:12

This was referenced Jun 24, 2021

Add compactor plan #4316

Closed

Add planner filter #4318

Closed

bboreham approved these changes Jun 30, 2021

pracucci approved these changes Jul 1, 2021

pracucci merged commit 297fb62 into cortexproject:master Jul 1, 2021

ac1214 mentioned this pull request Jul 10, 2021

Add shuffle sharding grouper/planner #4357

Closed

3 tasks

jeromeinsf mentioned this pull request Jul 21, 2021

Thanos planner component proposal thanos-io/thanos#4458

Closed

2 tasks

ac1214 mentioned this pull request Jul 23, 2021

Add shuffle sharding for compactor ac1214/cortex#4

Open

3 tasks

roystchiang mentioned this pull request Sep 3, 2021

address comments for parallel compaction proposal #4460

Merged

3 tasks

alvinlin123 mentioned this pull request Jan 14, 2022

Add shuffle sharding grouper/planner (Clone of PR 4357) #4621

Closed

3 tasks

roystchiang mentioned this pull request Jan 14, 2022

Add shuffle sharding grouper/planner (Clone of PR 4357) #4624

Merged

3 tasks
