
Introduce an internal option to enable vertical compaction #1917

Conversation

pracucci
Contributor

In Cortex, we're currently working on an experimental TSDB-storage layer based on Thanos 🎉❤️. The next step would be introducing the Thanos compactor in Cortex too, but to do that we would need a way to allow vertical compaction (due to how replication works, Cortex produces overlapping blocks).

According to our experiments, enabling vertical compaction is just a matter of disabling the checks Thanos does to make sure blocks don't overlap. In this PR I'm suggesting to introduce an internal option (not exposed to Thanos users) to allow Cortex to enable it.

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

  • Introduce an internal option to enable vertical compaction

Verification

Manual tests.

@brancz
Member

brancz commented Dec 20, 2019

This is just as useful for people running the Thanos receive component with replication factor >1, so I think it would be useful to have this regardless. The larger question I have is how this differs from #1276 (I realize #1276 is more, but it is capable of the same, no?).

@pracucci
Contributor Author

The larger question I would have is how is this different from #1276 (I realize #1276 is more, but it is capable of the same, no?)?

Thanks @brancz for your quick feedback. My take on #1276 :

  1. Looks exciting, but I suspect it may take some time to get it production-grade
  2. On the Cortex side we don't really need offline deduplication (we have series replicated with the exact same timestamp-value pairs), so I think we would feel more comfortable staying with TSDB vertical compaction

What's your take?

@brancz
Member

brancz commented Dec 20, 2019

I think that's generally fair. If we go ahead with something here, though, I think it at least needs to work with duplicate blocks due to thanos receive replication, which would just mean that a labelset (usually a single label, but in theory it can be multiple, for example when replicating to multiple regions) is removed from the candidate blocks.

Member

@bwplotka bwplotka left a comment


LGTM, but do you mind adding a log/metric for when vertical compaction is actually triggered/enabled?

The difference between vertical compaction and offline deduplication has been well described many times already, but once again:

  • Strictly same timestamps vs realistically different scrape times.
  • External-label aware deduplication.

It looks to me that offline deduplication should indeed fix all cases, and this PR is a temporary solution working only for Cortex ingesters. For data replicated by the remote-write receiver we still ensure different external labels, so this does not help us out of the box.

Since the change is not very disruptive, LGTM, but I think we need to take a closer look at #1276 🙈 @smalldirector who is doing amazing work on maintaining the branch with the offline dedup.

@smalldirector by any chance do you have any feedback on running your code for a longer time? My assumption was that you already use it in production, or not? (: Also commented on that PR about the way we can move forward.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci pracucci force-pushed the add-flag-to-compactor-to-allow-overlapping-blocks branch from c3770d9 to 7c15a0f Compare December 23, 2019 13:09
@pracucci
Contributor Author

I think it at least needs to work with duplicate blocks due to thanos receive replication, which would just mean that a labelset

@brancz I see your point. What do you think about keeping the two things separate, with me working on it in a follow-up PR? I would like to gather some specs first, while we can move on with this PR to unblock the compactor on the Cortex side.

do you mind adding log/metric for when is the actual vertical compaction is triggered/enabled?

@bwplotka Makes sense. An example of the compaction-related metrics now exposed is:

prometheus_tsdb_compactions_total 30
prometheus_tsdb_vertical_compactions_total 18

thanos_compact_group_compactions_total{group="0@11146230207840493432"} 1
thanos_compact_group_compactions_total{group="0@5679675083797525161"} 29
thanos_compact_group_vertical_compactions_total{group="0@11146230207840493432"} 1
thanos_compact_group_vertical_compactions_total{group="0@5679675083797525161"} 17

Regarding the log, the TSDB compactor already emits one (see below), so I've just added an extra key-value pair to the "compacted blocks" debug log, but I'm open to suggestions.

level=warn ts=2019-12-23T13:04:49.673716Z caller=compact.go:678 msg="found overlapping blocks during compaction" ulid=[REDACTED]

What's your take?

@pracucci
Contributor Author

The failing tests look unrelated. Please advise in case I've misread them.

Member

@bwplotka bwplotka left a comment


LGTM 👍

@bwplotka
Member

Rerunning the workflow for you.

Thanks for the metrics, this should do. The mentioned log entry is fine as well. (:

@bwplotka bwplotka merged commit 39f8623 into thanos-io:master Dec 23, 2019
@pracucci pracucci deleted the add-flag-to-compactor-to-allow-overlapping-blocks branch December 24, 2019 13:03