Compaction failing due to out-of-order chunks #267
Hm... The question is why the OOO (out-of-order) chunk was actually appended in the wrong order. Your TSDB PR makes sense; it's just that here we have OOO chunks, not samples. I have never seen that problem. |
Can you track |
@mattbostock While we don't error out, we drop the sample. OOO samples can never make it inside :) This looks like a Thanos issue. |
That sounds like the bug we couldn't track down but for which we added the halting – so we can detect it early on. In the original case we observed, it was simply that sequences of 1-3 chunks were repeated anywhere from a few to hundreds of times, i.e. no data loss (and not even visible during querying), but concerning nonetheless of course. Could you make the index file available for debugging? In general, the |
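For illustration, this is roughly the kind of per-series check that the halting corresponds to. The chunkMeta type below is only a stand-in for the per-chunk metadata stored in a block's index (min/max timestamp per chunk), not the real tsdb type:

```go
package main

import "fmt"

// chunkMeta is a stand-in for the per-chunk metadata in a block's index;
// only the fields needed for the check are modelled.
type chunkMeta struct {
	Ref              uint64
	MinTime, MaxTime int64
}

// checkChunks reports, for a single series, chunks that are out of order
// (starting at or before the previous chunk's end) and exact repeats of
// the previous chunk - the two anomalies discussed above.
func checkChunks(chks []chunkMeta) (ooo, repeated int) {
	for i := 1; i < len(chks); i++ {
		prev, cur := chks[i-1], chks[i]
		if cur == prev {
			repeated++ // identical chunk appended again
			continue
		}
		if cur.MinTime <= prev.MaxTime {
			ooo++ // chunk overlaps or precedes its predecessor
		}
	}
	return ooo, repeated
}

func main() {
	// A repeated chunk followed by an out-of-order chunk.
	chks := []chunkMeta{
		{Ref: 1, MinTime: 0, MaxTime: 99},
		{Ref: 1, MinTime: 0, MaxTime: 99},
		{Ref: 2, MinTime: 50, MaxTime: 150},
	}
	ooo, rep := checkChunks(chks)
	fmt.Printf("out-of-order: %d, repeated: %d\n", ooo, rep)
}
```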
Good point, thanks @gouthamve. @Bplotka:
So, here's the metadata for the source blocks used for compaction, which I queried using
{
"version": 1,
"ulid": "01CAAQP1TRAZTTE7VCN02TQ9FQ",
"minTime": 1522925400000,
"maxTime": 1522926000000,
"stats": {
"numSamples": 803141,
"numSeries": 666332,
"numChunks": 669066
},
"compaction": {
"level": 1,
"sources": [
"01CAAQP1TRAZTTE7VCN02TQ9FQ"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAAR8KHJ5CF5M48BYC8FANRK",
"minTime": 1522926000000,
"maxTime": 1522926600000,
"stats": {
"numSamples": 10699530,
"numSeries": 1062388,
"numChunks": 1088379
},
"compaction": {
"level": 1,
"sources": [
"01CAAR8KHJ5CF5M48BYC8FANRK"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAARV0YR1AH46N01BVTE1KF0",
"minTime": 1522926600000,
"maxTime": 1522927200000,
"stats": {
"numSamples": 10720091,
"numSeries": 1062867,
"numChunks": 1092616
},
"compaction": {
"level": 1,
"sources": [
"01CAARV0YR1AH46N01BVTE1KF0"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAASD3MEYPK2825BK864SD0T",
"minTime": 1522927200000,
"maxTime": 1522927800000,
"stats": {
"numSamples": 10696100,
"numSeries": 1062900,
"numChunks": 1089044
},
"compaction": {
"level": 1,
"sources": [
"01CAASD3MEYPK2825BK864SD0T"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAASZMX64D8T32SSD0RCVCMR",
"minTime": 1522927800000,
"maxTime": 1522928400000,
"stats": {
"numSamples": 10696364,
"numSeries": 1063668,
"numChunks": 1089091
},
"compaction": {
"level": 1,
"sources": [
"01CAASZMX64D8T32SSD0RCVCMR"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAATHYQYCWN6GTK6RHW9WB9P",
"minTime": 1522928400000,
"maxTime": 1522929000000,
"stats": {
"numSamples": 10711089,
"numSeries": 1063985,
"numChunks": 1090410
},
"compaction": {
"level": 1,
"sources": [
"01CAATHYQYCWN6GTK6RHW9WB9P"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
{
"version": 1,
"ulid": "01CAAV48R8300SFWDYP8K0JJ7Q",
"minTime": 1522929000000,
"maxTime": 1522929600000,
"stats": {
"numSamples": 10719638,
"numSeries": 1064534,
"numChunks": 1092185
},
"compaction": {
"level": 1,
"sources": [
"01CAAV48R8300SFWDYP8K0JJ7Q"
]
},
"thanos": {
"labels": {
"__kafka2thanos_partition": "0"
},
"downsample": {
"resolution": 0
}
}
}
Min/max times for the source blocks (same as above, but summarised to make the times easier to read):
|
Thanks! Looks like they are all non-overlapping, which means the compactor is facing the same case as in Prometheus – that excludes a fair amount of possible issues. But I've absolutely no idea what it is, unfortunately (: @mattbostock Do you think sending all those input blocks our way somehow is at all possible?
Ah, actually, you should still be able to find that output block on the halted compactor. If not, we should definitely make it so – without that, halting is kind of pointless I guess :) |
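For reference, the non-overlap property can be checked mechanically from the meta.json files quoted above. A minimal sketch that models only the fields shown there (not the actual Thanos/tsdb metadata types):

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"sort"
)

// blockMeta models just the meta.json fields used here.
type blockMeta struct {
	ULID    string `json:"ulid"`
	MinTime int64  `json:"minTime"`
	MaxTime int64  `json:"maxTime"`
}

func main() {
	// Usage: go run main.go <blockDir> [<blockDir> ...]
	var metas []blockMeta
	for _, dir := range os.Args[1:] {
		b, err := os.ReadFile(filepath.Join(dir, "meta.json"))
		if err != nil {
			log.Fatal(err)
		}
		var m blockMeta
		if err := json.Unmarshal(b, &m); err != nil {
			log.Fatal(err)
		}
		metas = append(metas, m)
	}
	sort.Slice(metas, func(i, j int) bool { return metas[i].MinTime < metas[j].MinTime })

	// Block ranges are half-open [minTime, maxTime), so touching ranges
	// (as in the dump above) are fine; only strict overlaps are reported.
	for i := 1; i < len(metas); i++ {
		if metas[i].MinTime < metas[i-1].MaxTime {
			fmt.Printf("overlap: %s [%d, %d) and %s [%d, %d)\n",
				metas[i-1].ULID, metas[i-1].MinTime, metas[i-1].MaxTime,
				metas[i].ULID, metas[i].MinTime, metas[i].MaxTime)
		}
	}
}
```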
So the bug you mentioned before was observed in Prometheus? To be sure, do you suspect the issue is in the TSDB library? Sharing the data directly is probably not going to be feasible. If you can, please send me some pointers; I'm happy to debug and can maybe share some more specific metadata. |
No, I just meant that we are using the vanilla TSDB compactor and the input we give it is entirely non-overlapping, just like in Prometheus.
That's the question. Seems like there shouldn't be anything Thanos specific in that path, so a TSDB issue is possible.
You are probably very familiar with the TSDB library, so that helps :) I think your best bet would be pulling the blocks in question onto your machine and writing a standalone program compacting them. You can probably just focus on a single series that is affected to avoid the noise – it will likely be the same issue for all of them. |
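A rough sketch of such a standalone compaction program, written against the prometheus/tsdb library of roughly that era. The exact NewLeveledCompactor/Compact signatures have changed between tsdb versions, so treat them as an assumption to adjust against your vendored copy:

```go
package main

import (
	"fmt"
	"os"

	kitlog "github.com/go-kit/kit/log"
	"github.com/prometheus/tsdb"
	"github.com/prometheus/tsdb/chunkenc"
)

func main() {
	// Usage: compact <outDir> <blockDir> [<blockDir> ...]
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: compact <outDir> <blockDir> [<blockDir> ...]")
		os.Exit(2)
	}
	out, dirs := os.Args[1], os.Args[2:]
	logger := kitlog.NewLogfmtLogger(os.Stderr)

	// The compaction ranges only matter for planning; an explicit
	// Compact(out, dirs...) call just merges the given block dirs.
	// NOTE: signature assumed from the tsdb version of that time.
	comp, err := tsdb.NewLeveledCompactor(nil, logger, []int64{3600000}, chunkenc.NewPool())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	id, err := comp.Compact(out, dirs...)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote block", id)
}
```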
Hey, I fixed the other issues I had with the compactor, successfully performed the repair process, and reran the compactor. After a couple of hours I can repro this too. Looking into that now. My logs:
|
I found that my compactor restarted 18 times before halting ): The logs are weird. The status was "Terminated 0", so no error 😱 logs
Not yet sure if related. |
Ignore the above, I just had |
Two blocks that are compacted into the wrong thing:
These blocks look like they overlap ): |
Yea.. more questions than answers:
|
I will definitely focus on finding the reason for the overlaps first, before the OOO investigation. The OOO might be a result of this. |
I feel like my overlap is related to the previous issue. I repaired my state to 98% and assumed the other overlaps would be garbage collected. With #274 we will never have this weird thing again -> we will need to resolve "non-garbage-collectable" overlaps before continuing compaction. Now I will write yet another repair tool to merge my 2 blocks together. It seems like one block has a 48h time range with a 24h gap, and the second is that missing 24h. I cannot tell, though, whether the reason for this is some bug or my previously broken state. #274 will help to deduce that in future runs. |
I found potential cause for my overlaps. Fix: #278 |
OK, I have finally cleaned up all overlaps, found the root causes, fixed them, and no longer have overlaps in my system (: After lots of compaction passes I still have OOO, so I'm looking into that now. |
Ok, so after a long investigation: basically, my input blocks (the two that are merged) include series for the same time range. This is technically impossible in normal cases, so TSDB is not that wrong; the only missing bit is potentially detecting this and returning a proper error -> I can fix that upstream. The question now is why I have non-overlapping blocks with overlapping series time ranges O.o |
Basically:
Block B:
And both of them have series for range: This is bizarre, but at the same time we had a really nasty bug with potential group mixups. (A group is a unique set of blocks with the same external labels and resolution; compaction should only be done within a group.) The bug fixed in #278 was causing potential mixes: dir cleanup was not working, so the compactor had access to ALL blocks (from all groups). |
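To make the group concept concrete: a group key can be thought of as the block's external labels plus its downsampling resolution, and only blocks with an identical key may be compacted together. An illustrative sketch (names and key format are made up, not the actual Thanos implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// groupKey builds a compaction-group identifier from a block's external
// labels and resolution, mirroring the rule described above: only blocks
// with identical external labels AND resolution belong to the same group.
func groupKey(extLabels map[string]string, resolution int64) string {
	names := make([]string, 0, len(extLabels))
	for n := range extLabels {
		names = append(names, n)
	}
	sort.Strings(names) // deterministic key regardless of map iteration order

	key := fmt.Sprintf("res=%d", resolution)
	for _, n := range names {
		key += fmt.Sprintf(",%s=%q", n, extLabels[n])
	}
	return key
}

func main() {
	a := groupKey(map[string]string{"__kafka2thanos_partition": "0"}, 0)
	b := groupKey(map[string]string{"__kafka2thanos_partition": "1"}, 0)
	fmt.Println(a == b) // false: different groups, must never be compacted together
}
```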
I checked all these weird series and they are 1:1 duplicates, so this seems to be either unrelated to #278, or just 2 different replicas that gathered the same samples with the same timestamps, which is unlikely (or is it?) |
Ok, I repaired all these blocks and am back to running the compactor again. With #278 I can no longer repro this issue. @mattbostock, could you rerun the compactor from master and confirm this? Preferably on clean storage, or at least a fixed one (I can guide you if you really don't want to remove your blocks, but it won't be easy - all sorts of weirdnesses could happen). |
OK all good except this: #281 |
Got yet another overlap ): I investigated and I think we need to kill the sync-delay which introduces these (at least the sync delay for compacted blocks). Fix: #282 |
Thanks @Bplotka, I'm still looking into this from my side and testing with 61e63b7 from master. |
For my issue, I'm starting to suspect it's due to a lack of bounds checking in the TSDB library for the maximum timestamp in a chunk, such that the block's metadata reports a lower maxTime than the one actually recorded in the last chunk in the block. I'll write a test case to confirm. |
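A minimal sketch of that bounds check, i.e. verifying that every chunk referenced by a block's index stays within the block's advertised minTime/maxTime. The types are stand-ins, not the real tsdb ones:

```go
package main

import "fmt"

// Stand-ins for block metadata and per-chunk metadata; just enough to
// express the check described above.
type blockMeta struct{ MinTime, MaxTime int64 }
type chunkMeta struct{ MinTime, MaxTime int64 }

// chunksOutOfBounds reports chunks whose time range escapes the block's
// advertised range - the suspected situation where a chunk's max timestamp
// exceeds the block's maxTime and makes blocks effectively overlap.
func chunksOutOfBounds(meta blockMeta, chks []chunkMeta) []chunkMeta {
	var bad []chunkMeta
	for _, c := range chks {
		if c.MinTime < meta.MinTime || c.MaxTime > meta.MaxTime {
			bad = append(bad, c)
		}
	}
	return bad
}

func main() {
	// Time range of the first block from the dump above.
	meta := blockMeta{MinTime: 1522925400000, MaxTime: 1522926000000}
	chks := []chunkMeta{
		{MinTime: 1522925400000, MaxTime: 1522925999000}, // fine
		{MinTime: 1522925700000, MaxTime: 1522926123000}, // escapes the block's maxTime
	}
	fmt.Println(chunksOutOfBounds(meta, chks))
}
```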
Yeah, I see your point, but I think the compaction is all good on that side. Out-of-order chunks can easily happen when the compactor compacts overlapping blocks: it just checks the min time of the first one and the max time of the last, but if the first one has a greater max time than the last, OOO is produced.
|
Yes - I think the issue is happening for me because the chunk max time (as stored in the index) is higher than the max time on the block that owns the chunk, which causes the blocks to overlap. It looks to be an issue in the TSDB library but I think we should keep this issue open in the meantime, since it impacts Thanos. |
I cannot repro this issue anymore with current master. Basically, we added an overlap check now, so these OOO blocks should never happen anymore. There are ways to repair that kind of block, or at least to not remove the whole block, only the broken chunks - I can show you if you want. I would rerun Thanos from master and double-check again. I am not aware of any issue in TSDB, or even any wrong result that would indicate one, but maybe I am just lucky. Can we close this issue and reopen it when we can repro something using Thanos with our fixes? |
No repro so far, feel free to reopen it if needed. |
@bwplotka @mattbostock We ran into this today with v0.13.0; here's the log:
Could you please suggest a fix or workaround? |
We've since upgraded to v0.14.0; will let you know if it reproduces again. |
It still reproduces even on v0.14 (same issue as ^^^). Is there any diagnostic info I can provide to help with investigation? |
If we hit this issue, is there any way to recover the block? |
The global compactor is halting due to the following error:
I haven't dug into the issue yet, but I wonder if it's related to @gouthamve's comment that out-of-order appends are not prevented in the TSDB library when out-of-order samples are appended to the same appender:
prometheus-junkyard/tsdb#258 (comment)