Fixes an issue in the index chunks/series intersect code. #2796

Merged


@cyriltovena cyriltovena commented Jun 26, 2020

This was introduced in #2700, more specifically by this line: https://github.com/cortexproject/cortex/pull/2700/files#diff-10bca0f4f31a2ca1edc507d0289b143dR537

This causes any query whose first label matcher matches nothing to return all matches of all the other labels.
This is a nasty one: the code was relying on an empty slice, so it would skip nil values instead of returning no matches. I've added a regression test proving this is fixed everywhere. I think in Cortex it probably affects performance (since you have to download chunks that are not required) but not read integrity.

I found this with @slim-bean while deploying Loki; we fought hard to pin down this exact issue while all queriers were OOMing.

/cc @gouthamve @pracucci @pstibrany We need to put all current releases on hold; this is a dangerous bug.
/cc @rfratto @slim-bean Let's wait for this to be merged before applying the latest release of Loki.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

What this PR does:

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]


@jtlisi jtlisi left a comment


LGTM

Good find


@codesome codesome left a comment


Nice find!
@bboreham: should this go into the release branch, and do we need a 1.2.0-rc.1?


@pracucci pracucci left a comment


Sorry @cyriltovena for the troubles my change has caused and thanks for the fix!

@pracucci

Let's merge this fix quickly and then @bboreham can cherry-pick it into the release-1.2 branch.

@pracucci pracucci merged commit 5942a59 into cortexproject:master Jun 26, 2020
bboreham pushed a commit that referenced this pull request Jun 26, 2020
* Fixes an issue in the index chunks/series intersect code.
bboreham added a commit that referenced this pull request Jun 26, 2020
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>