Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduces cache to TSDB postings #9621

Merged
merged 96 commits into from
Aug 3, 2023
Merged

Conversation

DylanGuedes
Copy link
Contributor

@DylanGuedes DylanGuedes commented Jun 4, 2023

What this PR does / why we need it:
Introduces a new interface for fetching TSDB postings and implements a cached TSDB postings.

  • The new postings interface is named PostingsReader and contains two implementations: a simple one, which is identical to the former Postings behavior, and a new one, that caches postings using the existing index_queries_cache_config. By default, the simple/former implementation is used. The new cached implementation can be configured in the following way:
storage_config:
  tsdb_shipper:
    cache_postings: true

In the future we'll remove the cache_postings flag/configuration and change the behavior to cache postings by default.

Which issue(s) this PR fixes:
N/A

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • CHANGELOG.md updated
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR

@DylanGuedes DylanGuedes force-pushed the check-index-postings-calls branch from 9cf0bea to 2753a89 Compare June 8, 2023 17:21
@DylanGuedes DylanGuedes force-pushed the check-index-postings-calls branch from 02d4206 to 6878e4b Compare June 12, 2023 21:03
@DylanGuedes DylanGuedes force-pushed the check-index-postings-calls branch from 00ddd35 to e9d2650 Compare June 12, 2023 21:18
Copy link
Contributor

@chaudum chaudum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I haven't looked into the LRU implementation too much in detail, but rather focussed on the overall integration.

pkg/storage/chunk/cache/cache.go Outdated Show resolved Hide resolved
pkg/storage/chunk/cache/cache.go Outdated Show resolved Hide resolved
pkg/storage/chunk/cache/lru_cache.go Outdated Show resolved Hide resolved
pkg/storage/chunk/cache/lru_cache.go Outdated Show resolved Hide resolved
pkg/storage/chunk/cache/lru_cache.go Outdated Show resolved Hide resolved
pkg/storage/stores/tsdb/cached_postings_index.go Outdated Show resolved Hide resolved
pkg/storage/stores/tsdb/cached_postings_index.go Outdated Show resolved Hide resolved
pkg/storage/stores/tsdb/single_file_index.go Outdated Show resolved Hide resolved
pkg/storage/stores/tsdb/single_file_index.go Show resolved Hide resolved
pkg/storage/stores/tsdb/cached_postings_index.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, I'm excited to see it land!

})

// restart ingester which should flush the chunks and index
require.NoError(t, tIngester.Restart())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could also hit /flush here to be even more deliberate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part was just copied from the other tests

tIndexGateway = clu.AddComponent(
"index-gateway",
"-target=index-gateway",
"-tsdb.enable-postings-cache=true",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test passes with this flag set to false. is there any way to this test to fail when caching is not happening? where are the objects coming from (local filesystem?), could we change the data at rest to prove the data we're getting is from cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an assertion that checks the cache added/misses/gets metrics results. they guarantee we're covering the cache behavior using the FIFO cache.

"github.com/grafana/loki/pkg/util/encoding"
)

type PostingsReader interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostingsIterator?

for i := range ms {
sorted[i] = labels.Matcher{Type: ms[i].Type, Name: ms[i].Name, Value: ms[i].Value}
}
sort.Sort(sortedLabelMatchers(sorted))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's a little weird we're sorting something called sorted...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorteableLabelMatchers maybe?

Copy link
Collaborator

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving to unblock pending the few small changes left. great work Dylan!

@DylanGuedes DylanGuedes merged commit 1221658 into main Aug 3, 2023
@DylanGuedes DylanGuedes deleted the check-index-postings-calls branch August 3, 2023 18:03
DylanGuedes added a commit that referenced this pull request Nov 7, 2023
DylanGuedes added a commit that referenced this pull request Nov 7, 2023
This reverts commit 1221658.

**What this PR does / why we need it**:
We decided to discontinue the postings cache a few releases ago for not
finding any evidences that postings was the culprit for the bad igw
performance. Plus, we noticed a major issue with it: it doesn't bubble
down shards correctly, so every query end up processing every shard.
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
)

This reverts commit 1221658.

**What this PR does / why we need it**:
We decided to discontinue the postings cache a few releases ago for not
finding any evidences that postings was the culprit for the bad igw
performance. Plus, we noticed a major issue with it: it doesn't bubble
down shards correctly, so every query end up processing every shard.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XXL type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants