refactor(bloom planner): Compute gaps and build tasks from metas and TSDBs #12994
pkg/bloombuild/planner/config.go
-func (cfg *Config) RegisterFlagsWithPrefix(_ string, _ *flag.FlagSet) {
-	// TODO: Register flags with flagsPrefix
+func (cfg *Config) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
+	f.DurationVar(&cfg.PlanningInterval, prefix+".interval", 10*time.Minute, "Interval at which to re-run the bloom creation planning.")
nit: IMO, 10m is too frequent; it's more frequent than the TSDB index is compacted.
I changed it to 8 hours, so it runs three times a day. WDYT?
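For illustration, here is a minimal sketch of the flag registration with the 8-hour default agreed on above. The trimmed-down Config struct, the parseInterval helper, and the "bloom-build.planner" prefix are assumptions made for this example, not taken from the PR.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// Config is a hypothetical, trimmed-down stand-in for the planner config.
type Config struct {
	PlanningInterval time.Duration
}

// RegisterFlagsWithPrefix registers the planner flags under the given prefix.
// The 8h default reflects the value discussed in the review thread.
func (cfg *Config) RegisterFlagsWithPrefix(prefix string, f *flag.FlagSet) {
	f.DurationVar(&cfg.PlanningInterval, prefix+".interval", 8*time.Hour,
		"Interval at which to re-run the bloom creation planning.")
}

// parseInterval is a demo helper: it parses args with a fresh FlagSet and
// returns the resulting planning interval. The prefix is an assumption.
func parseInterval(args []string) (time.Duration, error) {
	var cfg Config
	fs := flag.NewFlagSet("planner", flag.ContinueOnError)
	cfg.RegisterFlagsWithPrefix("bloom-build.planner", fs)
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return cfg.PlanningInterval, nil
}

func main() {
	def, _ := parseInterval(nil)
	custom, _ := parseInterval([]string{"-bloom-build.planner.interval=1h"})
	fmt.Println("default:", def, "override:", custom)
}
```

Keeping the default in the flag registration means operators who never set the flag get the less frequent schedule, while a shorter interval remains an explicit opt-in.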
pkg/bloombuild/planner/planner.go
		case <-ticker.C:
			if err := p.runOne(ctx); err != nil {
				return err
Any error would stop the planner service. Is this intentional?
I would expect an error to be logged and an error counter to be increased, but not the service to be shut down.
That looked a bit weird to me as well when I copied it from the bloom compactor:
loki/pkg/bloomcompactor/bloomcompactor.go
Lines 146 to 148 in a9345d0
			if err := c.runOne(ctx); err != nil {
				return err
			}
I'll log the error instead.
> error counter to be increased
That's already done inside the runOne function:
loki/pkg/bloombuild/planner/planner.go
Lines 102 to 105 in bb6b3d9
		status = statusFailure
	)
	defer func() {
		p.metrics.buildCompleted.WithLabelValues(status).Inc()
pkg/bloombuild/planner/planner.go
func (p *Planner) tables(ts time.Time) *dayRangeIterator {
	// adjust the minimum by one to make it inclusive, which is more intuitive
	// for a configuration variable
	adjustedMin := min(p.cfg.MinTableOffset - 1)
What is min() used for here with a single argument?
I copied that function over. I think the min is probably a leftover. Removed.
pkg/validation/limits.go
@@ -205,6 +205,9 @@ type Limits struct {
 	BloomCompactorMaxBlockSize flagext.ByteSize `yaml:"bloom_compactor_max_block_size" json:"bloom_compactor_max_block_size" category:"experimental"`
 	BloomCompactorMaxBloomSize flagext.ByteSize `yaml:"bloom_compactor_max_bloom_size" json:"bloom_compactor_max_bloom_size" category:"experimental"`

+	BloomCreationEnabled             bool `yaml:"bloom_creation_enabled" json:"bloom_creation_enabled" category:"experimental"`
+	BloomSplitSeriesKeyspaceByFactor int  `yaml:"bloom_split_series_keyspace_by_factor" json:"bloom_split_series_keyspace_by_factor" category:"experimental"`
nit: Personally, I find the name "factor" not quite right, because it implies that the value is multiplied with something.
I would name it something like "series shard size".
I wouldn't use "shard" since it's not sharding at all. What about bloom_split_series_keyspace_by: 256? I think it reads well enough.
Note that we'll likely replace this keyspace split with something smarter using TSDB stats soon, so I wouldn't worry too much about naming here.
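As a rough illustration of what splitting the series keyspace by a value such as 256 could mean, the sketch below divides the 64-bit fingerprint space into contiguous, roughly equal ranges. The thread does not spell out the exact semantics, so the equal-width split, the bounds type, and splitKeyspace are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
)

// bounds is a hypothetical inclusive fingerprint range.
type bounds struct{ Min, Max uint64 }

// splitKeyspace divides the full uint64 keyspace into n contiguous ranges of
// (almost) equal width. The last range absorbs any remainder.
// Assumption: n is a small positive number such as 256.
func splitKeyspace(n int) []bounds {
	if n <= 1 {
		return []bounds{{0, math.MaxUint64}}
	}

	step := math.MaxUint64 / uint64(n)
	out := make([]bounds, 0, n)

	var lo uint64
	for i := 0; i < n; i++ {
		hi := lo + step
		if i == n-1 {
			hi = math.MaxUint64 // make sure the top of the keyspace is covered
		}
		out = append(out, bounds{Min: lo, Max: hi})
		lo = hi + 1 // wraps to 0 after the last range, where it is unused
	}
	return out
}

func main() {
	ranges := splitKeyspace(256)
	fmt.Printf("%d ranges, first: %016x-%016x\n", len(ranges), ranges[0].Min, ranges[0].Max)
}
```

Read this way, the value is a divisor of the keyspace rather than a multiplier, which matches the objection to the word "factor".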
…TSDBs (grafana#12994) Signed-off-by: Joel Takvorian <jtakvori@redhat.com>
commit 0bfd0ad
Merge: 68aa188 efdae3d
Author: Trevor Whitney <trevorjwhitney@gmail.com>
Date: Thu May 23 17:04:32 2024 -0600
    Merge branch 'main' into sample-count-and-bytes

commit 68aa188
Author: Trevor Whitney <trevorjwhitney@gmail.com>
Date: Thu May 23 17:03:32 2024 -0600
    feat: guard aggregation behavior behind a feature flag

commit efdae3d
Author: hayden <haydenfuss@gmail.com>
Date: Thu May 23 16:25:50 2024 -0400
    feat(helm): Support for PVC Annotations for Non-Distributed Modes (#12023)
    Signed-off-by: hfuss <hayden.fuss@kaleido.io>
    Co-authored-by: J Stickler <julie.stickler@grafana.com>
    Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>

commit f0d6a92
Author: Trevor Whitney <trevorjwhitney@gmail.com>
Date: Thu May 23 14:03:32 2024 -0600
    feat: reject filter queries to /patterns endpoint

commit dc620e7
Author: Trevor Whitney <trevorjwhitney@gmail.com>
Date: Wed May 8 14:08:44 2024 -0600
    feat: collect and serve pre-agg bytes and count
    * pre-aggregate bytes and count per stream in the pattern ingester
    * serve bytes_over_time and count_over_time queries from the patterns endpoint

commit 97212ea
Author: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Thu May 23 12:10:48 2024 -0400
    feat: Added Interactive Sandbox to Quickstart tutorial (#12701)

commit 1111595
Author: Vladyslav Diachenko <82767850+vlad-diachenko@users.noreply.github.com>
Date: Thu May 23 13:18:16 2024 +0300
    feat: new stream count limiter (#13006)
    Signed-off-by: Vladyslav Diachenko <vlad.diachenko@grafana.com>
    Co-authored-by: JordanRushing <rushing.jordan@gmail.com>

commit 987e551
Author: Quentin Bisson <quentin@giantswarm.io>
Date: Thu May 23 02:15:52 2024 +0200
    fix: allow cluster label override in bloom dashboards (#13012)
    Signed-off-by: QuentinBisson <quentin@giantswarm.io>

commit d3c9cec
Author: Quentin Bisson <quentin@giantswarm.io>
Date: Thu May 23 01:59:28 2024 +0200
    fix: upgrade old plugin for the loki-operational dashboard. (#13016)
    Signed-off-by: QuentinBisson <quentin@giantswarm.io>

commit 8d9fb68
Author: Quentin Bisson <quentin@giantswarm.io>
Date: Wed May 22 22:00:08 2024 +0200
    fix: remove unneccessary disk panels for ssd read path (#13014)
    Signed-off-by: QuentinBisson <quentin@giantswarm.io>

commit 1948899
Author: Quentin Bisson <quentin@giantswarm.io>
Date: Wed May 22 15:16:29 2024 +0200
    fix: Mixins - Add missing log datasource on loki-deletion (#13011)

commit efd8f5d
Author: Salva Corts <salva.corts@grafana.com>
Date: Wed May 22 10:43:32 2024 +0200
    refactor(blooms): Add queue to bloom planner and enqueue tasks (#13005)

commit d6f29fc
Author: Vitor Gomes <41302394+vitoorgomes@users.noreply.github.com>
Date: Wed May 22 04:34:42 2024 +1200
    docs: update otlp ingestion with correct endpoint and add endpoint to reference api docs (#12996)

commit 3195036
Author: Salva Corts <salva.corts@grafana.com>
Date: Tue May 21 13:12:24 2024 +0200
    refactor(bloom planner): Compute gaps and build tasks from metas and TSDBs (#12994)

commit 7a3338e
Author: Jonathan Davies <jpds@protonmail.com>
Date: Tue May 21 10:41:42 2024 +0100
    feat: loki/main.go: Log which config file path is used on startup (#12985)
    Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>

commit bf8a278
Author: Ashwanth <iamashwanth@gmail.com>
Date: Tue May 21 12:56:07 2024 +0530
    chore: remove duplicate imports (#13001)

commit 1f5291a
Author: Ashwanth <iamashwanth@gmail.com>
Date: Tue May 21 12:38:02 2024 +0530
    fix(indexstats): do not collect stats from "IndexStats" lookups for other query types (#12978)

commit 8442dca
Author: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Mon May 20 17:52:17 2024 -0400
    feat: Added getting started video (#12975)

commit 75ccf21
Author: Christian Haudum <christian.haudum@gmail.com>
Date: Mon May 20 17:14:40 2024 +0200
    feat(blooms): Separate page buffer pools for series pages and bloom pages (#12992)
    Series pages are much smaller than bloom pages and therefore can make use of a separate buffer pool with different buckets. The second commit fixes a possible panic.
    Signed-off-by: Christian Haudum <christian.haudum@gmail.com>

commit 94d610e
Author: Yarden Shoham <git@yardenshoham.com>
Date: Mon May 20 18:05:50 2024 +0300
    docs: Fix broken link in the release notes (#12990)
    Co-authored-by: J Stickler <julie.stickler@grafana.com>

commit 31a1314
Author: choeffer <christian.hoeffer@maibornwolff.de>
Date: Mon May 20 16:39:25 2024 +0200
    docs(install-monolithic): add quotation marks (#12982)
    Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>

commit 8978ecf
Author: Salva Corts <salva.corts@grafana.com>
Date: Mon May 20 12:36:22 2024 +0200
    feat: Boilerplate for new bloom build planner and worker components. (#12989)
What this PR does / why we need it:
This PR copies logic from the bloom compactor to the new bloom planner component. This is what it does:
planning_interval
:This is the definition of a task:
Special notes for your reviewer:
tsdb.go and tsdb_test.go
util.go: I moved the findGaps function from the compactor controller and renamed it to FindGapsInFingerprintBounds. I also copied its test to util_test.go.
tableIterator.go: I extracted the dayRangeIterator. The code is the same as the one from the compactor.
planner.go: planner.tables, gapsBetweenTSDBsAndMetas, blockPlansForGaps, planner.runOne, planner.loadWork
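The gap computation named in these notes can be sketched as follows. This is a simplified, self-contained illustration of finding the uncovered parts of an ownership range given the ranges already covered by metas. It is not the code moved in this PR: the bounds type and the findGaps signature here are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// bounds is a hypothetical inclusive fingerprint range.
type bounds struct{ Min, Max uint64 }

// findGaps returns the parts of owner that are not covered by any range in
// covered. Input ranges may overlap and may extend beyond owner.
func findGaps(owner bounds, covered []bounds) []bounds {
	// Sort a copy by lower bound so the caller's slice is left untouched.
	sorted := append([]bounds(nil), covered...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Min < sorted[j].Min })

	var gaps []bounds
	next := owner.Min // lowest fingerprint not yet known to be covered

	for _, c := range sorted {
		if c.Max < next {
			continue // entirely below the point we have already reached
		}
		if c.Min > owner.Max {
			break // entirely above the ownership range
		}
		if c.Min > next {
			gaps = append(gaps, bounds{Min: next, Max: c.Min - 1})
		}
		if c.Max >= owner.Max {
			return gaps // covered through the end; also avoids overflow below
		}
		next = c.Max + 1
	}

	// Whatever remains between next and the end of the range is a gap.
	return append(gaps, bounds{Min: next, Max: owner.Max})
}

func main() {
	owner := bounds{Min: 0, Max: 100}
	covered := []bounds{{10, 20}, {15, 30}, {50, 100}}
	fmt.Println(findGaps(owner, covered))
}
```

Each returned gap is a candidate for a build task, since no existing meta covers that part of the keyspace.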