Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic structure of bloom gateways #10782

Merged
merged 29 commits into from
Oct 16, 2023
Merged

Conversation

chaudum
Copy link
Contributor

@chaudum chaudum commented Oct 3, 2023

Summary

This pull requests adds the basic structure for the new bloom gateway component.

  • Adds new bloom-gateway target that runs with multiple instances joined by a ring
  • Adds a querier and client component on the index gateway to filter chunk refs
  • Adds the gRPC protobuf definitions for commication between index gateways and bloom gateways
  • Adds a store component used on the bloom gateways to query binary bloom files

			     Querier   Query Frontend
			        |           |
			................................... service boundary
			        |           |
			        +----+------+
			             |
			     indexgateway.Gateway**
			             |
			   bloomgateway.BloomQuerier
			             |
			   bloomgateway.GatewayClient
			             |
			  logproto.BloomGatewayClient
			             |
			................................... service boundary
			             |
			      bloomgateway.Gateway
			             |
			       bloomshipper.Store
			             |
			      bloomshipper.Shipper
			             |
                        bloomshipper.BloomFileClient**
			             |
			        ObjectClient**
			             |
			................................... service boundary
			             |
		         object storage

** not part of this PR

This PR still contains a lot of TODOs and possibilities for optimisations, which will be addressed in subsequent pull requests.

@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Oct 3, 2023
pkg/bloomgateway/bloomgateway.go Outdated Show resolved Hide resolved
pkg/storage/bloom/bloom-shipper/bloom_shipper.go Outdated Show resolved Hide resolved
pkg/storage/stores/shipper/bloomshipper/store.go Outdated Show resolved Hide resolved
@chaudum chaudum force-pushed the chaudum/bloom-gateway-structure branch from f415580 to 67647f4 Compare October 4, 2023 06:08
Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some suggestions but approving 👍

pkg/bloomgateway/client.go Outdated Show resolved Hide resolved
pkg/bloomgateway/client.go Show resolved Hide resolved
pkg/bloomgateway/querier.go Show resolved Hide resolved
pkg/bloomgateway/querier.go Outdated Show resolved Hide resolved

// NewRingManager instantiates a new RingManager instance.
// The other functions will assume the RingManager was instantiated through this function.
func NewRingManager(mode ManagerMode, cfg Config, logger log.Logger, registerer prometheus.Registerer) (*RingManager, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: probably a good thing to refactor into independent code. This looks to be repeated in both the scheduler & index-gw

BloomGatewayShardSize(tenantID string) int
}

type ShardingStrategy interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this also looks copy+pasted from the index-gw, can't we reuse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will consolidate both the ring manager scaffolding and the sharding strategy in a separate PR after merging this one.

pkg/logproto/bloomgateway.proto Outdated Show resolved Hide resolved
@chaudum chaudum force-pushed the chaudum/bloom-gateway-structure branch from 3ba9e8e to aec3d7b Compare October 5, 2023 11:30
@chaudum chaudum changed the title WIP: Add basic structure of bloom gateways Add basic structure of bloom gateways Oct 5, 2023
Copy link
Contributor

@poyzannur poyzannur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor comments/questions. LGTM.

|
bloomshipper.Shipper
|
bloomshipper.BloomFileClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drawing the flow like this is a really nice practice to follow, i'll adopt it. nit tab tab, alignment on this line :)

// Since we set the shard size to replication factor if shard size is 0, this
// can only happen if both the shard size and the replication factor are set
// to 0.
if shardSize <= 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can shardSize be negative? why not ==

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shardSize is an int. And the config would allow setting a negative value.

Comment on lines 20 to 31
type ForEachBlockCallback func(bq BlockQuerier) error

type ReadShipper interface {
ForEachBlock(ctx context.Context, tenant string, from, through time.Time, fingerprints []uint64, callback ForEachBlockCallback) error
}

type WriteShipper interface {
}

type Shipper interface {
ReadShipper
WriteShipper
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this block also have bloom-compactor in mind? otherwise why is WriteShipper needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could remove it, because it's not needed for the purposes of this PR.

Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome. just one important comment that I believe we need to address before merging it

Comment on lines +105 to +123
rs, err = subRing.Get(uint32(blockRef.FromFp), BlocksOwnerSync, bufDescs, bufHosts, bufZones)
if err != nil {
return nil, err
}
// Include the block only if it belongs to this bloom gateway shard.
if rs.Includes(s.instanceID) {
filteredBlockRefs = append(filteredBlockRefs, blockRef)
continue
}

rs, err = subRing.Get(uint32(blockRef.ThroughFp), BlocksOwnerSync, bufDescs, bufHosts, bufZones)
if err != nil {
return nil, err
}
// Include the block only if it belongs to this bloom gateway shard.
if rs.Includes(s.instanceID) {
filteredBlockRefs = append(filteredBlockRefs, blockRef)
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we include the block by two factors: 1) if FromFp belongs to this instance 2) if ThroughFp belongs to this instance, it means that two separate bloomGw might own the same block, one by FromFp another one by ThroughFp .
We should check only FromFp , it's enough to identify the owner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually OK that multiple instances own the same block, if the same block span into the key ranges from both instances.

screenshot_20231006_082906

@chaudum chaudum marked this pull request as ready for review October 6, 2023 06:30
@chaudum chaudum requested a review from a team as a code owner October 6, 2023 06:30
pkg/bloomgateway/bloomgateway_test.go Outdated Show resolved Hide resolved
pkg/bloomgateway/client.go Show resolved Hide resolved
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks cool

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
on top of existing shipper code

The existing bloom_shipper.Shipper will be renamed to
bloom_shipper.BloomClient since it is mainly responsible for file
handling from/to object storage and has a lower level interface than
what the bloom gateway consumes.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Provide no-op implementation of the shipper to define boundary of this
pull request.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@chaudum chaudum force-pushed the chaudum/bloom-gateway-structure branch from 6daeb41 to 796154a Compare October 16, 2023 06:40
@chaudum chaudum enabled auto-merge (squash) October 16, 2023 06:48
@chaudum chaudum disabled auto-merge October 16, 2023 07:47
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks awesome, but I found some cases when isOutsideRange()does not work


blockRefs, err := s.getActiveBlockRefs(ctx, tenantID, startTimestamp, endTimestamp, minFingerprint, maxFingerprint)
level.Debug(s.logger).Log("msg", "ForEachBlock", "tenant", tenantID, "from", from, "through", through, "fingerprints", fingerprints)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be thousands of fingerprints, lets skip printing it in the logs

}
return i
}
return len(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must return len(s) - 1 otherwise it will be index out of range error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually possible that the index is out of range: in case the value v is greater than the last element of s.

Comment on lines 172 to 173
idx := getPosition[[]uint64](fingerprints, b.MinFingerprint)
return b.MaxFingerprint < fingerprints[idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check should give a true if no one item from fingerprints []uint64 is greater or equal to b.MinFingerprint and less or equal to b.MaxFingerprint.
however, this method returns false for the block b.MinFingerprint: 3 , b.MaxFingerprint: 5 if fingerprints []uint64 is :

  1. [2] - should be true because the block does not cover this fingerprint
  2. [6] - should be true because the block does not cover this fingerprint

could you please add more test cases to cover this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cases are actually covered in line 163-166

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Copy link
Contributor

@vlad-diachenko vlad-diachenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chaudum chaudum enabled auto-merge (squash) October 16, 2023 13:51
@chaudum chaudum merged commit b49b3ce into main Oct 16, 2023
3 checks passed
@chaudum chaudum deleted the chaudum/bloom-gateway-structure branch October 16, 2023 14:00
@poyzannur poyzannur mentioned this pull request Oct 24, 2023
8 tasks
MichelHollands pushed a commit that referenced this pull request Oct 24, 2023
**What this PR does / why we need it**:
The errors below were introduced during adding initial service structure
for BloomGateway #10782 and
BloomCompactor #10748 components.

```
docker-loki-read-3      | panic: runtime error: invalid memory address or nil pointer dereference
docker-loki-read-3      | [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x9e6d90]
docker-loki-read-3      |
docker-loki-read-3      | goroutine 1 [running]:
docker-loki-read-3      | github.com/grafana/dskit/kv.createClient({_, _}, {_, _}, {{{0x22f6128, 0xe}, {{0x0, 0x0}}, 0x4a817c800, 0x0, ...}, ...}, ...)
docker-loki-read-3      |       /src/loki/vendor/github.com/grafana/dskit/kv/client.go:158 +0x360
docker-loki-read-3      | github.com/grafana/dskit/kv.NewClient({{0x22ec444, 0xa}, {0x22ee95d, 0xb}, {{{0x22f6128, 0xe}, {{...}}, 0x4a817c800, 0x0, 0x3ff0000000000000, ...}, ...}, ...}, ...)
```

This PR initialies new components as part of MemberlistKV.


**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:
Tested on local via docker workflow

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory. <!--
TODO(salvacorts): Add example PR -->
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
### Summary

This pull requests adds the basic structure for the new bloom gateway component.

- Adds new `bloom-gateway` target that runs with multiple instances joined by a ring
- Adds a querier and client component on the index gateway to filter chunk refs
- Adds the gRPC protobuf definitions for commication between index gateways and bloom gateways
- Adds a store component used on the bloom gateways to query binary bloom files

```

			     Querier   Query Frontend
			        |           |
			................................... service boundary
			        |           |
			        +----+------+
			             |
			     indexgateway.Gateway**
			             |
			   bloomgateway.BloomQuerier
			             |
			   bloomgateway.GatewayClient
			             |
			  logproto.BloomGatewayClient
			             |
			................................... service boundary
			             |
			      bloomgateway.Gateway
			             |
			       bloomshipper.Store
			             |
			      bloomshipper.Shipper
			             |
                        bloomshipper.BloomFileClient**
			             |
			        ObjectClient**
			             |
			................................... service boundary
			             |
		         object storage

** not part of this PR
```

This PR still contains a lot of TODOs and possibilities for optimisations, which will be addressed in subsequent pull requests.

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
**What this PR does / why we need it**:
The errors below were introduced during adding initial service structure
for BloomGateway grafana#10782 and
BloomCompactor grafana#10748 components.

```
docker-loki-read-3      | panic: runtime error: invalid memory address or nil pointer dereference
docker-loki-read-3      | [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x9e6d90]
docker-loki-read-3      |
docker-loki-read-3      | goroutine 1 [running]:
docker-loki-read-3      | github.com/grafana/dskit/kv.createClient({_, _}, {_, _}, {{{0x22f6128, 0xe}, {{0x0, 0x0}}, 0x4a817c800, 0x0, ...}, ...}, ...)
docker-loki-read-3      |       /src/loki/vendor/github.com/grafana/dskit/kv/client.go:158 +0x360
docker-loki-read-3      | github.com/grafana/dskit/kv.NewClient({{0x22ec444, 0xa}, {0x22ee95d, 0xb}, {{{0x22f6128, 0xe}, {{...}}, 0x4a817c800, 0x0, 0x3ff0000000000000, ...}, ...}, ...}, ...)
```

This PR initialies new components as part of MemberlistKV.


**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:
Tested on local via docker workflow

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](grafana@d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory. <!--
TODO(salvacorts): Add example PR -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XXL type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants