Fix Redpanda Migrator consumer group offset migration logic #3182
Conversation
Force-pushed from 2e64fd5 to 9688466.
fallback:
  - label: %s_redpanda_migrator_output
    redpanda_migrator: %s
    processors:
      - mapping: |
          meta input_label = deleted()
  # TODO: Use a DLQ
  - drop: {}
    processors:
      - log:
          message: |
            Dropping message: ${! content() } / ${! metadata() }
label: %s_redpanda_migrator_output
redpanda_migrator: %s
processors:
  - mapping: |
      meta input_label = deleted()
- check: metadata("input_label") == "redpanda_migrator_offsets_input"
  output:
    fallback:
      - label: %s_redpanda_migrator_offsets_output
        redpanda_migrator_offsets: %s
      # TODO: Use a DLQ
      - drop: {}
        processors:
          - log:
              message: |
                Dropping message: ${! content() } / ${! metadata() }
We really shouldn't be dropping messages here... I left this as a TODO initially, thinking that we might want to configure DLQs, but it's best to just block if we can't write data. If it's a transient issue, delivery will resume when the broker becomes responsive again. If not, an operator will have to look into it. Restarting Migrator should be a safe operation (even if it potentially introduces duplicate records).
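A minimal sketch of that direction, assuming the fallback / drop branch is simply removed so the child output keeps retrying and applies backpressure to the input (the label and %s placeholders mirror the template snippet above; this is an illustration, not the exact template from the PR):

output:
  # With no fallback/drop branch, a failed write is retried by the output and
  # the pipeline applies backpressure instead of discarding messages.
  label: %s_redpanda_migrator_output
  redpanda_migrator: %s
  processors:
    - mapping: |
        meta input_label = deleted()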
fallback:
  - label: %s_redpanda_migrator_output
    redpanda_migrator: %s
    processors:
      - mapping: |
          meta input_label = deleted()
  # TODO: Use a DLQ
  - drop: {}
    processors:
      - log:
          message: |
            Dropping message: ${! content() } / ${! metadata() }
label: %s_redpanda_migrator_output
redpanda_migrator: %s
processors:
  - mapping: |
      meta input_label = deleted()
- check: metadata("input_label") == "redpanda_migrator_offsets_input"
  output:
    fallback:
      - label: %s_redpanda_migrator_offsets_output
        redpanda_migrator_offsets: %s
      # TODO: Use a DLQ
      - drop: {}
        processors:
          - log:
              message: |
                Dropping message: ${! content() } / ${! metadata() }
Same logic here, just repeated for the case when we don't use schemas.
Force-pushed from 56673b2 to 62c0a1b.
Generally looks great! Is it possible to write any tests for this case? I will take a deeper look at the PR on Monday. Thanks Mihai
Thank you for having a look, Tyler! By the way, I think we won't have to decrement the timestamps in any way, since …
Force-pushed from 851ff84 to 407f84a.
		return 0, fmt.Errorf("couldn't find the last record for topic %q partition %q: %s", topic, partition, err)
	}

	return it.Next().Offset, nil
I believe we can replace this function with a ListEndOffsets call, right? It should be max(ListEndOffsets - 1, 0). I think that's because if the topic is empty you'll get back a high watermark that matches the log start position:

PARTITIONS
==========
PARTITION  LEADER  EPOCH  REPLICAS  LOG-START-OFFSET  HIGH-WATERMARK
0          0       1      [0]       0                 0

The same is also possible after retention kicks in:

PARTITIONS
==========
PARTITION  LEADER  EPOCH  REPLICAS  LOG-START-OFFSET  HIGH-WATERMARK
0          0       1      [0]       1                 1
I believe we can replace this function with a ListEndOffsets call right?
That's what I wanted to do... However, ListEndOffsets returns the offset after the offset of the last message in the topic (which I denoted as "last offset" - not sure what's a better name for it). Here's an experiment:
> docker run --rm -it --name=source -p 8081:8081 -p 9092:9092 -p 9644:9644 redpandadata/redpanda redpanda start --node-id 0 --mode dev-container --set "rpk.additional_start_flags=[--reactor-backend=epoll]" --kafka-addr 0.0.0.0:9092 --advertise-kafka-addr host.docker.internal:9092 --schema-registry-addr 0.0.0.0:8081
$ rpk topic create test -p1 -X brokers=localhost:9092
TOPIC STATUS
test OK
$ echo foobar | rpk topic produce test -X brokers=localhost:9092
Produced to partition 0 at offset 0 with timestamp 1739834099974.
getLastRecordOffset():
Offset: 0

ListEndOffsets() / ListCommittedOffsets():
Offset_end: 1
Offset_comm: 1
Here's the messy code if you wish to play with it.

getLastRecordOffset():
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	seeds := []string{"localhost:9092"}
	consumeStart := map[string]map[int32]kgo.Offset{
		"test": {
			// The default offset begins at the end; Relative(-1) rewinds to the last record.
			0: kgo.NewOffset().Relative(-1),
		},
	}
	for {
		client, err := kgo.NewClient(kgo.SeedBrokers(seeds...), kgo.ConsumePartitions(consumeStart))
		if err != nil {
			log.Fatalf("failed to create Kafka client: %s", err)
		}

		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		fetches := client.PollFetches(ctx)
		cancel()
		if fetches.IsClientClosed() {
			log.Fatalf("failed to read message: client closed")
		}
		if err := fetches.Err(); err != nil {
			log.Fatalf("failed to read message: %s", err)
		}

		it := fetches.RecordIter()
		if it.Done() {
			log.Fatalf("couldn't find any messages")
		}
		rec := it.Next()
		fmt.Println("Offset: ", rec.Offset)
		fmt.Println("Timestamp: ", rec.Timestamp)

		client.Close()
		time.Sleep(200 * time.Millisecond)
	}
}
ListEndOffsets() / ListCommittedOffsets():
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/twmb/franz-go/pkg/kadm"
	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	seeds := []string{"localhost:9092"}
	client, err := kgo.NewClient(kgo.SeedBrokers(seeds...))
	if err != nil {
		log.Fatalf("failed to create Kafka client: %s", err)
	}
	defer client.Close()

	adm := kadm.NewClient(client)
	for {
		// ListEndOffsets returns the high watermark, i.e. the offset after the last record.
		offsets, err := adm.ListEndOffsets(context.Background(), "test")
		if err != nil {
			log.Fatalf("failed to read offsets: %s", err)
		}
		o, ok := offsets.Lookup("test", 0)
		if !ok {
			log.Fatalf("failed to find offset for test-0")
		}
		fmt.Println("Offset_end: ", o.Offset)
		fmt.Println("Timestamp_end: ", o.Timestamp)

		// ListCommittedOffsets returns the last stable offset instead.
		offsets, err = adm.ListCommittedOffsets(context.Background(), "test")
		if err != nil {
			log.Fatalf("failed to read offsets: %s", err)
		}
		o, ok = offsets.Lookup("test", 0)
		if !ok {
			log.Fatalf("failed to find offset for test-0")
		}
		fmt.Println("Offset_comm: ", o.Offset)
		fmt.Println("Timestamp_comm: ", o.Timestamp)

		time.Sleep(200 * time.Millisecond)
	}
}
That's what I wanted to do... However, ListEndOffsets returns the offset after the offset of the last message (which I denoted as "last offset" - not sure what's a better name for it) in the topic. Here's an experiment:
Yeah I think we just do end offset (known as High Watermark or HWM) minus one.
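For illustration, here's a minimal sketch of that approach with kadm; the helper name lastRecordOffset, the package name, and the clamping are assumptions for the example rather than code from the PR:

package kafkautil

import (
	"context"
	"fmt"

	"github.com/twmb/franz-go/pkg/kadm"
)

// lastRecordOffset is an illustrative helper (not from the PR): it derives the
// offset of the last record in a partition from the high watermark reported by
// ListEndOffsets, i.e. max(HWM - 1, 0).
func lastRecordOffset(ctx context.Context, adm *kadm.Client, topic string, partition int32) (int64, error) {
	offsets, err := adm.ListEndOffsets(ctx, topic)
	if err != nil {
		return 0, fmt.Errorf("failed to list end offsets for topic %q: %s", topic, err)
	}
	o, ok := offsets.Lookup(topic, partition)
	if !ok {
		return 0, fmt.Errorf("couldn't find offsets for %s-%d", topic, partition)
	}
	// The high watermark points one past the last record; an empty topic
	// reports HWM == log start offset, so clamp at 0.
	if o.Offset < 1 {
		return 0, nil
	}
	return o.Offset - 1, nil
}

As the rpk output above suggests, after retention the clamped value can fall below the log start offset, so callers may still need to handle that case separately.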
- Field `is_last_stable_offset` added to the `redpanda_migrator_offsets` output.
- Metadata field `is_last_stable_offset` added to the `redpanda_migrator_offsets` input.
- Fixed a bug with the `redpanda_migrator_offsets` input and output where the consumer group update migration logic based on timestamp lookup should no longer skip ahead in the destination cluster. This should enforce at-least-once delivery guarantees.
- The `redpanda_migrator_bundle` output no longer drops messages if either the `redpanda_migrator` or the `redpanda_migrator_offsets` child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input.

Signed-off-by: Mihai Todor <todormihai@gmail.com>
Force-pushed from 1d6ed07 to 742f029.
Force-pushed from 93053fa to e8b2097.
Just one question about retries and then I think it LGTM
// TODO: Since `read_until` can't guarantee that we will read `count` messages, `topic` needs to have exactly `count`
// messages in it. We should add some mechanism to the Kafka inputs to allow us to read a range of offsets if possible
// (or up to a certain offset).
It's more about when we commit, not what we read? So do we ack the messages after check returns true?
So do we ack the messages after check returns true?

Yep, if I use read_until there, instead of reading 5 messages out of 10, it reads about 6-7... read_until should tell the child input to shut down at that point, but it's not immediate, so I think it gets a bit of time to ack before kafka_franz tears down the franz-go client. I'm not entirely sure how it's wired under the hood, but I recall bumping into issues with read_until before. For example, this generates 10 messages:
input:
  read_until:
    check: counter() == 2
    input:
      generate:
        count: 20
        batch_size: 5
        interval: 0s
        mapping: root = counter()
Yeah, I was reading the code for read_until and was surprised that check only runs on the first message of every batch and the whole batch is still passed through.
Thank you Tyler for the suggestion! <3

Signed-off-by: Mihai Todor <todormihai@gmail.com>
Force-pushed from c13626d to 71d120e.
I did the cleanup & I'll merge it later tomorrow in case you think of other stuff I should cover in there. Thank you again!
- Field `is_high_watermark` added to the `redpanda_migrator_offsets` output.
- Metadata field `kafka_is_high_watermark` added to the `redpanda_migrator_offsets` input.
- Fixed a bug with the `redpanda_migrator_offsets` input and output where the consumer group update migration logic based on timestamp lookup should no longer skip ahead in the destination cluster. This should enforce at-least-once delivery guarantees.
- The `redpanda_migrator_bundle` output no longer drops messages if either the `redpanda_migrator` or the `redpanda_migrator_offsets` child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input.

TODO: