Refactor Redpanda Migrator components #3026
log: res.Logger(),
shutSig: shutdown.NewSignaller(),
clientOpts: optsFn,
topicLagGauge: res.Metrics().NewGauge("redpanda_lag", "topic", "partition"),
When I added the `redpanda_migrator` input, I had both this gauge and the `kafka_lag` metadata field. I don't know if we want any of these available by default. Also, should this gauge name be somehow derived from the actual input type (`redpanda`, `redpanda_common`, `redpanda_migrator`, `redpanda_migrator_offsets`)? It does get the label of the input if set, so maybe that's sufficient.
I think the label is enough. Do we really want this lag metric for all these inputs? Probably I would assume...
I also think it's a bit overkill and I don't recall now which conversation led to this pattern. I also emit the `kafka_lag` metadata field with each message, so one could add a `metric` processor in the pipeline which creates a gauge for topics as needed. One downside with this approach is that if messages stop flowing completely, this gauge wouldn't get any updates. I think the main idea was to make it easier for people to discover this metric, but it's not clear what the perf impact might be if we consume from thousands of topics, each having multiple partitions. Should I remove it? (cc @Jeffail)
I like having the metric emitted here, it's relatively cheap, and extracting from meta is awkward enough no one is going to do it willingly.
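A minimal sketch of the metadata-based alternative discussed above, assuming each message carries the `kafka_lag` metadata field along with `kafka_topic` and `kafka_partition`. The gauge name `topic_lag` is hypothetical, and this config has not been run against this branch.

```yaml
pipeline:
  processors:
    # Sets a gauge from the kafka_lag metadata of each message that passes through.
    - metric:
        type: gauge
        name: topic_lag # hypothetical name, not part of this PR
        labels:
          topic: ${! metadata("kafka_topic") }
          partition: ${! metadata("kafka_partition") }
        value: ${! metadata("kafka_lag") }
```

As noted in the thread, a gauge built this way only updates while messages are flowing, which is one reason to keep the metric emitted by the input itself.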
Signed-off-by: Mihai Todor <todormihai@gmail.com>
- New `redpanda_migrator_offsets` input
- Fields `offset_topic`, `offset_group`, `offset_partition`, `offset_commit_timestamp` and `offset_metadata` added to the `redpanda_migrator_offsets` output
Signed-off-by: Mihai Todor <todormihai@gmail.com>
This is required in order to pull in twmb/franz-go#838. This is needed because the `redpanda_migrator` input needs to create all the matched topics during the first call to `ReadBatch()`.
Signed-off-by: Mihai Todor <todormihai@gmail.com>
- Move OnConnect topic creation logic to the output to avoid the circular dependency between the input and output (the input doesn't need to know about the output)
- Clean up error handling
Signed-off-by: Mihai Todor <todormihai@gmail.com>
This won't work until data is actually fetched...
Signed-off-by: Mihai Todor <todormihai@gmail.com>
I hijacked this PR to address several issues:

Fixed
- The `redpanda_migrator` output no longer rejects messages if it can't perform schema ID translation.
- The `redpanda_migrator` input no longer converts the kafka key to string.

Added
- New `redpanda_migrator_offsets` input.
- Fields `offset_topic`, `offset_group`, `offset_partition`, `offset_commit_timestamp` and `offset_metadata` added to the `redpanda_migrator_offsets` output.
- Field `topic_lag_refresh_period` added to the `redpanda` and `redpanda_common` inputs.
- Metric `redpanda_lag` now emitted by the `redpanda` and `redpanda_common` inputs.
- Metadata `kafka_lag` now emitted by the `redpanda` and `redpanda_common` inputs.
- The `redpanda_migrator_bundle` input and output now set labels for their subcomponents.

Changed
- The `kafka_key` and `max_in_flight` fields of the `redpanda_migrator_offsets` output are now deprecated.
- Fields `batch_size`, `multi_header`, `replication_factor`, `replication_factor_override` and `output_resource` for the `redpanda_migrator` input are now deprecated.
- Fields `kafka_key` and `max_in_flight` for the `redpanda_migrator_offsets` output are now deprecated.
- Field `batching` for the `redpanda_migrator` output is now deprecated.
- The `redpanda_migrator` input no longer emits tombstone messages.

Redpanda Migrator offset metadata
One quick way to test this is via the following config. Note how I overwrite `kafka_offset_metadata` to `foobar` in a `mapping` processor.