We have two regions, US and EUROPE. Each has a topic holding the sales that happened in that region.
We want each region to have a way to see all sales from all regions.
Why do we need to set topic.config.sync=false?
In this example, if EUROPE_sales and US_sales do not share the same configuration, what should the merged sales topic look like? Replicator cannot tell which source topic's configuration to copy, so keeping the configurations in sync is now up to you, not Replicator.
If you do not want this problem, just use Replicator in its default mode, which keeps topic names unchanged :)
This setup is executable in rename-format.sh
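For illustration only, a rename-based Replicator configuration might look roughly like the sketch below. It is not the contents of rename-format.sh. The connector name, the broker-us and connect-europe host names, and the choice to copy US_sales into sales on the EUROPE side are assumptions; only the property names are standard Replicator settings, and licensing and security settings are left out.

```shell
# Hypothetical Replicator config running on the EUROPE side:
# copy US_sales from the US cluster into the local "sales" topic.
RENAME_CONFIG=$(cat <<'EOF'
{
  "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
  "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "src.kafka.bootstrap.servers": "broker-us:9092",
  "dest.kafka.bootstrap.servers": "broker-europe:9092",
  "topic.whitelist": "US_sales",
  "topic.rename.format": "sales",
  "topic.config.sync": "false"
}
EOF
)
echo "$RENAME_CONFIG"

# To submit it (assumes a Connect worker reachable at connect-europe:8083):
# curl -s -X PUT -H "Content-Type: application/json" \
#      --data "$RENAME_CONFIG" \
#      http://connect-europe:8083/connectors/replicate-us-sales/config
```

Because several source topics end up in one destination topic, topic.config.sync is disabled here and the configuration of sales has to be managed by hand.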
With the RegexRouter transform (https://docs.confluent.io/current/connect/transforms/regexrouter.html), you will need to specify:
topic.config.sync=false
topic.auto.create=false
topic.preserve.partitions=false
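As a rough sketch of how these three settings could sit next to a RegexRouter transform, consider the configuration below. It is not the contents of regexrouter.sh. The transform alias mergeSales, the broker-us and connect-europe host names, the connector name, and the US_sales to sales mapping are assumptions made for the example.

```shell
# Hypothetical Replicator config that renames topics with the RegexRouter SMT
# instead of topic.rename.format.
REGEXROUTER_CONFIG=$(cat <<'EOF'
{
  "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
  "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "src.kafka.bootstrap.servers": "broker-us:9092",
  "dest.kafka.bootstrap.servers": "broker-europe:9092",
  "topic.whitelist": "US_sales",
  "topic.config.sync": "false",
  "topic.auto.create": "false",
  "topic.preserve.partitions": "false",
  "transforms": "mergeSales",
  "transforms.mergeSales.type": "org.apache.kafka.connect.transforms.RegexRouter",
  "transforms.mergeSales.regex": ".*_sales",
  "transforms.mergeSales.replacement": "sales"
}
EOF
)
echo "$REGEXROUTER_CONFIG"

# To submit it (assumes a Connect worker reachable at connect-europe:8083):
# curl -s -X PUT -H "Content-Type: application/json" \
#      --data "$REGEXROUTER_CONFIG" \
#      http://connect-europe:8083/connectors/replicate-us-sales-smt/config
```

With topic.auto.create=false, the sales topic is presumably expected to exist on the destination cluster before the connector starts.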
This setup is executable in regexrouter.sh
This consumer-only setup is executable in simplest.sh
The main point is that you can subscribe to a list of topics or to a regex.
This is native to the consumer API: https://docs.confluent.io/current/clients/javadocs/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe-java.util.regex.Pattern-
It is also available in kafka-console-consumer:
docker-compose exec broker-europe kafka-console-consumer --bootstrap-server broker-europe:9092 --whitelist "sales_.*" --from-beginning --max-messages 20 --consumer-property metadata.max.age.ms=30000
We specify metadata.max.age.ms as 30000 (30s) because the default is 5 minutes and we want new topics to be discovered faster.
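To get a feel for which topics a pattern such as sales_.* selects, it can be tried against a list of names without touching a broker. The topic names below are hypothetical, and grep -x is only an approximation of the whole-name match that a regex subscription performs.

```shell
# Hypothetical topic names; only those whose full name matches "sales_.*" are kept.
TOPICS="sales_EUROPE
sales_US
orders
presales_US"

# -E: extended regex, -x: the whole line (topic name) must match the pattern.
MATCHED=$(printf '%s\n' "$TOPICS" | grep -E -x 'sales_.*')
echo "$MATCHED"
```

Here presales_US is not selected because the match covers the full topic name rather than a substring.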
Subscribing by regex simplifies everything, as there is
- no duplication
- simpler reasoning
- simpler monitoring
- simpler ACL management
- less data traveling around
- fewer connectors
- native
- ...
Basically, this is the same as the previous solution, plus we merge all the local sales_ topics into a single sales topic.
Following on from the previous --whitelist "sales_.*" option, the connector uses the topic.regex property.
We used metadata.max.age.ms for the client; for the connector we use topic.poll.interval.ms to specify the discovery frequency.
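A minimal sketch of such a connector configuration is shown below. It assumes that source and destination are the same cluster (broker-europe), that the merge is done with topic.rename.format, and that a 30s poll interval is wanted to mirror the consumer example. The connector name and the connect-europe host are hypothetical.

```shell
# Hypothetical Replicator config merging every local sales_* topic into "sales".
MERGE_CONFIG=$(cat <<'EOF'
{
  "connector.class": "io.confluent.connect.replicator.ReplicatorSourceConnector",
  "key.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "value.converter": "io.confluent.connect.replicator.util.ByteArrayConverter",
  "src.kafka.bootstrap.servers": "broker-europe:9092",
  "dest.kafka.bootstrap.servers": "broker-europe:9092",
  "topic.regex": "sales_.*",
  "topic.poll.interval.ms": "30000",
  "topic.rename.format": "sales",
  "topic.config.sync": "false"
}
EOF
)
echo "$MERGE_CONFIG"

# To submit it (assumes a Connect worker reachable at connect-europe:8083):
# curl -s -X PUT -H "Content-Type: application/json" \
#      --data "$MERGE_CONFIG" \
#      http://connect-europe:8083/connectors/merge-sales/config
```

The pattern sales_.* does not match the destination name sales, so the merged topic is not picked up again as a source.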
Beware that while it seems simple, there is indeed
- duplication
- it somewhat breaks the ACL mindset, although that can also be useful