Add new transactional producer #130
Conversation
Some design questions I could use feedback on.
Thanks a lot for this, already looking really good!
@vlovgr I think I covered everything.
Codecov Report

@@           Coverage Diff            @@
##           0.20.x     #130     +/-  ##
==========================================
- Coverage   93.75%   93.62%   -0.13%
==========================================
  Files          40       48       +8
  Lines        1153     1208      +55
  Branches       78       93      +15
==========================================
+ Hits         1081     1131      +50
- Misses         72       77       +5

Continue to review full report at Codecov.
I've been working to get a 0.19.x-compatible version of this PR set up in my project at work as a stop-gap while we work out the rough edges of the API here. The first problem I hit was picking a transactional ID. Alpakka asks for a user-defined ID, but I found that Kafka Streams auto-generates IDs for users. When I dug into the implementation, I found that Kafka Streams uses a unique transactional ID per "task" (topic/partition). Other projects like spring-kafka have followed the same pattern. From what I've read, it's effectively the only way to ensure messages don't get double-processed when partitions are rebalanced. Do you think this should change our design here at all? I could see a setup where we remove the
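For illustration, a per-task transactional ID along the lines of the Kafka Streams pattern could be derived as below. This is a hypothetical helper, not code from this PR, and the exact ID format is an assumption; the point is only that each group/topic/partition combination gets a stable, unique ID.

```java
// Hypothetical helper (not part of this PR): derive a unique transactional ID
// per "task", i.e. per consumer-group/topic/partition combination, mirroring
// the pattern Kafka Streams uses. The exact format below is illustrative.
public class TransactionalIds {
    public static String forTask(String groupId, String topic, int partition) {
        // One ID per topic-partition means a zombie instance that loses a
        // partition in a rebalance gets fenced off from committing stale
        // transactions for that partition.
        return groupId + "-" + topic + "-" + partition;
    }

    public static void main(String[] args) {
        System.out.println(forTask("my-group", "orders", 3)); // my-group-orders-3
    }
}
```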
If I understand correctly, the mentioned topic-partition strategy also means one producer per topic-partition, right? That might be a bit much in the single-instance scenario, but for multiple instances it sounds acceptable. Perhaps this is even behaviour we can toggle. (I think the trickiest part of the producer-per-topic-partition approach is managing the creation and closing of producers as partitions are assigned and revoked. Maybe we could do something clever with rebalance listeners to get this working nicely.)
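As a sketch of the bookkeeping described above, assuming plain strings for partitions and a generic handle type in place of a real producer (both assumptions for self-containment):

```java
import java.util.*;
import java.util.function.Function;

// Illustrative sketch: keep one producer handle per assigned partition,
// creating handles on assignment and discarding them on revocation, as a
// rebalance listener would. "P" stands in for a real transactional producer;
// partitions are plain "topic-partition" strings here.
public class PartitionProducers<P> {
    private final Map<String, P> producers = new HashMap<>();
    private final Function<String, P> create;

    public PartitionProducers(Function<String, P> create) {
        this.create = create;
    }

    // Would be called from ConsumerRebalanceListener#onPartitionsAssigned.
    public void assigned(Collection<String> partitions) {
        for (String tp : partitions) {
            producers.computeIfAbsent(tp, create);
        }
    }

    // Would be called from ConsumerRebalanceListener#onPartitionsRevoked;
    // a real implementation would also close each producer before removal.
    public void revoked(Collection<String> partitions) {
        for (String tp : partitions) {
            producers.remove(tp);
        }
    }

    public Set<String> active() {
        return producers.keySet();
    }
}
```

The tricky parts a real implementation would still need to handle are flushing in-flight transactions before closing a revoked producer, and doing all of this without blocking the rebalance callback for too long.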
👍 I'm ok with merging this and setting up a new PR for improvements. Thanks for all the help and review! Is there anything I can do to help with adding docs & tests on this one?
Great! If there's anything you feel is missing in terms of docs or tests, then feel free to add it. Otherwise, I'll just have a final look through this tomorrow and then merge it. 👍
Thanks a lot for this @danxmoran! 👍
Fixes #128.
Here's an implementation which (mostly) avoids touching existing classes. There's a bunch of code copy-pasted from the existing KafkaProducer and ProducerMessage which might be nice to consolidate 😄
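For context, the lifecycle such a producer wraps is Kafka's transactional protocol. A rough sketch of the state discipline is below; the method names match org.apache.kafka.clients.producer.KafkaProducer, but the state machine itself is illustrative and simplified (it omits fatal-error and fenced states, for example):

```java
// Rough sketch (not this PR's code) of the state discipline Kafka's
// transactional producer API enforces: initTransactions is one-time setup,
// then each transaction is a begin/commit (or begin/abort) pair.
public class TxLifecycle {
    public enum State { UNINITIALIZED, READY, IN_TRANSACTION }

    private State state = State.UNINITIALIZED;

    public void initTransactions() {
        require(state == State.UNINITIALIZED); // one-time setup per producer
        state = State.READY;
    }

    public void beginTransaction() {
        require(state == State.READY);
        state = State.IN_TRANSACTION;
    }

    public void commitTransaction() {
        require(state == State.IN_TRANSACTION);
        state = State.READY;
    }

    public void abortTransaction() {
        require(state == State.IN_TRANSACTION);
        state = State.READY;
    }

    public State state() {
        return state;
    }

    private void require(boolean ok) {
        if (!ok) throw new IllegalStateException("invalid transition from " + state);
    }
}
```

Part of what the PR's wrapper has to guarantee is that user code can never observe these transitions out of order, even when produce effects are run concurrently.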