Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#374 - allow concurrent consumption of events #29

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

nachogiljaldo
Copy link

@nachogiljaldo nachogiljaldo commented Sep 30, 2023

Closes ThreeDotsLabs/watermill#374

Why?

It would be nice to have a simple way to increase the concurrency of message processing. With the current options, it is only possible to consume one message at the time (i.e. you need to get > process > ACK > get > process > ...).

That is not actually required in many use cases where events with the same key are sent to the same partition and, therefore, a partial order exists within the partition.

What?

This PR:

  • documents the previous and new consumption models
  • extracts the messageHandler struct to an interface, so multiple implementations can be provided while preserving the subscriber's structure
  • creates 2 new implementations of the interface:
    • batchedMessageHandler works by collecting batches of messages up to a certain time or duration. Those messages are then sent to the output channel. NACKed messages are managed by resending every message following the NACKed one in that topic / partition
    • partitionConcurrentMessageHandler works by multiplexing each claim into the output channel, that allows having up to N messages in-flight, where N is the number of topics/partitions in the subscriber. NACKs / ACKs are handled by each multiplexed
  • updates the tests so they run with both models (which makes them slower :/ )

IMPORTANT: I tried to preserve the current default behavior. Therefore, there is a Default consumption model which behaves the same way it does now.

Potential follow-up

A potential follow-up would be a consumer that allows for up to N messages being in-flight. That would allow for cases in which the developers want to maximize parallelism at the expense of complexity on their code.

@nachogiljaldo nachogiljaldo force-pushed the 374-allow_concurrent_consumption_of_events branch from 1cfa4da to dafe0de Compare October 1, 2023 18:25
@nachogiljaldo nachogiljaldo force-pushed the 374-allow_concurrent_consumption_of_events branch from 7466597 to c316ad9 Compare October 13, 2023 21:15
@nachogiljaldo nachogiljaldo marked this pull request as ready for review October 13, 2023 22:04
@nachogiljaldo
Copy link
Author

Hey @m110 sorry for the direct ping, but I wonder if you could, as member of the org, review the PR or ask somebody who could do it.

@nachogiljaldo nachogiljaldo force-pushed the 374-allow_concurrent_consumption_of_events branch 2 times, most recently from 644709c to f528a17 Compare March 8, 2024 14:28
@nachogiljaldo nachogiljaldo force-pushed the 374-allow_concurrent_consumption_of_events branch from f528a17 to 5c1acf0 Compare March 8, 2024 14:48
@nachogiljaldo
Copy link
Author

@aldinugroho24 @yangyangrusli @m110 are you the maintainers of this repo? If yes, would you mind having a look at this PR? I think it would be a nice improvement. Also I wonder if you could use any help maintaining this repo. I see there's a bunch of piled PRs and I'm affected by some of those bugs, so I'd be happy to help if it works.

@m110
Copy link
Member

m110 commented Aug 23, 2024

I'm sorry about the long wait @nachogiljaldo, I'm on it now. Thanks for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[watermill-kafka] - cannot process events concurrently without ACKing the event
2 participants