One message from Consumer? #1056

Closed
darkfrog26 opened this issue Aug 19, 2022 · 6 comments

Comments

@darkfrog26

I'd like to create a scenario where I'm gathering work tasks from Kafka, so each message may take a long time to run. To that end, I want to only consume one message at a time with the consumer so that another worker that connects would receive the next event in the queue instead of multiple events being buffered to my consumer. I've tried withMaxPollRecords(1), but that didn't seem to make any difference. Is there a way to accomplish this?

@MrKustra94

From what I understand, you want to achieve queue-like behaviour with Kafka, while Kafka is much more topic-oriented.
withMaxPollRecords(1) only limits how many records a single consumer receives per poll; it does not change how partitions are shared between consumers. If another consumer is added (assuming both are in the same consumer group), then:

  • if the topic has only 1 partition, the new consumer will be idle
  • otherwise the consumers will split the partitions evenly, with no overlap.

If possible, it would be good for your case if the topic had more than 1 partition. Then, e.g., 2 consumers would split consumption in a queue-like way. This only holds if your tasks can be split across different partitions.
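
To make the two cases above concrete, here is a minimal, library-agnostic sketch of how a consumer group shares partitions. It is a toy round-robin model, not Kafka's actual assignor, and the names `assign_partitions`, `worker-1` and `worker-2` are made up for illustration.

```python
def assign_partitions(partitions, consumers):
    """Toy model: every partition is owned by exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        # Hand partitions out in turn; a consumer that gets none stays idle.
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# One partition, two consumers: the second consumer has nothing to read.
one_partition = assign_partitions([0], ["worker-1", "worker-2"])
print(one_partition)    # {'worker-1': [0], 'worker-2': []}

# Four partitions, two consumers: the partitions are split without overlap.
four_partitions = assign_partitions([0, 1, 2, 3], ["worker-1", "worker-2"])
print(four_partitions)  # {'worker-1': [0, 2], 'worker-2': [1, 3]}
```

In this model the poll size never appears, which is the point: how many records one consumer fetches per poll has no effect on which consumer owns a partition.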

@darkfrog26
Author

darkfrog26 commented Aug 20, 2022

The test scenario I'm trying to reproduce is a scaling one:

  1. I send two messages to the same topic
  2. I create a consumer and want to process just the first message (processing of the event takes time)
  3. I create a second consumer on the same group as the first with the hope of processing the second message

What actually happens is that the first consumer receives the first message as expected, takes time to process, and while that is happening the second consumer connects but receives no messages. When the first consumer finally finishes processing the first message and commits it, it immediately receives the second message. This tells me that the first consumer is receiving both messages before the second consumer even starts.

@MrKustra94

If the topic has only 1 partition, then this is fully expected behaviour.
Within a consumer group, a partition can be consumed by only one consumer at a time.

@darkfrog26
Author

Hmm, so is my scenario not possible with Kafka? The idea is that I want to be able to scale up additional workers on-demand (when the queue starts to fill up), but if I only have one worker that has the next N events it doesn't matter if I add additional workers since they are all already filtered to the first worker.

I appreciate you taking the time to help out with this, as it appears to have little to do with the library and more to do with how Kafka works.

@MrKustra94

MrKustra94 commented Aug 22, 2022

You can't achieve queue-like behavior in Kafka with a single-partition topic; that's not how Kafka was designed.
If you want to scale your consumer workers to N, you need at least N partitions for the given topic. If your tasks can somehow be split into parallelizable groups, then partitions would be a great fit here. If your topic has only one partition and you can't change its partition count, then you can try an idea inspired by Staged Event-Driven Architecture: add an additional worker that consumes the load from the given topic and re-distributes it to a new topic of your own with a suitable number of partitions.
If you can't group tasks so as to increase parallelism, then Kafka can't help you, since a single partition can be consumed by only one consumer in a consumer group.
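
As a rough illustration of the re-distribution idea above, the sketch below models the extra worker as a pure function that reads tasks from a single-partition source and spreads them over N partitions of a new topic. It does not use a Kafka client; `redistribute`, `source_topic` and `work_topic` are hypothetical names, and round-robin placement is an assumption (hashing a task key would be the alternative if related tasks must stay in order).

```python
from collections import defaultdict


def redistribute(tasks, partition_count):
    """Toy model of the re-distributing worker: fan tasks out over partitions."""
    fan_out = defaultdict(list)
    for offset, task in enumerate(tasks):
        # Round-robin placement; ordering is only preserved within a partition.
        fan_out[offset % partition_count].append(task)
    return dict(fan_out)


# Tasks as they sit in the original single-partition topic.
source_topic = ["task-a", "task-b", "task-c", "task-d", "task-e"]

# After re-distribution, up to 3 workers in one group can each own a partition.
work_topic = redistribute(source_topic, partition_count=3)
print(work_topic)
# {0: ['task-a', 'task-d'], 1: ['task-b', 'task-e'], 2: ['task-c']}
```

With three partitions, a second or third worker joining the group would get a partition of its own instead of sitting idle, at the cost of losing the global ordering of the original topic.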

@aartigao
Contributor

Closing as it's a perfectly explained pattern regarding Kafka arch and not the library.

@aartigao closed this as not planned on Feb 27, 2024