Fix assign followed by fast revoke during rebalance #1294
Conversation
Hey there, @svroonland, @guizmaii,
Most maintainers might be on holiday (I am). I recently rewrote this part, so I'd like to look at it when I am back.
Got it, @erikvanoosten, sounds good. Thanks for that and for letting me know.
Took me a while, but I can see now when this would happen (exactly as you describe). LGTM.
I'll leave merging up to Erik as he wanted to have a look as well.
Hi @ytalashko. Thanks again for this PR. I am wondering, is this a theoretical issue, or did you see this happening in practice? (Note, IMHO it should be fixed either way.) Although the proposed change will work, it goes against the (never documented) idea that the rebalance listener should be as simple and short as possible and not take any decisions unless absolutely necessary. Another downside of this approach is that the logging, the diagnostics callbacks and the metrics will be incorrect. I will propose another change shortly.
As discovered by @ytalashko (see #1294), it is possible that a partition is assigned and immediately revoked in the same poll. No stream should be started for these partitions. Unlike #1294, this PR does _not_ hide the assigned+revoked partitions from the diagnostic events and rebalance metrics. In addition, this PR also supports the unlikely event of an assigned+lost partition.
Hey, @erikvanoosten,
It happens in practice, as I mentioned in my previous comment.
Could you please point me to where I can find this guidance? I have never heard of it. I thought the rebalance listener is for triggering custom actions when the set of partitions is assigned or revoked, as stated in its documentation. Also, I implemented it this way because it seems to be the right place from the perspective of responsibility segregation and simplicity. It is the same idea @svroonland mentioned in #1298 (comment).
As noted in this comment above, the lib currently is not transparent to metrics in lines like the one referenced there. Also, even if we want a more comprehensive solution, I would vote to get a fixed version out faster and then "optimize" the fix, since this is a pretty tricky bug that is hard to detect, and some lib users may be impacted without even knowing about it.
Wdyt? Also, the proposed fix in #1298 introduces another bug, because it ignores the sequence of rebalance event calls, e.g. whether a revoke or an assign comes first for the same partitions. Regarding the assign+lost sequence, I thought it should be impossible in practice, so I hadn't added handling for it, but I'm not sure. Why would a partition be lost right after it was just assigned? It should be revoked first. In any case, if you think it is better to handle this scenario as well, we can do so.
Thanks, good to know! Kafka never stops surprising me.
This is not a general idea and it is also not documented anywhere. This idea is purely based on experience with this library. In the past there was much more logic in the rebalance listener and because of that it became very hard to make certain changes. This was resolved by moving the logic to the main loop, and making the rebalance listener only register what happened and act on it as little as possible.
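As an illustration of that principle, here is a minimal sketch against the plain Java consumer API. The class name and fields are hypothetical, not the actual zio-kafka internals: the listener only records which partitions were assigned, revoked or lost, and leaves every decision to the main poll loop.

```scala
import java.util.{ Collection => JCollection }
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener
import org.apache.kafka.common.TopicPartition

// Hypothetical sketch: the listener only registers what happened.
// The Kafka client invokes these callbacks on the polling thread, so plain
// vars are sufficient here; the poll loop inspects the sets afterwards.
final class RecordingRebalanceListener extends ConsumerRebalanceListener {
  var assigned: Set[TopicPartition] = Set.empty
  var revoked: Set[TopicPartition]  = Set.empty
  var lost: Set[TopicPartition]     = Set.empty

  override def onPartitionsAssigned(partitions: JCollection[TopicPartition]): Unit =
    assigned ++= partitions.asScala

  override def onPartitionsRevoked(partitions: JCollection[TopicPartition]): Unit =
    revoked ++= partitions.asScala

  override def onPartitionsLost(partitions: JCollection[TopicPartition]): Unit =
    lost ++= partitions.asScala
}
```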
My apologies! I am very much used to people who are happy to hand a problem off to the maintainers. I should have been more sensitive and realized that this does not apply to everybody. In addition, IMHO maintainers should be open to helping other people get into the project, and I failed on that aspect.
Yes, that is an excellent observation 🙏! I have closed #1298 because of it. To fix this according to the idea that the rebalance listener should do as little as possible, it should maintain a list of changes instead of merging everything into a few sets. However, that would be a bigger change, with the only immediate benefit being that the metrics can be correct (see next paragraph).
I am also still going back and forth on whether we should let the library user know that a partition was assigned and immediately revoked. When I was writing #1298 I realized that most of the time you really don't care, and it is a burden that you need to take this into account. In short, I have gone full circle and think we should follow the approach of this PR now. In addition, we can keep in mind a future change in which we make the rebalance listener smarter by making it keep a list of changes instead of the current approach.
@ytalashko could you please make a change such that assign+lost is handled the same as assign+revoke? If you want, could you also take a look at adding a unit test in RunloopSpec? (Note, we only started writing unit tests for Runloop recently, so there aren't many unit tests yet. We do have many integration tests.)
Hey, @erikvanoosten,
Got it, thanks for sharing this info.
Yeah, that sounds great
Sure, will do
Sure, 👍
…ition which is being revoked right after assignment within the same RebalanceEvent
Looks great! Just a few small comments.
I think we're good to merge. @ytalashko do you agree?
Yeah, sounds good to me, thanks @erikvanoosten!
A bit late, but I was wondering: although a partition stream is started, does it actually get records?
I checked that. If we receive records for a partition that does not have a running stream, those records are silently ignored. Not great, but in this case helpful.
But this issue is about the inverse situation: there is a running stream but it doesn't get any records. Right?
Yeah, both streams for the same partition received the same records, creating many duplicates.
You need to enable commitSafeRebalance (not the exact name) to prevent duplicates during a rebalance.
Yeah, this is a good point, thanks. The services (consumers) are using rebalance safe commits. The duplicated messages were coming only from specific partitions, from those with the duplicated partition streams, and long after the rebalance events occurred.
I was going to write that this is not possible, because the broker decides what to send to the client. But that is not entirely true, perhaps: because we have a stream, we also resume the partition, and this causes the broker to send records, even though the partition is not assigned. I am just guessing, we'd need to experiment to find out if this is indeed the case.
Maybe you are right about
I was describing your case! :)
Except for bugs, it is not possible that you would get duplicated records in a single consumer. That is not how the Java client works. So I'll put my money on duplication across different consumers.
Arrchhh. Scratch the last few messages. I just looked at the code (https://github.com/zio/zio-kafka/blob/master/zio-kafka/src/main/scala/zio/kafka/consumer/internal/Runloop.scala#L437). Zio-kafka only resumes partitions that are actually assigned. So the bug solved in this PR is very minor; as far as I can see, it cannot have led to duplicate records.
It is not the java-clients lib that created the duplication of messages, it is the zio-kafka lib. Maybe I have not explained it well.
Ouch, that's right. That bit of code is fine under the assumption that no duplicate TopicPartitions are in its input. In this line the duplicate partition would be selected for being started.
In that case I'd say that this is not a minor or rare issue, but quite a proper bug.
Ouch! Yes, you are correct! That is a major bug indeed! I never considered that a partition could be assigned, revoked and assigned again.
Yeah, exactly, and, to note, this line is also fine; it is not its responsibility to judge the input it's given, at least from my point of view.
@ytalashko I have updated the description of the PR to describe the situation according to the latest insights. Can you check it please?
Thanks, just checked, looks good 👍
When an assign and revoke for the same partition follow each other very quickly, in the same poll, right now we disregard the revoke and incorrectly start a stream for the partition. Initially this is not a problem because the stream will simply not get any records. However, when later the partition is assigned again, a second stream is started for that partition, and both streams will receive all the records of the partition, leading to duplicated processing.
With this change, an assigned and immediately revoked partition will be ignored (no stream is started, the partition will not be counted in metrics and it will not be reported in the diagnostics callback). The same change is made for an assign followed by a 'lost'.
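A minimal sketch of that net effect, with hypothetical names (`RebalanceEvent`, `partitionsToStartStreamsFor`) that are not the actual zio-kafka implementation: a partition that is assigned and then revoked or lost within the same poll simply cancels out.

```scala
import org.apache.kafka.common.TopicPartition

// Hypothetical: the sets recorded by the rebalance listener during one poll.
final case class RebalanceEvent(
  assigned: Set[TopicPartition],
  revoked: Set[TopicPartition],
  lost: Set[TopicPartition]
)

// An assign followed by a revoke (or lost) in the same poll cancels out:
// no stream should be started for such a partition.
def partitionsToStartStreamsFor(event: RebalanceEvent): Set[TopicPartition] =
  event.assigned -- event.revoked -- event.lost
```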
Another implementation has been considered: in the rebalance listener maintain a list of assign/revoke/lost events instead of sets with assigned/revoked/lost partitions. However, this is a bigger change and the only immediate benefit is that we can correctly report the number of assigned/revoked/lost partitions in the metrics.
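For illustration, a sketch of that alternative under the same caveat (the names `PartitionChange` and `netAssigned` are hypothetical): keeping an ordered list of changes preserves the assign/revoke order per partition, so metrics could still report every event while the stream-starting logic uses only the net result.

```scala
import org.apache.kafka.common.TopicPartition

// Hypothetical event ADT: instead of merging everything into sets, keep the
// changes in the order in which the rebalance listener observed them.
sealed trait PartitionChange
object PartitionChange {
  final case class Assigned(tp: TopicPartition) extends PartitionChange
  final case class Revoked(tp: TopicPartition)  extends PartitionChange
  final case class Lost(tp: TopicPartition)     extends PartitionChange
}

// Replaying the ordered changes yields the partitions still assigned at the
// end of the poll; the full list remains available for metrics and diagnostics.
def netAssigned(changes: List[PartitionChange]): Set[TopicPartition] =
  changes.foldLeft(Set.empty[TopicPartition]) {
    case (acc, PartitionChange.Assigned(tp)) => acc + tp
    case (acc, PartitionChange.Revoked(tp))  => acc - tp
    case (acc, PartitionChange.Lost(tp))     => acc - tp
  }
```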