Rebalancing partially fails when one of two clients with multiple single active consumers (with the same name) on a super stream crashes #13372
-
RabbitMQ version used: 4.0.5
Erlang version used: 27.2.x
Operating system (distribution) used: Linux 5.15.153.1-microsoft-standard-WSL2
How is RabbitMQ deployed? Community Docker image
Steps to deploy RabbitMQ cluster: docker-compose up glrabbitmq
Steps to reproduce the behavior in question: A super stream 'CmmnEvents' is created with 8 partitions.
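For reference, a super stream like this can be created with the `rabbitmq-streams` CLI (a sketch only; the reporter's applications may instead declare it through the client library, and the container name `glrabbitmq` is assumed from the compose service above):

```shell
# Create the 8-partition super stream inside the broker container.
# 'glrabbitmq' is an assumed container name taken from the compose setup.
docker exec glrabbitmq rabbitmq-streams add_super_stream CmmnEvents --partitions 8
```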
What problem are you trying to solve? When multiple client applications define multiple single active consumers (with the same name) on the same partitioned super stream, and one client crashes, we expect the remaining client to take over consuming all the partitions. Instead, this scenario always ends with 2 of the 8 partitions unconsumed.
-
You deploy 2 instances of the same application, is that correct?
-
I could not reproduce the issue. Can you provide a step-by-step procedure? Here is the procedure I used; we expect something similar from your end to help us diagnose the issue. Start a 3-node cluster. The stream Java client's Docker setup is convenient:
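The original commands were stripped when this thread was archived. A plausible sketch using the Community Docker image (a single node is shown for brevity; the maintainer used a 3-node cluster):

```shell
# Start a broker and expose the stream (5552), AMQP (5672) and management (15672) ports.
# Advertising 'localhost' lets clients on the host connect to the stream plugin.
docker run -d --rm --name rabbitmq -p 5552:5552 -p 5672:5672 -p 15672:15672 \
  -e RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS='-rabbitmq_stream advertised_host localhost' \
  rabbitmq:4.0

# Enable the stream plugins.
docker exec rabbitmq rabbitmq-plugins enable rabbitmq_stream rabbitmq_stream_management
```

Stopping the container later (`docker stop rabbitmq`) tears the setup down.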
In another terminal tab, get Stream PerfTest and run it to simulate a super stream consumer (it creates the super stream as well):
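A sketch of the Stream PerfTest invocation; the flag names come from Stream PerfTest's documented options, but the exact values used in the thread were stripped, and the download URL pattern and group name `my-app` are assumptions:

```shell
# Download Stream PerfTest (URL pattern is an assumption; check the releases page).
wget -O stream-perf-test.jar \
  https://github.com/rabbitmq/rabbitmq-stream-perf-test/releases/latest/download/stream-perf-test-latest.jar

# Produce to and consume from a super stream, with single active consumers
# that all share one group name ('my-app' is a placeholder).
java -jar stream-perf-test.jar --producers 1 --consumers 1 \
  --super-streams --single-active-consumer --consumer-names my-app
```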
In yet another terminal tab, run another consumer:
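Presumably the same invocation without a producer, so its consumers join the same group and start out inactive (again a sketch; `my-app` is a placeholder):

```shell
# Second instance: consumers only, same group name as the first instance.
java -jar stream-perf-test.jar --producers 0 --consumers 1 \
  --super-streams --single-active-consumer --consumer-names my-app
```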
List the consumers of the group for one of the partitions:
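Most likely via the `rabbitmq-streams` CLI; the partition and reference names below are placeholders, since the originals were stripped from the thread:

```shell
# Show the consumer group on one partition stream.
# '<super-stream>-0' and 'my-app' stand in for the actual partition and group names.
rabbitmq-streams list_stream_group_consumers \
  --stream '<super-stream>-0' --reference my-app
```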
List the Java processes:
Kill one of the Stream PerfTest processes:
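A sketch of these two steps, assuming a JDK is on the path (`jps` ships with it); the PID is a placeholder:

```shell
# List running JVMs with their main jar/class to find the Stream PerfTest instances.
jps -l

# Kill one of them abruptly to simulate a client crash ('<pid>' is a placeholder).
kill -9 '<pid>'
```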
List the consumers on the same partition again (same command as above):
There is still a consumer: it was inactive before and has been promoted to active.
I identified the problem; I'm working on a fix.