Kafka consumers through shotover frequently seeing NOT_COORDINATOR warnings #1687

Closed
justinweng-instaclustr opened this issue Jul 11, 2024 · 0 comments · Fixed by #1693

When consuming from multi-partition topics through Shotover, consumers frequently log warnings like these:

[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
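
For anyone reading these logs: the large node id (2147483645) is not a real broker id. The Java client opens a separate connection to the group coordinator and labels it with `Integer.MAX_VALUE - brokerId`. A minimal sketch of the reverse mapping follows; the function name is only illustrative and is not part of any Kafka API.

```python
# Java's Integer.MAX_VALUE, which the Java consumer uses as the base
# for the node id of its dedicated coordinator connection.
INT32_MAX = 2**31 - 1


def broker_id_from_coordinator_node_id(node_id: int) -> int:
    """Recover the broker id behind a coordinator node id seen in client logs.

    Hypothetical helper for log reading only.
    """
    return INT32_MAX - node_id


# The id from the log lines above maps back to broker 2.
print(broker_id_from_coordinator_node_id(2147483645))  # prints 2
```

So the coordinator the client keeps rediscovering in these logs is broker 2, which matches the `rack2` hostname.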

The consumers do eventually consume the messages, but the errors appear to make the clients disconnect from and rediscover the group coordinator frequently. The following log sample shows messages being consumed in between bouts of disconnects:

topic = test, partition = 4, offset = 5, key = my-key, value = my-value
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
topic = test, partition = 4, offset = 6, key = my-key, value = my-value
topic = test, partition = 4, offset = 7, key = my-key, value = my-value
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
topic = test, partition = 4, offset = 8, key = my-key, value = my-value
topic = test, partition = 4, offset = 9, key = my-key, value = my-value
topic = test, partition = 4, offset = 10, key = my-key, value = my-value
topic = test, partition = 4, offset = 11, key = my-key, value = my-value
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-my-group-1, groupId=my-group] Requesting disconnect from last known coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO org.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-my-group-1, groupId=my-group] Client requested disconnect from node 2147483645
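
To compare runs (for example with and without Shotover in the path), it may help to tally how often each event occurs in a consumer log. The sketch below is one rough way to do that; the category names are made up for illustration, the regular expressions are based only on the message text shown above, and `SAMPLE` is an abridged excerpt with the logger and consumer prefixes shortened.

```python
import re
from collections import Counter

# Patterns derived from the log messages in this issue. Other client
# versions may word these messages differently.
PATTERNS = {
    "record_consumed": re.compile(r"^topic = \S+, partition = \d+, offset = \d+"),
    "offset_commit_failed": re.compile(r"Offset commit failed on partition"),
    "not_coordinator": re.compile(r"error response NOT_COORDINATOR"),
    "coordinator_unavailable": re.compile(r"due to cause: coordinator unavailable"),
    "coordinator_discovered": re.compile(r"Discovered group coordinator"),
    "client_disconnect": re.compile(r"Client requested disconnect from node"),
}


def tally(lines):
    """Count how many lines match each pattern in PATTERNS."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Abridged excerpt of the log sample above (prefixes shortened).
SAMPLE = """\
topic = test, partition = 4, offset = 5, key = my-key, value = my-value
[main] WARN ConsumerCoordinator - Offset commit failed on partition test-2 at offset 0: This is not the correct coordinator.
[main] INFO ConsumerCoordinator - Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: error response NOT_COORDINATOR. isDisconnected: false. Rediscovery will be attempted.
[main] INFO NetworkClient - Client requested disconnect from node 2147483645
[main] INFO ConsumerCoordinator - Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
[main] INFO ConsumerCoordinator - Group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[main] INFO ConsumerCoordinator - Discovered group coordinator rack2.paris-kafka-psc-test.instaclustr.com:9091 (id: 2147483645 rack: null)
topic = test, partition = 4, offset = 6, key = my-key, value = my-value
"""

for name, count in sorted(tally(SAMPLE.splitlines()).items()):
    print(f"{name}: {count}")
```

Running the same tally over a full consumer log gives a quick ratio of coordinator rediscoveries to records consumed.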

This has been observed with the Java and Confluent Kafka C# consumers so far.

While these consumers were running, the shotover_out_of_rack_requests_count metric was steadily increasing (by at least 1 per second), which I suspect could be related.
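
To put a number on that increase, one option is to take two snapshots of the metrics output and compute a per-second rate. The sketch below assumes the metrics are available as Prometheus text exposition lines without trailing timestamps. The snapshot contents, the `chain` label, and the 60-second interval are invented for illustration and are not taken from a real Shotover instance.

```python
METRIC = "shotover_out_of_rack_requests_count"


def counter_total(exposition_text: str, metric_name: str) -> float:
    """Sum every series of one counter in Prometheus text exposition format.

    Assumes each sample line is `name{labels} value` or `name value`,
    with no trailing timestamp.
    """
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == metric_name:
            total += float(value)
    return total


def per_second_rate(first_total: float, second_total: float, seconds_between: float) -> float:
    """Average increase per second between two counter readings."""
    return (second_total - first_total) / seconds_between


# Hypothetical snapshots taken 60 seconds apart (values and label are made up).
SNAPSHOT_A = """\
# TYPE shotover_out_of_rack_requests_count counter
shotover_out_of_rack_requests_count{chain="kafka"} 100
"""
SNAPSHOT_B = """\
# TYPE shotover_out_of_rack_requests_count counter
shotover_out_of_rack_requests_count{chain="kafka"} 160
"""

rate = per_second_rate(
    counter_total(SNAPSHOT_A, METRIC),
    counter_total(SNAPSHOT_B, METRIC),
    seconds_between=60,
)
print(rate)  # prints 1.0
```

Comparing this rate against the rate of NOT_COORDINATOR errors in the consumer log would show whether the two move together.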
