Failures to resolve addresses are logged continuously instead of once, like connection failures #186
Comments
The core philosophy of librdkafka is to hide all the gory Kafka communication details from the application. One broker being down, or all of them, might be a permanent error, but it may also be a temporary failure (incomplete routing, firewall rules, cluster not actually started yet, etc.), so instead of bailing out librdkafka keeps retrying in the background. Having said that, it is sometimes useful for an application to find out if none of the brokers are up.
The repeating resolve errors should be silenced indeed.
Lua bindings sound great! Is this something you will open source?
That should fix it.
I got the error below once I killed one of the Kafka processes (I have 3, with replication-factor 3):

1465420107.985 RDKAFKA-3-ERROR: rdkafka#producer-5109: localhost:9094/bootstrap: Connect to ipv4#127.0.0.1:9094 failed: Connection refused

Original Kafka state: ./kafka-topics.sh --describe --zookeeper localhost:2181 --topic PCRF
After killing one server process: ./kafka-topics.sh --describe --zookeeper localhost:2181 --topic PCRF
@mmanoj do you get the message repeatedly? If so, can you provide multiple consecutive logs?
I got this while testing the failover of the brokers. The topic was created with the Kafka script as follows: Please specify the log required and I will arrange it; I don't understand what you mean by "multiple consecutive logs".
@edenhill

[2016-06-11 15:57:29,998] WARN [ReplicaFetcherThread-0-2], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@7b6ea4bf (kafka.server.ReplicaFetcherThread)

I got this over multiple rounds of testing. Please advise.
Those broker logs indicate connectivity problems between your brokers, so not something related to librdkafka. |
I'm not sure what your initial problem is: you mentioned a single log line, but this issue covers repeated, unsuppressed, identical log lines.
My issue is the same as the one described in SOHU-Co/kafka-node#175. I use librdkafka to connect to broker addresses; I manually shut down brokers in a round-robin manner and watch the messages being published to Kafka. While testing this scenario I saw the publisher unable to proceed; it got stuck instead of failing over to the other brokers.
I observed this while killing the leader broker, and my message-publishing process, which uses librdkafka, hangs. Any idea or possible solution for this scenario? I'm using rdkafka_example.c with small modifications.
Before leader failure:
Topic:PCRF PartitionCount:1 ReplicationFactor:3 Configs:
After leader failure:
Topic:PCRF PartitionCount:1 ReplicationFactor:3 Configs:
The text below describes the original question: librdkafka continuously tries to reconnect to brokers it cannot connect to. The question included a note that address resolve errors are logged continuously.
The continuous reconnect is by design and is no longer part of the 'issue', but I left it here for reference.
It appears to me that when librdkafka fails to connect to (or resolve) a broker, it keeps retrying indefinitely. Am I overlooking a configuration parameter, or is this expected behavior?
I'm in a scenario where my users can send Kafka messages to a broker (via a Lua wrapper around librdkafka). The user can configure the broker's address. In case the broker is down, or the user entered an invalid broker address, I'd rather not have constant retries in the background.
Internally I solve the problem now by removing inactive Producers after a certain timeout.
This works alright for connection failures (i.e. "Connection refused"), but it produces a lot of spam when there is a failure in resolving the broker's address. In that case the library prints the error message to the terminal continuously.
To reproduce:
This 'Connection refused' is only printed once because of the duplicate-error-check in rd_kafka_broker_connect. The resolve errors are printed continuously.
So my questions are:
1.1. If not, would you be interested in a pull request for this if I find the time to do it?
I must admit that I'm in no way an expert when it comes to Kafka, so there may very well be a good reason for which you must continuously try to reconnect.
Thanks for your time and efforts.