Fix #1083: Deal with brokers that disappear, reappear with different IP address #1085
When KafkaClient connects to a broker in _maybe_connect, it inserts into self._conns a BrokerConnection configured with the current host/port for that node. The BrokerConnection remains there forever, though, so if the broker's IP or host ever changes, KafkaClient has no way to deal with this.

The fix is to compare the latest metadata with the current node's connection, and if the host/IP has changed, decommission the old connection and allow a new one to be created.

There's also a common race condition on broker startup where the initial metadata request sometimes returns an empty list of brokers, but subsequent requests behave normally. So, we must deal with broker being None here. This change is conservative in that it doesn't remove the connection from self._conns unless the new broker metadata contains an entry for that same node with a new IP/port.
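For readers skimming the thread, the check described above can be sketched roughly as follows. This is a standalone illustration, not the actual diff: `BrokerMetadata`, `FakeConnection`, and `drop_stale_connection` are hypothetical names invented here, and the real logic lives inside KafkaClient and operates on self._conns and BrokerConnection.

```python
from collections import namedtuple

# Hypothetical stand-ins for illustration only; these are NOT
# kafka-python's real classes.
BrokerMetadata = namedtuple("BrokerMetadata", ["node_id", "host", "port"])


class FakeConnection:
    """Minimal connection object that remembers the host/port it was built with."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.closed = False

    def close(self):
        self.closed = True


def drop_stale_connection(conns, node_id, broker):
    """Drop conns[node_id] if the metadata reports a different host/port.

    `broker` is the latest metadata entry for node_id, or None when the
    metadata response did not list that node (e.g. the empty broker list
    sometimes seen during broker startup). Returns True if a stale
    connection was closed and removed.
    """
    conn = conns.get(node_id)
    if conn is None or broker is None:
        # Conservative: without a metadata entry for this node, keep
        # whatever connection we already have.
        return False
    if (conn.host, conn.port) == (broker.host, broker.port):
        return False
    # The address changed: decommission the old connection so that a
    # new one can be created on the next connect attempt.
    conn.close()
    del conns[node_id]
    return True
```

A short usage example under the same assumptions:

```python
conns = {0: FakeConnection("10.0.0.1", 9092)}
drop_stale_connection(conns, 0, None)  # no metadata entry: connection kept
drop_stale_connection(conns, 0, BrokerMetadata(0, "10.0.0.2", 9092))  # address changed: dropped
```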
Tested with a running producer and consumer connected to a single Kafka broker, which I repeatedly killed and restarted, sometimes with a new IP. Let me know if there are any docs that describe how to run the other automated tests. In testing, the producer and consumer always reconnected, but under some circumstances a few messages were lost: depending on when the reconnect happens during the broker's boot procedure, the error message returned can differ, so Sender._can_retry sometimes returns true and sometimes not. That is a different issue.
This fixes #1083
Can you file an issue for this? Don't want to lose track of it.
Looks good! Test failures are unrelated, caused by a pylint bug.
I've fixed the pylint errors in master. Can you rebase to get a clean test run?
bump @originsmike
Pushed to master w/ fixup to only check disconnected nodes. Thanks!
…pkp#1085) Conflicts: kafka/client_async.py