SyncProducer SendMessage blocks forever when it encounters a network timeout #2129
slaunay added a commit to slaunay/sarama that referenced this issue on Feb 8, 2022:
- add unit test to reproduce the deadlock by simulating a network error
- document possible deadlock when closing the Broker from an AsyncProduce callback when handling a response error
- add closeBroker goroutine and channel to asynchronously close a Broker
- reuse the stopchan channel to signal that the closeBroker goroutine is done
- update TestBrokerProducerShutdown to check goroutine leak by closing the input vs the stopchan channel
- fixes IBM#2129
slaunay added a commit to slaunay/sarama that referenced this issue on Feb 8, 2022:
- add unit test to reproduce the deadlock by simulating a network error
- document possible deadlock when closing the Broker from an AsyncProduce callback when handling a response error
- add closeBroker goroutine and channel to asynchronously close a Broker once
- reuse the stopchan channel to signal that the closeBroker goroutine is done
- update TestBrokerProducerShutdown to check goroutine leak by closing the input vs the stopchan channel
- fixes IBM#2129
docmerlin added three commits to influxdata/kapacitor that referenced this issue on Mar 10, 2022.
Versions
Configuration
I'm running a 3-broker cluster locally with
KAFKA_CREATE_TOPICS: "mytopic:3:3"
using https://github.com/wurstmeister/kafka-docker/. I run the above application and after a while I run
docker pause $(docker ps -f name=kafka -q)
wait for all 3 network timeouts to happen, and then run
docker unpause $(docker ps -f name=kafka -q)
The application does not recover, and I can see the number of goroutines rising in the metrics. I have waited up to an hour.
The same scenario works fine with the previous version of Sarama (1.30.1): you can see it recover shortly after unpausing the containers. I can confirm that 1.31.1 has not solved this issue.
Logs 1.31.0 and 1.31.1
Logs 1.30.1
Problem Description
I think this is the same issue as the one described in two comments on #2121 (#2121 (comment) and #2121 (comment)). Since those comments do not seem to describe exactly the same problem as the original report in #2121, I decided to open a new issue.