failed to get zk lock on Burrow 1.0 #322
Comments
No. I've had no problems with the ZK lock in the master branch (or the release). |
I had the same error in https://hub.docker.com/r/solsson/burrow/ built directly from source. Lots of them actually, something like 10/second for 2.5 hours. I can't see any issues or maintenance with the Kafka cluster around when it started. Could dig more though. My setup is in Kubernetes, Yolean/kubernetes-kafka#125. The image was built prior to e47ec4c - could the issue there be related? |
Pod restart solved the issue. Burrow seems to be working normally again. |
Any update on this? Since it's a "tried to acquire lock twice" message, I assume it's a code bug in your fork. Haven't seen this at all in master. If there's no update, I'll close this in a few days. |
Hi @toddpalino, you may go ahead and close it. I shipped our fork to production yesterday. I'll monitor it, and if the problem comes back I'll let you know. 😄 @solsson reported a similar issue though, so I'm unsure whether it is a code bug in our fork, but you never know. All I did was adding two new notifiers and enabling |
That shouldn't cause it, but if the ZK logic doesn't match up between the fork and master it could cause a problem. |
Just observed the same issue in the first hours of running Burrow in a Docker container built from https://github.com/linkedin/Burrow/archive/v1.0.0.tar.gz. A container restart resolved it. |
I can reliably reproduce on latest master by disrupting the network for 10-20 seconds. |
Having a similar issue, but with a warning instead of an error message. Burrow is running and listing consumers and lag through the HTTP endpoints, but no notifications were sent. {"level":"warn","ts":1533153351.2114217,"msg":"failed to get zk lock","type":"coordinator","name":"notifier","error":"strconv.Atoi: parsing \"test\": invalid syntax"} |
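The `strconv.Atoi` text in that warning is what Go prints when a non-numeric string is converted to an integer. One possible explanation, offered only as an assumption, is that the lock code reads a sequence number from the suffix of each child znode name under the lock path, and a stray child named `test` exists there. The sketch below reproduces the message with a hypothetical helper `parseSeq` and made-up znode names; it is not Burrow's or the ZooKeeper client's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSeq is a hypothetical helper: it takes the text after the last "-"
// in a znode name and converts it to an integer sequence number.
func parseSeq(znode string) (int, error) {
	parts := strings.Split(znode, "-")
	return strconv.Atoi(parts[len(parts)-1])
}

func main() {
	// The first name imitates a sequential lock znode; the second imitates
	// an unrelated child that happens to sit under the same parent path.
	for _, name := range []string{"_c_0123abcd-lock-0000000007", "test"} {
		seq, err := parseSeq(name)
		fmt.Println(name, seq, err)
	}
}
```

If this guess is right, listing the children of the notifier lock path in ZooKeeper and looking for names without a numeric suffix would be a quick check.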
Same issue as mentioned by @cluyihunter. Restarting the Burrow instance started the notifier again. {"level":"warn","ts":1533736850.2858596,"msg":"failed to get zk lock","type":"coordinator","name":"notifier","error":"zk: trying to acquire a lock twice"} |
We regularly have the same issue as @cluyihunter, with the same warning text as given by @vivekyaji. We are running v1.1.0 in a container, with Kafka and ZooKeeper also in containers. Restarting the Burrow container solves the problem. This makes sense because, if I understand correctly, the lock uses an ephemeral znode, so stopping the container will remove the lock znode. @toddpalino how is the zk lock released? It looks like the |
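To illustrate why a restart would clear a "trying to acquire a lock twice" warning, here is a minimal in-memory sketch that does not talk to ZooKeeper at all. It assumes the client-side lock object remembers the path of the znode it created and refuses a second `Lock()` call until `Unlock()` clears that state. The type `fakeLock`, the error variables, and the znode path are all hypothetical stand-ins, not the real client's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors mirroring the messages seen in the logs.
var (
	errDeadlock  = errors.New("zk: trying to acquire a lock twice")
	errNotLocked = errors.New("zk: not locked")
)

// fakeLock is an in-memory stand-in for a client-side lock handle.
// lockPath is non-empty while the handle believes it holds the lock.
type fakeLock struct {
	lockPath string
}

// Lock fails if the handle still remembers a previous acquisition.
func (l *fakeLock) Lock() error {
	if l.lockPath != "" {
		return errDeadlock
	}
	l.lockPath = "/burrow/notifier/lock-0000000001" // made-up path
	return nil
}

// Unlock clears the remembered path so Lock can succeed again.
func (l *fakeLock) Unlock() error {
	if l.lockPath == "" {
		return errNotLocked
	}
	l.lockPath = ""
	return nil
}

func main() {
	l := &fakeLock{}
	fmt.Println("first Lock:", l.Lock())

	// Imagine the session expires here: the server drops the ephemeral
	// znode, but the handle's local state is never reset.
	fmt.Println("retry without Unlock:", l.Lock())

	// Clearing the local state (or restarting the process) allows a retry.
	fmt.Println("Unlock:", l.Unlock())
	fmt.Println("Lock after Unlock:", l.Lock())
}
```

Under this assumption, a network disruption that drops the session without the handle being unlocked or recreated would leave every later attempt failing with the same error until the process restarts, which matches the reports in this thread.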
The same issue as @cluyihunter: the HTTP endpoint server is working, but no notifications are sent out. |
Same issue with burrow 1.2.2. |
We had a fork from the old version with some custom notifiers and I'm just merging Burrow 1.0 into it. I left it running overnight locally, and when I got back today there were heaps of "failed to get zk lock" messages. See below.
Burrow, Kafka and ZooKeeper were all running in Docker containers.
Is this a known issue?
Thanks in advance,
Gus