very quick offerRescinded, brokers not starting #292
Comments
Do you see anything interesting in the Mesos agent logs (slave-6 in this case)? It looks like the rescind comes exactly 10 seconds after the ACCEPT was sent, which seems suspiciously like the offer is timing out. Do you have an offer timeout set to 10 seconds on Mesos by any chance? There should be a log entry directly after that ACCEPT saying something like
Can you also check the master logs and see whether the Launch Tasks message from the framework is getting there? Finally, what version of Mesos are you running?
In fact, the only entry in the logs apart from ACCEPTs and immediate DECLINEs appears when I restart the broker. Neither the clients nor the master show anything related to Kafka in the logs. I'm currently debugging some problems due to the overlay network configuration on the cluster; do you think this might be related?
I have switched to Marathon scheduler. Logs are a bit more verbose now.
That's odd that you weren't getting those verbose logs before. Anyway, I suspect the issue is the executor crashing on launch.
I think I remember someone reporting this when the version of Kafka the framework was built against didn't match the version they were running. Do yours match? The default build is 0.10.2.0 now. Also, it seems like some of your Mesos agents might not be able to hit the scheduler:
I've substituted the Mesos-DNS-provided name for --api and got one step closer:
You mean whether the version in Docker matches the one I run ./kafka-mesos.sh broker start 0 with? Both were built from the same repo, based on https://github.com/mesos/kafka, 20 minutes apart.
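On the question of whether agents can hit the scheduler: a minimal sketch of a TCP reachability check that could be run from an agent host against whatever URL was passed as --api. The URL in the usage section is hypothetical (both the Mesos-DNS name and the port are placeholders), and a successful TCP connect only shows that the port is reachable, not that the scheduler API is healthy.

```python
import socket
from urllib.parse import urlparse


def can_reach(api_url, timeout=3.0):
    """Return True if a TCP connection to the host:port in api_url succeeds."""
    parsed = urlparse(api_url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError("api_url must include a host and an explicit port")
    try:
        # create_connection resolves the name (e.g. a Mesos-DNS name) and connects.
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical value: substitute the exact URL given to the scheduler as --api.
    url = "http://kafka-mesos.marathon.mesos:7000"
    print(url, "reachable" if can_reach(url) else "NOT reachable")
```

Running this on each agent separates name-resolution or overlay-network problems from problems inside the executor itself.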
Yep, pretty sure this is due to a Kafka mismatch. Make sure you're running the Scala 2.11 build of Kafka and that the version matches what kafka-mesos was built against.
gradle.build says scalaVersion = "2.11.8". Now, I'm not sure how to verify that I'm running it with the same Scala.
It only says Java 8 is installed; how do I check what I'm launching it with?
I mean the version of Kafka; that needs to be the 2.11 version. What is the file name of the Kafka .tgz you're using? If you're using the Docker build script I'd be wary; I don't know if it still works (or ever worked...).
It turns out my Gradle build said Mesos version 0.28.0 while I'm running 1.1.0, so I'm testing that.
kafka-mesos-0.10.1.0-SNAPSHOT-kafka_2.11-0.10.1.1.jar
Ahaaa. "build-image.sh" has hardcoded values! I missed that during the build. Let me rebuild and see if it helps.
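The mismatch discussed above can be checked from the artifact names alone. The following is a rough sketch that assumes the jar follows the naming shown in this thread (kafka-mesos-<framework version>-kafka_<Scala version>-<Kafka version>.jar) and that the Kafka tarball follows the usual kafka_<Scala version>-<Kafka version>.tgz naming. The function names are illustrative and are not part of kafka-mesos.

```python
import re

# Assumed jar naming, based on the file name quoted in this thread.
JAR_RE = re.compile(
    r"^kafka-mesos-(?P<framework>.+)-kafka_(?P<scala>\d+\.\d+)-(?P<kafka>\d+(?:\.\d+)+)\.jar$"
)
# Assumed Kafka tarball naming: kafka_<scala>-<kafka>.tgz
TGZ_RE = re.compile(r"^kafka_(?P<scala>\d+\.\d+)-(?P<kafka>\d+(?:\.\d+)+)\.tgz$")


def parse_jar(name):
    """Split a kafka-mesos jar name into framework, Scala, and Kafka versions."""
    m = JAR_RE.match(name)
    if m is None:
        raise ValueError("unrecognized jar name: " + name)
    return m.groupdict()


def parse_tgz(name):
    """Split a Kafka tarball name into Scala and Kafka versions."""
    m = TGZ_RE.match(name)
    if m is None:
        raise ValueError("unrecognized tarball name: " + name)
    return m.groupdict()


def mismatches(jar_name, tgz_name):
    """Return the fields ('scala', 'kafka') on which the jar and tarball disagree."""
    jar, tgz = parse_jar(jar_name), parse_tgz(tgz_name)
    return [field for field in ("scala", "kafka") if jar[field] != tgz[field]]


if __name__ == "__main__":
    jar = "kafka-mesos-0.10.1.0-SNAPSHOT-kafka_2.11-0.10.1.1.jar"
    # The tarball name below is an example, not one taken from this thread.
    print(mismatches(jar, "kafka_2.11-0.10.2.0.tgz"))
```

An empty list means the Scala and Kafka versions in the two names agree. It says nothing about values hardcoded elsewhere, such as in build-image.sh.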
It works. Would you be interested in a PR?
Sorry, didn't see this! Yes please! :D
I've started the scheduler via ./kafka-mesos.sh scheduler:
I can see Mesos having a Kafka framework, but not one of the agents/workers.
Logs are showing ACCEPT -> couple of seconds -> offerRescinded cycle around my cluster.
Any hints on how to debug this?