Standalone broker on MacOS often fail to start after workstation sleep or reboot #15401
Comments
The issue had no activity for 30 days, mark with Stale label. |
Anecdotally, things seem better after 2.10. The broker still regularly crashes on workstation sleep, but when restarted it at least seems to come up a lot more reliably. Was any work done in 2.10 that may have improved things? |
Hmmm, nope; I'm now able to reliably induce this issue on 2.10. Nothing's bound to Pulsar's ports, but it fails to come up. The logs from a recent start attempt are attached. |
The issue had no activity for 30 days, mark with Stale label. |
I didn't realize you could start pulsar that way, thanks for sharing! Your issue helped me solve a question I have been wondering about for a while (thank you!). I think I found the likely root cause of your issue, but you will need to confirm. If I am correct, here is the TLDR: Pulsar discovers its IP address for things like Based on your logs, I noticed that some of your stack traces indicate trying to connect to
Port 4181 is a bookkeeper port, and for some reason the client is trying to connect to a non-local host, but we already know that the bookkeeper should be running on localhost. I see the same behavior on my host machine:
Interestingly, in my case, Pulsar fails when I am off my work VPN, but it works when I am on the VPN. I've seen the same behavior when running Pulsar Standalone, which makes sense since the brew installation is Pulsar Standalone. The nuance comes here: pulsar/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfigurationUtils.java Lines 43 to 51 in 76646cf
When advertisedAddress is not configured, Pulsar performs this lookup, and that lookup relies on a reverse lookup of your hostname, as described here https://stackoverflow.com/a/18488982. After updating my /etc/hosts file to map my hostname to 127.0.0.1, standalone now works, and I've confirmed that 2.10.1 works after putting my Mac to sleep and waking it up again. (I only just realized tonight that Standalone was not previously working because my hostname was not correctly resolving to 127.0.0.1.)
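To check what your machine reports before touching /etc/hosts, a small diagnostic along these lines may help. It is only a sketch: it assumes the lookup at the lines referenced above is based on java.net.InetAddress.getLocalHost(), as the linked Stack Overflow answer describes, and the class and method names (LocalAddressCheck, resolvesToLoopback) are made up for illustration and are not part of Pulsar.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic, not Pulsar code: prints what the JVM thinks the
// local host is called and whether that name resolves to a loopback address.
public class LocalAddressCheck {

    /** Returns true if the host name or IP literal resolves to a loopback address. */
    public static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            // A name that does not resolve at all is treated as "not loopback".
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            // Assumption: this mirrors the kind of lookup done when no
            // advertisedAddress is configured.
            InetAddress local = InetAddress.getLocalHost();
            String canonical = local.getCanonicalHostName();
            System.out.println("local host name:      " + local.getHostName());
            System.out.println("canonical host name:  " + canonical);
            System.out.println("resolved address:     " + local.getHostAddress());
            System.out.println("resolves to loopback: " + resolvesToLoopback(canonical));
        } catch (UnknownHostException e) {
            System.out.println("local host name did not resolve: " + e.getMessage());
        }
    }
}
```

If the resolved address is a non-local one (for example an RFC1918 address left over from a previous network), that would be consistent with the bookkeeper client trying to reach port 4181 on a host other than localhost. Running it on and off the VPN, or before and after a sleep/wake cycle, should show whether the answer changes.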
I also noticed that running Let me know if this helps. |
I've just realized that |
@michaeljmarshall you may still be onto something; just because the IP in my logs is an RFC1918 address doesn't mean it necessarily resolves to my host (or a router). I'll try setting |
That seems to work, at least so far. Which of the |
The brew service command uses standalone mode, which you can see in the formula code: https://github.com/Homebrew/homebrew-core/blob/9441684087afc2c59ef7c6341bc434e3b818a950/Formula/apache-pulsar.rb#L60. When running Pulsar as a standalone broker, all configuration is passed via the |
I agree that this configuration makes sense as a default for the Homebrew formula, though it might be worth mentioning it on the dev@ mailing list. You can set the pulsar/pulsar-broker/src/main/java/org/apache/pulsar/PulsarStandaloneStarter.java Lines 71 to 79 in 5bad9b3
|
@michaeljmarshall sounds good, I'll put that on my to-do list. In the meantime, this bug should probably stay open: the appropriate fix, I think, is to either: |
The issue had no activity for 30 days, mark with Stale label. |
Describe the bug
Often, but not always, if I start Pulsar standalone on my workstation and then close/reopen my laptop lid or restart the computer, subsequent attempts to start the standalone broker fail. The JVM process keeps running and emits some errors in its logs (see attachment), but never reaches a connectible state.
Repeatedly restarting the broker usually causes this condition to go away. Sometimes that doesn't work, and I have to remove all broker data files (i.e. reinstall Pulsar) to make it start.
This only seems to happen when publishes have been recorded to a persistent topic.
This does not happen when using Pulsar Standalone in Docker, or on my Linux laptop. It appears to be MacOS specific.
To Reproduce
brew install apache-pulsar
brew services start apache-pulsar
brew services restart apache-pulsar
and attempt to connect to Pulsar; it will sometimes fail.
Expected behavior
Desktop (please complete the following information):
Additional context
brew services is likely not the problem. It doesn't do anything special, and Pulsar does sometimes work when restarted via brew services.
Attached is a copy of my logs from a broker start attempt that did not become connectible after 5min.
nostart.log
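For anyone trying to reproduce this, the "attempt to connect" step can be approximated with a plain TCP probe after each restart. The sketch below is not part of Pulsar; the class name BrokerPortProbe is hypothetical, and it assumes the broker listens on localhost:6650 (Pulsar's default binary port unless reconfigured). A successful TCP connect only shows that something is listening, not that the broker is fully usable.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not Pulsar code: checks whether a TCP port accepts connections.
public class BrokerPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: localhost and port 6650; override via arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6650;
        boolean up = canConnect(host, port, 2000);
        System.out.println(host + ":" + port
                + (up ? " is accepting connections" : " is not accepting connections"));
    }
}
```

Running this a minute or so after brew services restart apache-pulsar, and again after a sleep/wake cycle, gives a quick yes/no signal to pair with the broker logs.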