Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: "Address already in use" when Sopel reconnects with webhooks enabled #70

Closed
dgw opened this issue Jul 27, 2020 · 2 comments · Fixed by #72
Closed

Error: "Address already in use" when Sopel reconnects with webhooks enabled #70

dgw opened this issue Jul 27, 2020 · 2 comments · Fixed by #72
Labels
Milestone

Comments

@dgw
Copy link
Member

dgw commented Jul 27, 2020

We've been seeing OSError: [Errno 98] Address already in use on the "official" Sopel instance when something interrupts its ZNC connection. (Later, failure of the webhook server to bind at setup() during reconnection leads to AttributeError: 'NoneType' object has no attribute 'shutdown' the next time Sopel exits or reconnects, since this plugin's shutdown() routine doesn't correctly check if the server object is valid before trying to shut it down.)

Sopel reconnecting is very similar to what happens during stop-start or restart: It shutdown()s all plugins, waits a bit, then setup()s all the plugins again. The difference is, it's the same process running those setup()s again instead of a new one. But interestingly, it's not possible to trigger the same behavior by starting sopel, doing Ctrl-C, and immediately starting sopel again.

It seems like the listener socket (from Bottle) is getting stuck in TIME_WAIT for a few minutes because the local end initiated the closure (this is just part of how TCP works). When the whole process terminates, the leftover socket is probably freed by the OS—but not when the process stays running, i.e. during a reconnection.

I'm not quite done researching the flag SO_REUSEADDR and its possible (side-)effects or other considerations, but it might help. Probably would require a custom Bottle backend to set up the option.

@dgw dgw added the bug label Jul 27, 2020
@dgw dgw added this to the 0.4.0 milestone Jul 27, 2020
@HumorBaby
Copy link
Contributor

Oops... didn't see this issue before opening #72 😬

You bring up an interesting point I didn't really consider:

since this plugin's shutdown() routine doesn't correctly check if the server object is valid before trying to shut it down

While #72 fixes the issue of actually un-binding the socket during the plugin's shutdown(), it does't really address the point above… do you think it's worth including along side the fix in #72? I just wanted to point it out to bring it to the front of your/our attention as your review #72 (or decide to close in favor of other options, e.g., SO_REUSEADDR).

@dgw
Copy link
Member Author

dgw commented Aug 8, 2020

I'd call them "separate but related". Fixing things so that the server never fails to bind theoretically makes it unnecessary to check if the object is valid before trying to shut it down… but "never" is a strong word, and no matter how many checks there are, someone is going to break it somehow in the wild. It's just a matter of time.

So yeah, feel free to toss in an if server is not None (that's pseudocode!) check somewhere in the same PR, just in case it still fails to bind at some point. (Maybe later we can consider an automatic retry feature. Like at most n retries spaced m minutes apart, triggered if the server fails to bind at setup() time.)

server_close() is a solution I never discovered, because the chain of superclasses between Bottle and socketserver.BaseServer (where it's defined, AFAICT) is long and, uh, Twisted. (Sorry… but not really. 😁) It might work just fine on its own for the problem we discovered on the "official" Sopel instance, and you should feel free to pip install -e a modified version of this plugin to test it out since it runs on your infra. The GitHub notifications there are just "nice to have", since everyone with an actual interest in following development activity has GitHub email notifications turned on anyway.

In fact, I'd be perfectly happy if server_close() solves the problem on its own. There are some considerations for using SO_REUSEADDR that I'm not sure I've explored enough to fully understand the possible effects on this plugin's operation. And if I don't have to think about those at all, it leaves more time for working on the main bot code. 😉

@dgw dgw linked a pull request Oct 19, 2020 that will close this issue
@dgw dgw closed this as completed in #72 Oct 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants