Hang after ping timeout w/asyncio #2449

Closed
dgw opened this issue May 2, 2023 · 3 comments · Fixed by #2431
Labels: Bug, Core/Networking

@dgw
Member

dgw commented May 2, 2023

Description

It seems my VPS will occasionally lose its network connection for long enough (or at just the right time) for Sopel to reach ping timeout. However, when this happens, Sopel doesn't try to reconnect.

Reproduction steps

  1. Trigger ping timeout
  2. Bot doesn't try to reconnect

Expected behavior

In 7.x, upon an unexpected disconnection such as this, Sopel would print "reconnecting in 20 seconds" and then do so.
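For illustration only, the expected behavior can be sketched as a retry loop around the connection. This is not Sopel's code: `PingTimeout`, `run_connection`, and `run_with_reconnect` are hypothetical names, and the stand-in connection simply fails twice before succeeding.

```python
import asyncio


class PingTimeout(Exception):
    """Hypothetical error raised when the connection times out."""


async def run_connection(attempt: int) -> None:
    """Stand-in for one connection's lifetime; fails on the first two attempts."""
    await asyncio.sleep(0)
    if attempt < 2:
        raise PingTimeout(f"attempt {attempt} timed out")


async def run_with_reconnect(delay: float = 20.0, max_attempts: int = 5) -> int:
    """Retry the connection after each unexpected disconnect.

    Returns the index of the attempt that ended cleanly.
    """
    for attempt in range(max_attempts):
        try:
            await run_connection(attempt)
            return attempt
        except PingTimeout as exc:
            # Wait before trying again instead of giving up silently.
            print(f"{exc}; reconnecting in {delay:g} seconds")
            await asyncio.sleep(delay)
    raise RuntimeError("gave up reconnecting")


if __name__ == "__main__":
    # A short delay keeps the demo fast; the issue mentions 20 seconds.
    print(asyncio.run(run_with_reconnect(delay=0.01)))
```

The key point is that the disconnect is caught at the level that owns the retry decision, so a timeout leads to another attempt rather than an idle process.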

Relevant logs

[2023-05-02 22:50:54,287] sopel.irc.backends   WARNING  - Reached timeout (120.0s); closing connection.

# bot sat here for several minutes until I attached to its tmux session and pressed Ctrl-C

^CException ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1388, in _shutdown
    lock.acquire()
KeyboardInterrupt:

Notes

While Sopel's log says it "Reached timeout", the QUIT messages for both instances affected today (within 3 seconds of each other) said the connection was reset by peer. Maybe that affects how to reproduce the issue, or maybe it's just IRCd weirdness.

I have turned both bots' logging_level settings to DEBUG in hopes of capturing more details if this happens again.

The commit reported as my version is the currently approved HEAD of #2430, which I checked out at @Exirel's suggestion after observing this problem once before to see if it would help. Apparently it didn't.

Sopel version

c4a34f7

Installation method

pip install

Python version

3.8.10

Operating system

Ubuntu 20.04.6

IRCd

Rizon, plexus-4(hybrid-8.1.20)(20220405_0-612)

Relevant plugins

No response

dgw added the Bug and Needs Triage labels May 2, 2023
dgw added this to the 8.0.0 milestone May 2, 2023
@dgw
Member Author

dgw commented May 2, 2023

Possibly fixed by #2431. I have upgraded the Sopel instances in question to ad21253 (current HEAD of that PR) for testing, in addition to turning on DEBUG logs.

@Exirel
Contributor

Exirel commented May 6, 2023

That's exactly the sort of traceback I got on my end when working on #2431: usually it comes from an error raised in a task that wasn't caught properly. So I hope #2431 fixes it!
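As a generic illustration of that failure mode (not the actual Sopel code or the change in #2431), here is a minimal sketch of an asyncio task whose exception nobody awaits, plus a done callback that surfaces it. `read_loop` and `on_done` are hypothetical names.

```python
import asyncio


async def read_loop() -> None:
    """Hypothetical stand-in for a connection task that fails unexpectedly."""
    await asyncio.sleep(0)
    raise ConnectionResetError("connection reset by peer")


async def main() -> list:
    errors = []

    def on_done(task: asyncio.Task) -> None:
        # Task.exception() raises CancelledError for cancelled tasks,
        # so check for cancellation first.
        if not task.cancelled() and task.exception() is not None:
            errors.append(task.exception())

    task = asyncio.create_task(read_loop())
    task.add_done_callback(on_done)

    # Nothing awaits the task directly. Without the callback, its failure
    # would only show up as a "Task exception was never retrieved" log
    # message once the task object is garbage collected.
    await asyncio.sleep(0.01)
    return errors


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With the callback in place, the code that owns the connection learns about the failure and can decide to reconnect or shut down cleanly.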

dgw removed the Needs Triage label May 21, 2023
dgw linked a pull request May 21, 2023 that will close this issue
@dgw
Member Author

dgw commented May 21, 2023

Indeed appears fixed by #2431, according to results of the most recent networking glitch on my server. Sopel reconnected just fine on its own, on both bots running the HEAD of #2431.

dgw closed this as completed in #2431 May 23, 2023