Hang after ping timeout w/asyncio #2449

dgw · 2023-05-02T22:14:47Z

Description

It seems my VPS will occasionally lose its network connection for long enough (or at just the right time) for Sopel to reach ping timeout. However, when this happens, Sopel doesn't try to reconnect.

Reproduction steps

1. Trigger ping timeout
2. Bot doesn't try to reconnect

Expected behavior

In 7.x, upon an unexpected disconnection such as this, Sopel would print "reconnecting in 20 seconds" and then do so.

Relevant logs

[2023-05-02 22:50:54,287] sopel.irc.backends   WARNING  - Reached timeout (120.0s); closing connection.

# bot sat here for several minutes until I attached to its tmux session and pressed Ctrl-C

^CException ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1388, in _shutdown
    lock.acquire()
KeyboardInterrupt:

Notes

While Sopel prints to its log that it "Reached timeout", the QUIT messages for both instances this happened to today (within 3 seconds of each other) said the connection was reset by peer.
Maybe that affects how to reproduce the issue, or maybe it's just IRCd weirdness.

I have turned both bots' logging_level settings to DEBUG in hopes of capturing more details if this happens again.

The commit reported as my version is the currently approved HEAD of #2430, which I checked out at @Exirel's suggestion after observing this problem once before to see if it would help. Apparently it didn't.

Sopel version

c4a34f7

Installation method

pip install

Python version

3.8.10

Operating system

Ubuntu 20.04.6

IRCd

Rizon, plexus-4(hybrid-8.1.20)(20220405_0-612)

Relevant plugins

No response


dgw · 2023-05-02T22:32:41Z

Possibly fixed by #2431. I have upgraded the Sopel instances in question to ad21253 (current HEAD of that PR) for testing, in addition to turning on DEBUG logs.

Exirel · 2023-05-06T14:11:30Z

That's exactly the sort of traceback I got on my end when working on #2431: usually it comes from an error raised in a task that wasn't caught properly. So I hope #2431 fixes it!
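For illustration, the failure mode described above (an error raised in a task that nothing catches) can be sketched with plain `asyncio`. This is not Sopel's code and not the change made in #2431: `read_forever`, `run_unsupervised`, and `run_supervised` are hypothetical names invented for this example, and the `ConnectionResetError` only stands in for whatever error the real connection raised.

```python
import asyncio


async def read_forever():
    """Hypothetical stand-in for a connection's read loop (not Sopel's API)."""
    await asyncio.sleep(0.01)
    raise ConnectionResetError("connection reset by peer")


async def run_unsupervised():
    """Start the task but never await it: the error is stored, not raised."""
    task = asyncio.create_task(read_forever())
    await asyncio.sleep(0.05)  # the caller carries on, unaware of the failure
    # The task has finished and is holding the exception internally.
    return task.done(), type(task.exception()).__name__


async def run_supervised():
    """Await the task so the error reaches code that can react to it."""
    task = asyncio.create_task(read_forever())
    try:
        await task
    except ConnectionResetError as err:
        # A real bot would schedule its reconnection here (assumption).
        return f"would reconnect after: {err}"
    return "clean exit"


unsupervised = asyncio.run(run_unsupervised())
supervised = asyncio.run(run_supervised())
print(unsupervised)
print(supervised)
```

In the unsupervised case the task dies quietly and the rest of the program keeps waiting for work that will never happen, which is consistent with a process that only exits on Ctrl-C. Awaiting the task (or attaching a done callback that inspects `task.exception()`) is one way to surface the failure so that reconnection logic can run.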

dgw · 2023-05-21T10:13:39Z

Indeed appears fixed by #2431, according to results of the most recent networking glitch on my server. Sopel reconnected just fine on its own, on both bots running the HEAD of #2431.

dgw added the Bug and Needs Triage labels May 2, 2023

dgw added this to the 8.0.0 milestone May 2, 2023

dgw added the Core/Networking label May 2, 2023

dgw mentioned this issue May 2, 2023 in "irc: properly manage exception of the run-forever loop" #2431 (merged)

dgw removed the Needs Triage label May 21, 2023

dgw linked a pull request May 21, 2023 that will close this issue: "irc: properly manage exception of the run-forever loop" #2431 (merged)

dgw closed this as completed in #2431 May 23, 2023
