Motivation
I noticed that occasionally my program would get stuck with one thread spinning at near 100% CPU. After a bunch of investigation with rust-gdb and tokio-console and stripping it down to a much smaller program, I narrowed it down to a websocket task being very active doing nothing.
Here you can see the spawned future stuck running:
It is being woken up ~253k times per second, so that explains the 100% CPU:
After some digging, I found this related code:
ethers-rs/ethers-providers/src/transports/ws.rs
Lines 402 to 408 in 3df1527
Dropping that error is bad because it might be an Io(Kind(UnexpectedEof)). If that's the case, the websocket needs to reconnect.
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Solution
Quick fix is to just return an error when the socket gives an error.
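To illustrate the idea, here is a minimal sketch using only the standard library. It is not the actual ethers-rs code: `SocketItem`, `handle_socket_item`, and `run` are hypothetical names, and the websocket is modeled as a plain iterator of items. The point is only that an error (or the end of the stream) is returned to the caller instead of being dropped, so the loop stops rather than spinning.

```rust
use std::io;

/// Hypothetical stand-in for one poll of the websocket stream.
/// `None` models the stream ending; `Some(Err(_))` models a socket error.
type SocketItem = Option<Result<String, io::Error>>;

/// Handle one item from the socket, returning the error instead of dropping it.
fn handle_socket_item(item: SocketItem, inbox: &mut Vec<String>) -> Result<(), io::Error> {
    match item {
        Some(Ok(msg)) => {
            inbox.push(msg);
            Ok(())
        }
        // Propagate the error so the caller can stop (and reconnect if it wants to).
        Some(Err(e)) => Err(e),
        // Treat a closed stream as an error too, so it is never polled again.
        None => Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "websocket stream ended",
        )),
    }
}

/// Process items until the first error, then stop.
/// Returns the messages received so far and the error that ended the loop, if any.
fn run(items: impl IntoIterator<Item = SocketItem>) -> (Vec<String>, Option<io::Error>) {
    let mut inbox = Vec::new();
    for item in items {
        if let Err(e) = handle_socket_item(item, &mut inbox) {
            return (inbox, Some(e));
        }
    }
    (inbox, None)
}
```

With this shape, feeding `run` a message followed by an `UnexpectedEof` error yields the one message plus the error, and nothing after the error is processed.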
I think a more robust fix might be to use something like https://docs.rs/stubborn-io/latest/stubborn_io/.
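For context, the general idea behind an automatically reconnecting transport can be sketched roughly as follows. This does not use stubborn-io or its API; `connect_with_retry` and its parameters are hypothetical, and the sketch is blocking (it uses `thread::sleep`), so an async version would need a non-blocking timer instead.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: call `connect` up to `max_attempts` times,
/// doubling the delay after each failed attempt.
fn connect_with_retry<T>(
    mut connect: impl FnMut() -> io::Result<T>,
    max_attempts: u32,
    initial_delay: Duration,
) -> io::Result<T> {
    let mut delay = initial_delay;
    // Returned only if `max_attempts` is 0 and `connect` is never called.
    let mut last_err = io::Error::new(io::ErrorKind::Other, "no connection attempts made");
    for attempt in 1..=max_attempts {
        match connect() {
            Ok(conn) => return Ok(conn),
            Err(e) => last_err = e,
        }
        // Back off before the next attempt, but not after the final one.
        if attempt < max_attempts {
            thread::sleep(delay);
            delay = delay.saturating_mul(2);
        }
    }
    Err(last_err)
}
```

A caller would invoke this when the socket reports an error such as UnexpectedEof, passing a closure that opens a new connection.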
PR Checklist
I'm not sure how to write a test for this.