
Idle thread leak in session manager #4089

Closed
1 task done
bemevolent opened this issue Aug 6, 2023 · 5 comments
Labels
C-bug An unexpected or incorrect behavior

Comments

@bemevolent (Contributor)

Describe the bug

I wanted to try out tokio-console from the existing PR. After running it for a while, I noticed the number of idle threads continuously growing, which might be related to the RPC timeouts others have been running into.

After adding some logging, it looks like in the session manager's start_pending_outbound_session, every once in a while the future never returns and a thread is left idle.

```rust
let stream = match TcpStream::connect(remote_addr).await {
    Ok(stream) => MeteredStream::new_with_meter(stream, bandwidth_meter),
    Err(error) => {
        let _ = events
            .send(PendingSessionEvent::OutgoingConnectionError {
                remote_addr,
                session_id,
                peer_id: remote_peer_id,
                error,
            })
            .await;
        return
    }
};
```

Steps to reproduce

  • Add naming to all task spawning
  • Run tokio-console and watch idle threads grow

Node logs

No response

Platform(s)

Mac (Apple Silicon)

What version/commit are you on?

5885ba6

What database version are you on?

1

If you've built Reth from source, provide the full command you used

RUSTFLAGS="--cfg tokio_unstable" cargo install --locked --path bin/reth --bin reth --features tokio-console

Code of Conduct

  • I agree to follow the Code of Conduct
@bemevolent bemevolent added C-bug An unexpected or incorrect behavior S-needs-triage This issue needs to be labelled labels Aug 6, 2023
@mattsse (Collaborator) commented Aug 6, 2023

> related to other RPC timeouts others have been running into.

related how?

@bemevolent (Contributor, Author)

Lots of idle threads could lead to thread starvation / context switching issues since jsonrpsee also spawns threads outside of the Task Executor. I could be way off here though, still getting familiar with tokio and the task management system being used.

@mattsse (Collaborator) commented Aug 6, 2023

> This task has lost its waker, and will never be woken again.

interesting, need to investigate where this can happen

@mattsse (Collaborator) commented Aug 6, 2023

this appears to be a bug in tokio-console with nested spawns (which is the case here)

ZcashFoundation/zebra#4581

tokio-rs/console#345

tokio-rs/console#380

the list of idle tasks is definitely reported wrong, because we restrict how many outbound connections can be active at once.

I wonder if we should wrap this in a timeout task
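To illustrate the "wrap this in a timeout" idea: in the async code above it would amount to something like `tokio::time::timeout(limit, TcpStream::connect(remote_addr))`, so a connect future that never completes resolves to an error instead of parking a task forever. Below is a minimal std-only sketch of the same principle using `std::net::TcpStream::connect_timeout`; the helper name `connect_bounded` is hypothetical and not part of reth.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

// Hypothetical helper: bound a connect attempt so an unresponsive peer
// cannot leave the caller waiting indefinitely. In reth's async code the
// equivalent would be tokio::time::timeout around TcpStream::connect.
fn connect_bounded(addr: SocketAddr, limit: Duration) -> std::io::Result<TcpStream> {
    TcpStream::connect_timeout(&addr, limit)
}

fn main() {
    // 192.0.2.1 is a reserved TEST-NET documentation address, so the
    // attempt either times out or is rejected; either way the call
    // returns promptly instead of hanging.
    let addr: SocketAddr = "192.0.2.1:80".parse().unwrap();
    match connect_bounded(addr, Duration::from_millis(100)) {
        Ok(_) => println!("connected"),
        Err(e) => println!("connect failed as expected: {e}"),
    }
}
```

The caller then turns the timeout error into the same `OutgoingConnectionError` event path as any other connect failure, rather than leaving the session pending.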

@bemevolent (Contributor, Author)

Ah thanks, I'll close this. Since we were already discussing in this thread, I wanted to check my understanding: the RPC servers use their own threads via jsonrpsee_server::ServerBuilder and plain tokio::spawn calls. Is that expected, or should they be using the main tokio multi_thread runtime and a task executor? @mattsse

@github-project-automation github-project-automation bot moved this from Todo to Done in Reth Tracker Aug 7, 2023
@DaniPopes DaniPopes removed the S-needs-triage This issue needs to be labelled label Apr 24, 2024