sync: bounded mpsc memory use is proportional to max waiters #2637
Comments
As a side note: we have also observed occasional latency spikes after switching some code in Linkerd that used …
@hawkw I am interested in working on that, if that's OK. I had a question, though: it seems to me that switching to the new semaphore would necessitate the removal of the …
Ah, I have this work in progress. It is close to completion. I hope to have a PR up today. I should have posted that, sorry.
Updates the mpsc channel to use the intrusive waker-based semaphore. This enables using `Sender` with `&self`. Instead of using `Sender::poll_ready` to ensure capacity and updating the `Sender` state, `async fn Sender::reserve()` is added. This function returns a `Permit` value representing the reserved capacity.

Fixes: #2637
Refs: #2718 (intrusive waiters)
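For a sense of how the new API reads, here is a minimal sketch of the `reserve()`/`Permit` flow as it exists in current Tokio releases; the channel size and payload are made up for illustration:

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u32>(8);

    // `reserve()` only needs `&self`, so a sender can be shared across tasks
    // without `poll_ready`'s per-sender state. It completes once a slot is free.
    let permit = tx.reserve().await.expect("receiver dropped");

    // Sending through the permit consumes the reserved slot and cannot fail.
    permit.send(42);

    assert_eq!(rx.recv().await, Some(42));
}
```

If the permit is dropped without sending, the reserved slot is simply released back to the channel.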
This branch updates Tower to Tokio 0.3. Unlike #474, this branch uses Tokio 0.3's synchronization primitives, rather than continuing to depend on Tokio 0.2. I think that we ought to try to use Tokio 0.3's channels whenever feasible, because the 0.2 channels have pathological memory usage patterns in some cases (see tokio-rs/tokio#2637). @LucioFranco let me know what you think of the approach used here and we can compare notes!

For the most part, this was a pretty mechanical change: updating versions in Cargo.toml, tracking feature flag changes, renaming `tokio::time::delay` to `sleep`, and so on. Tokio's channel receivers also lost their `poll_recv` methods, but we can easily replicate that by enabling the `"stream"` feature and using `poll_next` instead.

The one actually significant change is that `tokio::sync::mpsc::Sender` lost its `poll_ready` method, which impacts the way `tower::buffer` is implemented. When the buffer's channel is full, we want to exert backpressure in `poll_ready`, so that callers such as load balancers can choose to call another service rather than waiting for buffer capacity. Previously, we did this by calling `poll_ready` on the underlying channel sender. Unfortunately, this can't be done easily using Tokio 0.3's bounded MPSC channel, because it no longer exposes a polling-based interface, only an `async fn ready`, which borrows the sender. Therefore, we implement our own bounded MPSC on top of the unbounded channel, using a semaphore to limit how many items are in the channel. I factored out the code for polling a semaphore acquire future from `limit::concurrency` into its own module, and reused it in `Buffer`.

Additionally, the buffer tests needed to be updated, because they currently don't actually poll the buffer service before calling it. This violates the `Service` contract, and the new code actually fails as a result.

Closes #473
Closes #474

Co-authored-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
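As a rough illustration of the bounded-channel-over-a-semaphore approach described in that PR (a sketch under tokio 1.x APIs, not Tower's actual implementation; `BoundedSender`, `bounded`, and `Msg` are names invented here):

```rust
use std::sync::Arc;

use tokio::sync::mpsc;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

/// Each message carries the permit that reserved its slot; when the receiver
/// drops the permit after taking the message, that slot becomes free again.
type Msg<T> = (T, OwnedSemaphorePermit);

struct BoundedSender<T> {
    inner: mpsc::UnboundedSender<Msg<T>>,
    semaphore: Arc<Semaphore>,
}

fn bounded<T>(capacity: usize) -> (BoundedSender<T>, mpsc::UnboundedReceiver<Msg<T>>) {
    let (inner, rx) = mpsc::unbounded_channel();
    let semaphore = Arc::new(Semaphore::new(capacity));
    (BoundedSender { inner, semaphore }, rx)
}

impl<T> BoundedSender<T> {
    /// Waits for capacity, then sends. Hands the value back if the receiver
    /// has been dropped.
    async fn send(&self, value: T) -> Result<(), T> {
        // Acquiring a permit is where backpressure happens: once `capacity`
        // messages are in flight, this await parks the task. A poll-based
        // wrapper would poll this acquire future from `poll_ready` and stash
        // the permit until `call`.
        let permit = self
            .semaphore
            .clone()
            .acquire_owned()
            .await
            .expect("semaphore is never explicitly closed");

        // The unbounded send itself never blocks; it only fails if the
        // receiver is gone.
        match self.inner.send((value, permit)) {
            Ok(()) => Ok(()),
            // Dropping the permit here releases the slot again.
            Err(mpsc::error::SendError((value, _permit))) => Err(value),
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = bounded::<&str>(2);
    tx.send("hello").await.unwrap();
    let (msg, _permit) = rx.recv().await.unwrap();
    assert_eq!(msg, "hello");
}
```

The key design choice is that the permit travels with the message, so capacity is released only once the receiver has actually consumed the item.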
Version
Platform
all
Description
The memory use of the bounded MPSC channel is proportional to the maximum number of tasks that have concurrently waited on the channel. This isn't technically a "memory leak" as reported by tools like `valgrind`, as the memory is freed when the channel is dropped. However, in systems where channels are long-lived, it may as well be a memory leak --- especially if the workload is bursty and the number of waiting tasks occasionally spikes.

`tokio::sync`'s locking primitives, like `Mutex`, do not exhibit the same behavior. Instead, the memory use of a `Mutex` remains constant regardless of how many tasks are waiting for it. This is because the locks use the new intrusive semaphore implementation added in #2325. The intrusive semaphore does not require additional memory allocations per waiting task, while the old semaphore implementation (which uses a non-intrusive linked list) does. In addition, cancelled waiters in the new semaphore are unlinked from the list when the waiting task terminates, while the old implementation only removes cancelled tasks from the wait list when it notifies a waiter.

I wrote a quick demo of this that compares the `mpsc` and `Mutex`: https://github.com/hawkw/tokio-buffer-repro

It spawns a bunch of tasks that all attempt to wait on `Sender::send`/`Mutex::lock` when the `mpsc` is full or the `Mutex` is already locked. The tasks cancel themselves after a period of time. During this, the repro prints the process' current memory use. At the end of the test, before the `Mutex`/`mpsc` are dropped, we see the `mpsc`'s memory use grow with the peak number of waiters while the `Mutex`'s stays flat.

We should update the MPSC channel to use the new semaphore implementation.