tokio::sync::Lock leaks memory #2237

Closed
olix0r opened this issue Feb 13, 2020 · 3 comments

olix0r (Member) commented Feb 13, 2020

Version

0.1.22

Platform

Linux golly 5.3.0-7625-generic #27~1576774585~18.04~c7868f8-Ubuntu SMP Thu Dec 19 20:37:47 UTC  x86_64 x86_64 x86_64 GNU/Linux

Subcrates

tokio_sync

Description

Under concurrency, tokio_sync::Lock leaks memory when acquired locks are dropped. We have a reliable repro for this.
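The repro code itself isn't attached here, but the workload is roughly the following. This is a minimal sketch, assuming tokio 0.1, tokio-sync 0.1, and futures 0.1, and that `Lock::poll_lock` returns `Async<LockGuard<T>>`; the names and iteration count are illustrative, not the actual `lockyleaks` code:

```rust
extern crate futures;
extern crate tokio;
extern crate tokio_sync;

use futures::{future, Async};
use tokio_sync::lock::Lock;

fn main() {
    tokio::run(future::lazy(|| {
        let lock = Lock::new(());
        for _ in 0..1_000_000 {
            let mut lock = lock.clone();
            // Each task polls the lock once and then completes, dropping its
            // handle (and the guard, if it acquired one) immediately. Under
            // contention, handles that registered interest but never got the
            // lock are the ones that appear to leave memory behind.
            tokio::spawn(future::poll_fn(move || -> Result<Async<()>, ()> {
                let _ = lock.poll_lock();
                Ok(Async::Ready(()))
            }));
        }
        Ok::<_, ()>(())
    }));
}
```

Valgrind output from the actual repro: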

:; cargo build --release && valgrind --leak-check=full target/release/lockyleaks
    Finished release [optimized] target(s) in 0.02s
==16442== Memcheck, a memory error detector
==16442== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16442== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16442== Command: target/release/lockyleaks
==16442==
==16442==
==16442== HEAP SUMMARY:
==16442==     in use at exit: 123,726,189 bytes in 1,100,504 blocks
==16442==   total heap usage: 2,222,200 allocs, 1,121,696 frees, 720,964,308 bytes allocated
==16442==
==16442== 48 bytes in 2 blocks are possibly lost in loss record 7 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x1289D2: tokio_threadpool::worker::Worker::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1163ED: tokio_timer::clock::clock::with_default (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1174E2: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130BD8: std::thread::local::LocalKey<T>::with (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12BBA7: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1321DB: _ZN3std9panicking3try7do_call17h7aab2d4390bd4174E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x13278F: core::ops::function::FnOnce::call_once{{vtable-shim}} (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x13FE5E: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once (boxed.rs:1022)
==16442==    by 0x14822F: call_once<(),alloc::boxed::Box<FnOnce<()>>> (boxed.rs:1022)
==16442==    by 0x14822F: start_thread (thread.rs:13)
==16442==    by 0x14822F: std::sys::unix::thread::Thread::new::thread_start (thread.rs:80)
==16442==    by 0x524F6DA: start_thread (pthread_create.c:463)
==16442==
==16442== 23,144 bytes in 11 blocks are possibly lost in loss record 41 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x136A5A: crossbeam_epoch::internal::Local::register (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1381B4: _ZN3std6thread5local4fast12Key$LT$T$GT$14try_initialize17h902033f5292db545E.llvm.635284287880880640 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x133E33: crossbeam_deque::Stealer<T>::steal_batch_and_pop (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x128C75: tokio_threadpool::worker::Worker::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1163ED: tokio_timer::clock::clock::with_default (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1174E2: tokio::runtime::threadpool::builder::Builder::build::{{closure}} (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130BD8: std::thread::local::LocalKey<T>::with (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12BBA7: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1321DB: _ZN3std9panicking3try7do_call17h7aab2d4390bd4174E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x13278F: core::ops::function::FnOnce::call_once{{vtable-shim}} (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==
==16442== 1,147,648 bytes in 8,966 blocks are possibly lost in loss record 46 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x13188E: <&tokio_threadpool::sender::Sender as tokio_executor::executor::Executor>::spawn (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x13170D: <tokio_threadpool::sender::Sender as tokio_executor::executor::Executor>::spawn (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1387EA: std::thread::local::LocalKey<T>::with (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x13869C: <tokio_executor::global::DefaultExecutor as tokio_executor::executor::Executor>::spawn (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x112FC6: tokio::executor::spawn (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x11406F: <futures::future::lazy::Lazy<F,R> as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130221: futures::task_impl::std::set (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12CAC5: futures::task_impl::Spawn<T>::poll_future_notify (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x132180: _ZN3std9panicking3try7do_call17h0321a6d6b023a5f6E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x1304E8: tokio_threadpool::task::Task::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==
==16442== 2,164,064 bytes in 19,322 blocks are possibly lost in loss record 47 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x1261D8: tokio_sync::semaphore::Permit::poll_acquire (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115D07: <lockyleaks::AcquireAndIdle as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115AB7: <lockyleaks::Loop as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130221: futures::task_impl::std::set (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12CAC5: futures::task_impl::Spawn<T>::poll_future_notify (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x132180: _ZN3std9panicking3try7do_call17h0321a6d6b023a5f6E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x1304E8: tokio_threadpool::task::Task::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1296B9: tokio_threadpool::worker::Worker::run_task (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x128D54: tokio_threadpool::worker::Worker::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1163ED: tokio_timer::clock::clock::with_default (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==
==16442== 2,994,208 (112 direct, 2,994,096 indirect) bytes in 1 blocks are definitely lost in loss record 49 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x1261D8: tokio_sync::semaphore::Permit::poll_acquire (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115D07: <lockyleaks::AcquireAndIdle as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115AB7: <lockyleaks::Loop as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130221: futures::task_impl::std::set (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12CAC5: futures::task_impl::Spawn<T>::poll_future_notify (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x132180: _ZN3std9panicking3try7do_call17h0321a6d6b023a5f6E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x1304E8: tokio_threadpool::task::Task::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1296B9: tokio_threadpool::worker::Worker::run_task (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x128B7F: tokio_threadpool::worker::Worker::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1163ED: tokio_timer::clock::clock::with_default (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==
==16442== 116,918,928 bytes in 1,043,919 blocks are possibly lost in loss record 50 of 50
==16442==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16442==    by 0x1261D8: tokio_sync::semaphore::Permit::poll_acquire (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115D07: <lockyleaks::AcquireAndIdle as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x115AB7: <lockyleaks::Loop as futures::future::Future>::poll (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x130221: futures::task_impl::std::set (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x12CAC5: futures::task_impl::Spawn<T>::poll_future_notify (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x132180: _ZN3std9panicking3try7do_call17h0321a6d6b023a5f6E.llvm.4420981283127027568 (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x148A99: __rust_maybe_catch_panic (lib.rs:78)
==16442==    by 0x1304E8: tokio_threadpool::task::Task::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1296B9: tokio_threadpool::worker::Worker::run_task (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x128B7F: tokio_threadpool::worker::Worker::run (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==    by 0x1163ED: tokio_timer::clock::clock::with_default (in /home/ver/b/lockyleaks/target/release/lockyleaks)
==16442==
==16442== LEAK SUMMARY:
==16442==    definitely lost: 112 bytes in 1 blocks
==16442==    indirectly lost: 2,994,096 bytes in 26,733 blocks
==16442==      possibly lost: 120,253,832 bytes in 1,072,220 blocks
==16442==    still reachable: 478,149 bytes in 1,550 blocks
==16442==         suppressed: 0 bytes in 0 blocks
==16442== Reachable blocks (those to which a pointer was found) are not shown.
==16442== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==16442==
==16442== For counts of detected and suppressed errors, rerun with: -v
==16442== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)

We believe this leak may be present in the 0.2.x versions as well; @hawkw is investigating.

hawkw (Member) commented Feb 13, 2020

I've ported the repro to Tokio 0.2, and I cannot reproduce the leak.

Valgrind:

eliza@ares:~/lockyleaks$ git status
On branch eliza/0.2
Your branch is up-to-date with 'origin/eliza/0.2'.
nothing to commit, working tree clean
eliza@ares:~/lockyleaks$ valgrind --leak-check=full target/release/lockyleaks
==14193== Memcheck, a memory error detector
==14193== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14193== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==14193== Command: target/release/lockyleaks
==14193==
==14193==
==14193== HEAP SUMMARY:
==14193==     in use at exit: 32 bytes in 1 blocks
==14193==   total heap usage: 2,201,504 allocs, 2,201,503 frees, 465,264,680 bytes allocated
==14193==
==14193== LEAK SUMMARY:
==14193==    definitely lost: 0 bytes in 0 blocks
==14193==    indirectly lost: 0 bytes in 0 blocks
==14193==      possibly lost: 0 bytes in 0 blocks
==14193==    still reachable: 32 bytes in 1 blocks
==14193==         suppressed: 0 bytes in 0 blocks
==14193== Reachable blocks (those to which a pointer was found) are not shown.
==14193== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==14193==
==14193== For counts of detected and suppressed errors, rerun with: -v
==14193== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Running the 0.1 repro (blue line is RSS): [plot: mem_plot_01]
Running the 0.2 repro (blue line is RSS): [plot: mem_plot_02_fixed]

AFAICT, this issue only exists in 0.1. I suspect we can fix it fairly easily by backporting the 0.2 implementation to 0.1.

hawkw (Member) commented Feb 13, 2020

I have a branch (https://github.com/tokio-rs/tokio/tree/eliza/0.2-semaphore) that backports the 0.2 semaphore implementation to tokio-sync on the v0.1.x branch (by copying and pasting, then fixing up API differences). Valgrind no longer reports any of the semaphore-related leaks (I think the crossbeam-deque one may be a false positive, or something that exists for all tokio 0.1 programs):

eliza@ares:~/lockyleaks$ valgrind --leak-check=full target/release/lockyleaks
==40575== Memcheck, a memory error detector
==40575== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==40575== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==40575== Command: target/release/lockyleaks
==40575==
==40575==
==40575== HEAP SUMMARY:
==40575==     in use at exit: 279,904 bytes in 199 blocks
==40575==   total heap usage: 2,215,615 allocs, 2,215,416 frees, 681,160,960 bytes allocated
==40575==
==40575== 69,432 bytes in 33 blocks are possibly lost in loss record 17 of 18
==40575==    at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==40575==    by 0x13FE0A: crossbeam_epoch::internal::Local::register::hf487475157c3d28f (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x141564: _ZN3std6thread5local4fast12Key$LT$T$GT$14try_initialize17h902033f5292db545E.llvm.635284287880880640 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x13D1E3: crossbeam_deque::Stealer$LT$T$GT$::steal_batch_and_pop::he4c6fad24a4c4a1e (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x132025: tokio_threadpool::worker::Worker::run::ha87b7d0e3fcf61d4 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x11F57D: tokio_timer::clock::clock::with_default::hbbd61c3a192ca8a3 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x120EF2: tokio::runtime::threadpool::builder::Builder::build::_$u7b$$u7b$closure$u7d$$u7d$::hefd82a6e3b44000c (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x139F88: std::thread::local::LocalKey$LT$T$GT$::with::h47b196d4586ff021 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x134F57: std::sys_common::backtrace::__rust_begin_short_backtrace::h19e070dd689ab654 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x13B58B: _ZN3std9panicking3try7do_call17h7aab2d4390bd4174E.llvm.4420981283127027568 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==    by 0x151E49: __rust_maybe_catch_panic (lib.rs:78)
==40575==    by 0x13BB3F: core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h9d9d0735a48e18a1 (in /home/eliza/lockyleaks/target/release/lockyleaks)
==40575==
==40575== LEAK SUMMARY:
==40575==    definitely lost: 0 bytes in 0 blocks
==40575==    indirectly lost: 0 bytes in 0 blocks
==40575==      possibly lost: 69,432 bytes in 33 blocks
==40575==    still reachable: 210,472 bytes in 166 blocks
==40575==         suppressed: 0 bytes in 0 blocks
==40575== Reachable blocks (those to which a pointer was found) are not shown.
==40575== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==40575==
==40575== For counts of detected and suppressed errors, rerun with: -v
==40575== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

I'm surprised that the RSS graphs still look similar to the ones for the unpatched 0.1 repro, but that may be due to jemalloc; I'm inclined to treat valgrind as a source of truth.

@carllerche any thoughts? Shall I go ahead and open a PR against v0.1.x to backport the 0.2 semaphore impl?

hawkw (Member) commented Feb 14, 2020

An update on this: we believe that an issue still exists in the tokio 0.2 version of Semaphore where the memory used by the semaphore grows when a clone is polled and then dropped without having acquired a permit. However, all allocated memory is released when the Semaphore itself is dropped, so valgrind doesn't report a leak. In the 0.1 semaphore, it's likely that allocations were actually lost (and not freed when dropping the semaphore), resulting in a leak detected by valgrind.
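
For illustration, the pattern described above might look like the following; a minimal sketch assuming tokio 0.2 with the `full` feature and its async `Semaphore::acquire` API, not the actual repro:

```rust
use std::time::Duration;
use tokio::sync::Semaphore;
use tokio::time::timeout;

#[tokio::main]
async fn main() {
    // A single permit, held for the whole run, so every other acquire contends.
    let sem = Semaphore::new(1);
    let _held = sem.acquire().await;

    for _ in 0..100_000 {
        // Poll an acquire() future briefly, then drop it without ever having
        // obtained a permit. Per the observation above, the semaphore's
        // internal waiter storage can grow across these iterations, but it is
        // all released when `sem` itself is dropped, so valgrind reports no leak.
        let _ = timeout(Duration::from_micros(10), sem.acquire()).await;
    }
}
```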

olix0r added a commit to linkerd/linkerd2-proxy that referenced this issue Feb 17, 2020
As described in tokio-rs/tokio#2237, the `tokio::sync::Semaphore` can
hold unbounded memory, especially when the semaphore is being contested
and consumers drop interest. Unfortunately, this use case is common in
the proxy, especially when a destination service is unavailable and the
proxy is timing out requests.

This change reimplements the Lock middleware without using
`tokio::sync::Semaphore`. This implementation is in some ways more
naive and inefficient, but it appears to be better suited for the
proxy's needs. Specifically, waiters are stored in a LIFO stack, which
optimizes for minimizing latency. Under certain high-load scenarios,
this Lock could be forced to grow its waiters set without cleaning up
expired waiters. If this becomes a more serious concern, we could
change the implementation to use a FIFO queue of waiters.
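
For context, the LIFO-waiter approach described in that commit message can be sketched as follows. This is a toy illustration rather than the linkerd2-proxy code; it uses std `Waker`s and a plain `Vec` as the stack, and shows both the latency-oriented LIFO wakeup and why entries for callers that gave up can accumulate:

```rust
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};

struct Shared {
    locked: bool,
    // LIFO stack of waiters: pushed on contention, popped on release. Entries
    // for callers that have since dropped interest linger until popped.
    waiters: Vec<Waker>,
}

#[derive(Clone)]
pub struct Lock {
    shared: Arc<Mutex<Shared>>,
}

pub struct Guard {
    shared: Arc<Mutex<Shared>>,
}

impl Lock {
    pub fn new() -> Self {
        Lock {
            shared: Arc::new(Mutex::new(Shared {
                locked: false,
                waiters: Vec::new(),
            })),
        }
    }

    /// Poll-style acquire: returns the guard if the lock is free, otherwise
    /// pushes the caller's waker onto the stack and returns `Pending`.
    pub fn poll_acquire(&self, cx: &mut Context<'_>) -> Poll<Guard> {
        let mut s = self.shared.lock().unwrap();
        if s.locked {
            s.waiters.push(cx.waker().clone());
            Poll::Pending
        } else {
            s.locked = true;
            Poll::Ready(Guard {
                shared: self.shared.clone(),
            })
        }
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        let mut s = self.shared.lock().unwrap();
        s.locked = false;
        // Wake the most recently registered waiter first (LIFO), trading
        // fairness for lower latency; stale wakers are simply woken and
        // discarded by their (now uninterested) owners.
        if let Some(waker) = s.waiters.pop() {
            waker.wake();
        }
    }
}
```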
hawkw closed this as completed in acf8a7d Mar 23, 2020