
subtraction overflow in tokio-threadpool #244

Closed

srijs opened this issue Mar 21, 2018 · 15 comments

Comments

@srijs
Contributor

srijs commented Mar 21, 2018

On current master, there seems to be a race condition in tokio-threadpool where occasionally I run into a subtraction overflow when logging the number of workers (https://github.com/tokio-rs/tokio/blob/master/tokio-threadpool/src/lib.rs#L1089).

thread panicked at 'attempt to subtract with overflow', /Users/srijs/.cargo/git/checkouts/tokio-377c595163f99a10/7b1306e/tokio-threadpool/src/lib.rs:1089:53

Maybe the atomic ordering is not strict enough?
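
To illustrate the suspected failure mode, here is a minimal, self-contained sketch (the names and structure are hypothetical, not the actual tokio-threadpool internals): if the remaining worker count is logged as `fetch_sub(1) - 1`, a racing extra decrement can make `fetch_sub` return 0, and the plain `- 1` then panics in a debug build.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical reconstruction, NOT the actual tokio-threadpool source.
// `fetch_sub` wraps on underflow and returns the PREVIOUS value, so if a
// racing termination has already driven the counter to 0, `prev` is 0 and
// the plain `prev - 1` panics with "attempt to subtract with overflow"
// in debug builds.
fn worker_terminated(num_workers: &AtomicUsize) {
    let prev = num_workers.fetch_sub(1, Ordering::AcqRel);
    println!("worker_terminated; num_workers={}", prev - 1);
}

fn main() {
    // Simulate the thread that loses the race: the counter is already 0.
    let num_workers = AtomicUsize::new(0);
    worker_terminated(&num_workers); // panics in a debug build
}
```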

@carllerche
Member

Thanks for the report. I've been working out some bugs, though I am not sure if it is related.

Work is here: #243. Maybe you can see if the bug happens on that branch?

@carllerche
Member

This specific problem is fixed by #243. I opened #248 to track the new issue.

@kamyuentse
Contributor

@carllerche
Member

@kamyuentse Is this w/ futures2 enabled or not?

@kamyuentse
Contributor

@carllerche Without the futures2 feature, this issue happened randomly while I was debugging hyper's unfutures-02 branch. A test (conn_drop_prevents_pool_checkout) was failing before, but Sean removed it and merged the PR.

Log

... Drop the worker inside Runtime::threadpool

 TRACE tokio_threadpool::pool     > worker_terminated; num_workers=0
 TRACE tokio_threadpool::pool     > notifying shutdown task
 TRACE tokio_threadpool::shutdown > Shutdown::poll
 TRACE tokio_reactor              > event Readable Token(4194303)
 DEBUG tokio_reactor              > loop process - 1 events, 0.000s
 DEBUG tokio_reactor::background  > shutting background reactor down NOW
 TRACE tokio_threadpool::notifier > Notifier::notify; id=0x10f81f130
 TRACE tokio_threadpool::pool     >   -> submitting to random; idx=4
 ^
 |--- The test passes when this line does not appear,
      or when a new worker is spawned for this task before the reactor has shut down,
      for example (
 TRACE tokio_threadpool::pool     >   -> submitting to random; idx=4
 TRACE tokio_threadpool::worker   > spawning new worker thread; id=4
 DEBUG tokio_reactor::background  > background reactor has shutdown
 TRACE tokio_threadpool::worker   > shutting down thread; idx=4
 TRACE tokio_threadpool::worker   > found work while draining; signal_work
thread 'tokio-runtime-worker-4' panicked at 'attempt to subtract with overflow', /Users/kam/.cargo/git/checkouts/tokio-377c595163f99a10/dbcd835/tokio-threadpool/src/pool/mod.rs:172:53
note: Run with `RUST_BACKTRACE=1` for a backtrace.
)
 DEBUG tokio_reactor::background  > background reactor has shutdown
| Submit task before the reactor shutdown | Spawn a new worker | Overflow panic | Test pass |
|---|---|---|---|
| Y | Y | Y | Y |
| Y | N | N | N |
| N | N | N | Y |

I hope this information is helpful.
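
As a side note, a defensive variant of that logging decrement would turn this race into a clamped log line instead of a panic. This is only a sketch of the idea, using the same hypothetical names as the reconstruction above, not the actual fix that landed:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Sketch of a defensive variant (hypothetical; not the patch that landed):
// compute the value to log with `checked_sub`, so an extra racy
// termination yields num_workers=0 in the log instead of a panic.
fn worker_terminated(num_workers: &AtomicUsize) {
    let prev = num_workers.fetch_sub(1, Ordering::AcqRel);
    let remaining = prev.checked_sub(1).unwrap_or(0);
    println!("worker_terminated; num_workers={}", remaining);
}

fn main() {
    let num_workers = AtomicUsize::new(0);
    worker_terminated(&num_workers); // logs num_workers=0, no panic
}
```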

@carllerche
Member

Ok, thanks. Do you have steps to repro the issue?

@kamyuentse
Contributor

@carllerche Enable RUST_LOG=trace, then run that test case several times; I think the issue will arise. I haven't tried to write a minimal test case without hyper, though.

@carllerche
Member

@kamalmarhubi Which hyper branch? Master?

@kamalmarhubi
Contributor

@kamyuentse I think you were the intended target of a notification :-)

@kamyuentse
Contributor

kamyuentse commented Apr 11, 2018

@carllerche hyper:33874f9a7577eb20374c6d8943ce665feb9619e0 with this patch, which has since been removed:

diff --git a/tests/client.rs b/tests/client.rs
index 6705d370..0053a74f 100644
--- a/tests/client.rs
+++ b/tests/client.rs
@@ -1017,6 +1017,67 @@ mod dispatch_impl {
         let _ = t.select(close).wait();
     }
 
+    #[test]
+    fn conn_drop_prevents_pool_checkout() {
+        // a drop might happen for any sort of reason, and we can protect
+        // against a lot of them, but if the `runtime` is dropped, we can't
+        // really catch that. So, this is case to always check.
+        //
+        // See https://github.com/hyperium/hyper/issues/1429
+
+        use std::error::Error;
+        let _ = pretty_env_logger::try_init();
+
+        let server = TcpListener::bind("127.0.0.1:0").unwrap();
+        let addr = server.local_addr().unwrap();
+        let runtime = Runtime::new().unwrap();
+        let handle = runtime.reactor().clone();
+
+        let (tx1, rx1) = oneshot::channel();
+
+        thread::spawn(move || {
+            let mut sock = server.accept().unwrap().0;
+            sock.set_read_timeout(Some(Duration::from_secs(5))).unwrap();
+            sock.set_write_timeout(Some(Duration::from_secs(5))).unwrap();
+            let mut buf = [0; 4096];
+            sock.read(&mut buf).expect("read 1");
+            sock.write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n").unwrap();
+            sock.read(&mut buf).expect("read 2");
+            let _ = tx1.send(());
+        });
+
+        let uri = format!("http://{}/a", addr).parse::<hyper::Uri>().unwrap();
+
+        let client = Client::configure().build_with_executor(&handle, runtime.executor());
+
+        let req = Request::builder()
+            .uri(uri.clone())
+            .body(Body::empty())
+            .unwrap();
+        let res = client.request(req).and_then(move |res| {
+            assert_eq!(res.status(), hyper::StatusCode::OK);
+            res.into_body().into_stream().concat2()
+        });
+
+        res.wait().unwrap();
+
+        // drop previous runtime
+        drop(runtime);
+        let timeout = Delay::new(Duration::from_millis(200));
+        let rx = rx1.map_err(|_| hyper::Error::Io(io::Error::new(io::ErrorKind::Other, "thread panicked")));
+        let rx = rx.and_then(move |_| timeout.map_err(|e| e.into()));
+
+        let req = Request::builder()
+            .uri(uri)
+            .body(Body::empty())
+            .unwrap();
+        let res = client.request(req);
+        // this does trigger an 'event loop gone' error, but before, it would
+        // panic internally on a `SendError`, which is what we're testing against.
+        let err = res.join(rx).map(|r| r.0).wait().unwrap_err();
+        assert_eq!(err.description(), "event loop gone");
+    }
+
     #[test]
     fn client_custom_executor() {
         let server = TcpListener::bind("127.0.0.1:0").unwrap();

Run with RUST_LOG=trace cargo test --color=always --package hyper --test client dispatch_impl::conn_drop_prevents_pool_checkout -- --nocapture --exact

EDIT: Because tokio-threadpool has been refactored on master, you should use tokio:master when running the test. Sorry for not clarifying this before.

@carllerche
Member

I'm reopening this until I can investigate the issue described above.

carllerche reopened this Apr 12, 2018
@carllerche
Member

#317 changes all the logic around where the panic occurs. I'm going to wait to see if the issue happens once that work lands.

@carllerche
Member

@kamyuentse I'm not seeing this issue happen w/ the latest changes on master (#317). Are you able to confirm this?

@kamyuentse
Contributor

@carllerche Sorry for the delay, I was out last week. I checked and am no longer seeing this issue happen. Thanks for your great work!

@carllerche
Member

@kamyuentse Great! Thanks for verifying 👍
