
subtraction overflow in tokio-threadpool #244

Closed

srijs opened this issue Mar 21, 2018 · 15 comments

Comments

@srijs
Contributor

srijs commented Mar 21, 2018

On current master, there seems to be a race condition in tokio-threadpool where occasionally I run into a subtraction overflow when logging the number of workers (https://github.com/tokio-rs/tokio/blob/master/tokio-threadpool/src/lib.rs#L1089).

thread panicked at 'attempt to subtract with overflow', /Users/srijs/.cargo/git/checkouts/tokio-377c595163f99a10/7b1306e/tokio-threadpool/src/lib.rs:1089:53

Maybe the atomic ordering is not strict enough?
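
To illustrate the suspected failure mode, here is a minimal, self-contained sketch (the names and structure are hypothetical, not the actual tokio-threadpool internals): if the remaining worker count is logged as `fetch_sub(1) - 1`, a racing extra decrement can make `fetch_sub` return 0, and the plain `- 1` then panics in a debug build.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical reconstruction, NOT the actual tokio-threadpool source.
// `fetch_sub` wraps on underflow and returns the PREVIOUS value, so if a
// racing termination has already driven the counter to 0, `prev` is 0 and
// the plain `prev - 1` panics with "attempt to subtract with overflow"
// in debug builds.
fn worker_terminated(num_workers: &AtomicUsize) {
    let prev = num_workers.fetch_sub(1, Ordering::AcqRel);
    println!("worker_terminated; num_workers={}", prev - 1);
}

fn main() {
    // Simulate the thread that loses the race: the counter is already 0.
    let num_workers = AtomicUsize::new(0);
    worker_terminated(&num_workers); // panics in a debug build
}
```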

@carllerche
Member

Thanks for the report. I've been working out some bugs, though I am not sure if it is related.

Work is here: #243. Maybe you can see if the bug happens on that branch?

@carllerche
Member

This specific problem is fixed by #243. I opened #248 to track the new issue.

@kamyuentse
Contributor

@carllerche
Member

@kamyuentse Is this w/ futures2 enabled or not?

@kamyuentse
Contributor

@carllerche Without the futures2 feature, this issue happened randomly while I was debugging hyper's unfutures-02 branch. A test (conn_drop_prevents_pool_checkout) was failing before, but Sean removed it and merged the PR.

Log

... Drop the worker inside Runtime::threadpool

 TRACE tokio_threadpool::pool     > worker_terminated; num_workers=0
 TRACE tokio_threadpool::pool     > notifying shutdown task
 TRACE tokio_threadpool::shutdown > Shutdown::poll
 TRACE tokio_reactor              > event Readable Token(4194303)
 DEBUG tokio_reactor              > loop process - 1 events, 0.000s
 DEBUG tokio_reactor::background  > shutting background reactor down NOW
 TRACE tokio_threadpool::notifier > Notifier::notify; id=0x10f81f130
 TRACE tokio_threadpool::pool     >   -> submitting to random; idx=4
 ^
 |--- The test passes when this line does not appear,
      or when a new worker is spawned for this task before the reactor has shut down,
      for example (
 TRACE tokio_threadpool::pool     >   -> submitting to random; idx=4
 TRACE tokio_threadpool::worker   > spawning new worker thread; id=4
 DEBUG tokio_reactor::background  > background reactor has shutdown
 TRACE tokio_threadpool::worker   > shutting down thread; idx=4
 TRACE tokio_threadpool::worker   > found work while draining; signal_work
thread 'tokio-runtime-worker-4' panicked at 'attempt to subtract with overflow', /Users/kam/.cargo/git/checkouts/tokio-377c595163f99a10/dbcd835/tokio-threadpool/src/pool/mod.rs:172:53
note: Run with `RUST_BACKTRACE=1` for a backtrace.
)
 DEBUG tokio_reactor::background  > background reactor has shutdown
| Submit task before the reactor shutdown | Spawn a new worker | Overflow panic | Test pass |
|---|---|---|---|
| Y | Y | Y | Y |
| Y | N | N | N |
| N | N | N | Y |

I hope this information is helpful.
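
As a side note, a defensive variant of that logging decrement would turn this race into a clamped log line instead of a panic. This is only a sketch of the idea, using the same hypothetical names as the reconstruction above, not the actual fix that landed:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Sketch of a defensive variant (hypothetical; not the patch that landed):
// compute the value to log with `checked_sub`, so an extra racy
// termination yields num_workers=0 in the log instead of a panic.
fn worker_terminated(num_workers: &AtomicUsize) {
    let prev = num_workers.fetch_sub(1, Ordering::AcqRel);
    let remaining = prev.checked_sub(1).unwrap_or(0);
    println!("worker_terminated; num_workers={}", remaining);
}

fn main() {
    let num_workers = AtomicUsize::new(0);
    worker_terminated(&num_workers); // logs num_workers=0, no panic
}
```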

@carllerche
Member

Ok, thanks. Do you have steps to repro the issue?

@kamyuentse
Contributor

@carllerche Enable RUST_LOG=trace, then run that test case several times; I think the issue will arise. I haven't tried to write a minimal test case without hyper, though.

@carllerche
Member

@kamalmarhubi Which hyper branch? Master?

@kamalmarhubi
Contributor

@kamyuentse I think you were the intended target of a notification :-)

@kamyuentse
Contributor

kamyuentse commented Apr 11, 2018

@carllerche hyper:33874f9a7577eb20374c6d8943ce665feb9619e0 with this patch, which has since been removed:

diff --git a/tests/client.rs b/tests/client.rs
index 6705d370..0053a74f 100644
--- a/tests/client.rs
+++ b/tests/client.rs
@@ -1017,6 +1017,67 @@ mod dispatch_impl {
         let _ = t.select(close).wait();
     }
 
+    #[test]
+    fn conn_drop_prevents_pool_checkout() {
+        // a drop might happen for any sort of reason, and we can protect
+        // against a lot of them, but if the `runtime` is dropped, we can't
+        // really catch that. So, this is case to always check.
+        //
+        // See https://github.com/hyperium/hyper/issues/1429
+
+        use std::error::Error;
+        let _ = pretty_env_logger::try_init();
+
+        let server = TcpListener::bind("127.0.0.1:0").unwrap();
+        let addr = server.local_addr().unwrap();
+        let runtime = Runtime::new().unwrap();
+        let handle = runtime.reactor().clone();
+
+        let (tx1, rx1) = oneshot::channel();
+
+        thread::spawn(move || {
+            let mut sock = server.accept().unwrap().0;
+            sock.set_read_timeout(Some(Duration::from_secs(5))).unwrap();
+            sock.set_write_timeout(Some(Duration::from_secs(5))).unwrap();
+            let mut buf = [0; 4096];
+            sock.read(&mut buf).expect("read 1");
+            sock.write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n").unwrap();
+            sock.read(&mut buf).expect("read 2");
+            let _ = tx1.send(());
+        });
+
+        let uri = format!("http://{}/a", addr).parse::<hyper::Uri>().unwrap();
+
+        let client = Client::configure().build_with_executor(&handle, runtime.executor());
+
+        let req = Request::builder()
+            .uri(uri.clone())
+            .body(Body::empty())
+            .unwrap();
+        let res = client.request(req).and_then(move |res| {
+            assert_eq!(res.status(), hyper::StatusCode::OK);
+            res.into_body().into_stream().concat2()
+        });
+
+        res.wait().unwrap();
+
+        // drop previous runtime
+        drop(runtime);
+        let timeout = Delay::new(Duration::from_millis(200));
+        let rx = rx1.map_err(|_| hyper::Error::Io(io::Error::new(io::ErrorKind::Other, "thread panicked")));
+        let rx = rx.and_then(move |_| timeout.map_err(|e| e.into()));
+
+        let req = Request::builder()
+            .uri(uri)
+            .body(Body::empty())
+            .unwrap();
+        let res = client.request(req);
+        // this does trigger an 'event loop gone' error, but before, it would
+        // panic internally on a `SendError`, which is what we're testing against.
+        let err = res.join(rx).map(|r| r.0).wait().unwrap_err();
+        assert_eq!(err.description(), "event loop gone");
+    }
+
     #[test]
     fn client_custom_executor() {
         let server = TcpListener::bind("127.0.0.1:0").unwrap();

Run with RUST_LOG=trace cargo test --color=always --package hyper --test client dispatch_impl::conn_drop_prevents_pool_checkout -- --nocapture --exact

EDIT: Because tokio-threadpool has been refactored on master, you should use tokio:master when running the test. Sorry for not clarifying this before.

@carllerche
Member

I'm reopening this until I can investigate the issue described above.

carllerche reopened this Apr 12, 2018
@carllerche
Member

#317 changes all the logic around where the panic occurs. I'm going to wait to see if the issue happens once that work lands.

@carllerche
Member

@kamyuentse I'm not seeing this issue happen w/ the latest changes on master (#317). Are you able to confirm this?

@kamyuentse
Contributor

@carllerche Sorry for the delay, I was out last week. I checked and am no longer seeing this issue happen. Thanks for your great work!

@carllerche
Member

@kamyuentse Great! Thanks for verifying 👍
