Client connection leak with more than one body chunk #1479
Without keepalives, we get too many sockets in TIME_WAIT. With keepalives, we are bumping into hyperium/hyper#1479
Can you try with 7fe9710? I ran into what looks like the same issue and that commit fixed it for me. |
Tested with 7fe9710 and it worked! |
@unixpickle FYI, |
Very speedy fix! |
I'm still not entirely sure what the problem was, though. 7fe9710 supposedly works because it delays EOF until the connection has been added back to the pool. This seems to imply that I could work around the problem on 0.11.24 by sleeping after reading the body. However, this is not the case. Here's an updated client that still does not work on 0.11.24:

```rust
extern crate futures;
extern crate hyper;
extern crate simple_logger;
extern crate tokio_core;

use std::thread::sleep;
use std::time::Duration;

use futures::{Future, IntoFuture, Stream};
use hyper::{Chunk, Client, Error};
use tokio_core::reactor::Core;

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();
    let client = Client::configure().keep_alive(true).build(&handle);
    loop {
        let res = client.get("http://127.0.0.1:8080/".parse().unwrap())
            .and_then(|resp| resp.body().fold((0usize, Vec::<u8>::new()), join_chunks));
        let (num_chunks, data) = core.run(res).unwrap();
        println!("got {} bytes ({} chunks)", data.len(), num_chunks);
        sleep(Duration::from_millis(100));
    }
}

fn join_chunks(
    mut state: (usize, Vec<u8>),
    next: Chunk
) -> Box<Future<Item = (usize, Vec<u8>), Error = Error>> {
    state.1.extend(next.to_vec());
    Box::new(Ok((state.0 + 1, state.1)).into_future())
}
```

There must be something else I'm missing. For example, I don't see why this |
Sleeping the thread won't fix it: the connection task (future) that was spawned needs to be polled again, and that can't happen if the |
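A minimal sketch of the point above (not code from this thread), assuming tokio-core 0.1 and futures 0.1 as in the client program: a future spawned on the Core's handle is only polled while the Core itself is being turned, so blocking the thread with std::thread::sleep between core.run calls gives the spawned connection task no chance to run.

```rust
extern crate futures;
extern crate tokio_core;

use std::thread::sleep;
use std::time::Duration;

use futures::future;
use tokio_core::reactor::{Core, Timeout};

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();

    // Stand-in for the connection task that hyper spawns internally.
    handle.spawn(future::lazy(|| {
        println!("spawned task polled");
        Ok::<(), ()>(())
    }));

    // The thread sleeps here, but nothing turns the reactor,
    // so the spawned task has not been polled yet.
    sleep(Duration::from_millis(100));
    println!("woke up; spawned task still not polled");

    // Only while the Core is running does the spawned task get polled.
    let turn = Timeout::new(Duration::from_millis(100), &handle).unwrap();
    core.run(turn).unwrap();
}
```

Running this prints "woke up; spawned task still not polled" before "spawned task polled", because the spawned future makes no progress until the Core is driven again.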
Ah, that's very helpful. However, I'm still not sure I see the full picture. Here's a modified version where the core runs forever, and the sleep is done using a future. The problem still happens on 0.11.24. In this case, the core is running forever, so @seanmonstar's previous explanation doesn't apply.

```rust
extern crate futures;
extern crate hyper;
extern crate simple_logger;
extern crate tokio_core;

use std::time::Duration;

use futures::{Future, IntoFuture, Stream};
use futures::stream::repeat;
use hyper::{Chunk, Client, Error};
use tokio_core::reactor::{Core, Timeout};

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();
    let client = Client::configure().keep_alive(true).build(&handle);
    core.run(repeat::<(), Error>(()).for_each(move |_| {
        let local_handle = handle.clone();
        client.get("http://127.0.0.1:8080/".parse().unwrap())
            .and_then(|resp| resp.body().fold((0usize, Vec::<u8>::new()), join_chunks))
            .and_then(move |(num_chunks, data)| {
                println!("got {} bytes ({} chunks)", data.len(), num_chunks);
                Timeout::new(Duration::from_millis(100), &local_handle).unwrap()
                    .map_err(From::from).map(|_| ())
            })
    })).unwrap();
}

fn join_chunks(
    mut state: (usize, Vec<u8>),
    next: Chunk
) -> Box<Future<Item = (usize, Vec<u8>), Error = Error>> {
    state.1.extend(next.to_vec());
    Box::new(Ok((state.0 + 1, state.1)).into_future())
}
```

By the way, I have been testing for the problem by running |
I don't know exactly where the issue is, but it's fixed in 0.11.25, so I personally wouldn't expend much effort myself to look further XD |
I did some investigating, since I didn't think this should be left a mystery. I found that commit 7fe9710 fixed another bug in addition to the delayed EOF bug. The bug was in this block of code from 0.11.24 (in src/client/mod.rs):

```rust
if let Ok(Async::NotReady) = pooled.tx.poll_ready() {
    // If the executor doesn't have room, oh well. Things will likely
    // be blowing up soon, but this specific task isn't required.
    let _ = executor.execute(future::poll_fn(move || {
        pooled.tx.poll_ready().map_err(|_| ())
    }));
}
```

The problem here is that, with more than one body chunk, the poll_fn handed to the executor finds the connection not yet ready on its first poll, and it never gets rescheduled after that, so the connection is never returned to the pool. And, it hardly needs to be said: the bug doesn't happen for 1 chunk because the first time the executed poll_fn is called, the connection is ready to be given back to the pool.

So the delayed EOF bug was one issue that needed to be fixed, and this task rescheduling bug was another. My original reproduction fell victim to both bugs, but the later one with Timeouts only fell victim to the task rescheduling bug. If anybody is interested, here is a minimal patch on top of 0.11.24 that fixes the task rescheduling bug but not the EOF bug:

```diff
diff --git a/src/client/conn.rs b/src/client/conn.rs
index f6a139b5..ed5d7147 100644
--- a/src/client/conn.rs
+++ b/src/client/conn.rs
@@ -126,6 +126,10 @@ impl<B> SendRequest<B>
         self.dispatch.poll_ready()
     }
 
+    pub(super) fn is_wanting(&self) -> bool {
+        self.dispatch.is_wanting()
+    }
+
     pub(super) fn is_closed(&self) -> bool {
         self.dispatch.is_closed()
     }
diff --git a/src/client/dispatch.rs b/src/client/dispatch.rs
index efc720d8..d7943d9b 100644
--- a/src/client/dispatch.rs
+++ b/src/client/dispatch.rs
@@ -45,6 +45,10 @@ impl<T, U> Sender<T, U> {
         }
     }
 
+    pub fn is_wanting(&self) -> bool {
+        self.giver.is_wanting()
+    }
+
     pub fn is_closed(&self) -> bool {
         self.giver.is_canceled()
     }
diff --git a/src/client/mod.rs b/src/client/mod.rs
index eb8c10ea..e81ef5dc 100644
--- a/src/client/mod.rs
+++ b/src/client/mod.rs
@@ -253,7 +253,7 @@ where C: Connect,
                     // for a new request to start.
                     //
                     // It won't be ready if there is a body to stream.
-                    if let Ok(Async::NotReady) = pooled.tx.poll_ready() {
+                    if !pooled.tx.is_wanting() {
                         // If the executor doesn't have room, oh well. Things will likely
                         // be blowing up soon, but this specific task isn't required.
                         let _ = executor.execute(future::poll_fn(move || {
@@ -661,4 +661,3 @@ mod background {
         }
     }
 }
-
diff --git a/src/client/signal.rs b/src/client/signal.rs
index 2ddf67f7..7fd6bb44 100644
--- a/src/client/signal.rs
+++ b/src/client/signal.rs
@@ -88,6 +88,10 @@ impl Giver {
         }
     }
 
+    pub fn is_wanting(&self) -> bool {
+        self.inner.state.load(Ordering::SeqCst) == STATE_WANT
+    }
+
     pub fn is_canceled(&self) -> bool {
         self.inner.state.load(Ordering::SeqCst) == STATE_CLOSED
     }
```
|
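As a general illustration (a minimal sketch, not code from this thread), the footgun behind the task rescheduling bug can be reproduced with plain futures 0.1 and tokio-core 0.1: a spawned future that returns Async::NotReady without arranging for its task to be notified is simply never polled again.

```rust
extern crate futures;
extern crate tokio_core;

use std::time::Duration;

use futures::{future, Async};
use tokio_core::reactor::{Core, Timeout};

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();

    let mut polls = 0;
    handle.spawn(future::poll_fn(move || {
        polls += 1;
        println!("poll #{}", polls);
        // Nothing ever calls task::current().notify() for this task,
        // so after returning NotReady the executor forgets about it.
        Ok(Async::NotReady)
    }));

    // Drive the reactor for a while: "poll #1" is printed exactly once,
    // even though the Core keeps turning for the full 200 ms.
    let wait = Timeout::new(Duration::from_millis(200), &handle).unwrap();
    core.run(wait).unwrap();
}
```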
Ah yes, I had noticed this as well a little while ago, and forgotten that the fix included that. Generally, implementations of |
In a web application where the server provides more than ~8062 bytes of data to the client, the client's connection leaks. This can lead to all sorts of resource exhaustion errors. I have found that this correlates perfectly with Response::body() producing more than one Chunk.

The client/server programs below reproduce this issue. The server usually dies with the following error:

This is likely related to commit ef40081 and issue #1397.

Update: this is using hyper version 0.11.24. Added Cargo.toml file below.

client.rs:

server.rs:

Cargo.toml:
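The Cargo.toml content was not preserved here. A minimal manifest consistent with the extern crate lines in the client code earlier in the thread might look like the following sketch; the package name and the futures, tokio-core, and simple_logger versions are assumptions, only the hyper version is stated in the issue.

```toml
[package]
name = "hyper-1479-repro"   # hypothetical name, not from the issue
version = "0.1.0"

[dependencies]
hyper = "0.11.24"           # version stated in the issue
futures = "0.1"             # assumed, matching the futures 0.1 API used above
tokio-core = "0.1"          # assumed, matching the tokio_core::reactor usage
simple_logger = "0.4"       # assumed; only the crate name appears in the code
```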