Description
Version
[[package]]
name = "hyper"
version = "0.14.28"
[[package]]
name = "h2"
version = "0.3.24"
[[package]]
name = "tonic"
version = "0.10.2"
Platform
Linux ****** 6.6.13-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jan 20 18:03:28 UTC 2024 x86_64 GNU/Linux
and also
Linux ****** 6.5.11-300.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 8 22:37:57 UTC 2023 x86_64 GNU/Linux
Description
In case of a lot of concurrent queries being requested by the client, and with a limit on the number of concurrent queries by the server (but with no load shedding) both the client and server eventually seem to just hang, neither of then doing anything (i.e, there is no CPU usage, and the server, if queried by a different client responds to the new requests from that client).
The exact reproducibility is hard to nail down, but it seems to have something to do with the amount of data being transferred, since if I make messages that are being transferred very small I cannot reproduce.
During attempts I have once also witnessed this
thread 'tokio-runtime-worker' panicked at 'assertion failed: self.window_size.0 >= sz as i32', /home/abizjak/.cargo/registry/src/index.crates.io-6f17d22bba15001f/h2-0.3.24/src/proto/streams/flow_control.rs:181:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
but unfortunately I was not running with RUST_BACKTRACE, and I have been unable to reproduce it again.
One way to structure the description:
The server code is this
use tonic::{transport::Server, Request, Response, Status};
use hello_world::greeter_server::{Greeter, GreeterServer};
use hello_world::{HelloReply, HelloRequest};
pub mod hello_world {
tonic::include_proto!("helloworld");
}
#[derive(Default)]
pub struct MyGreeter {}
#[tonic::async_trait]
impl Greeter for MyGreeter {
async fn say_hello(
&self,
request: Request<HelloRequest>,
) -> Result<Response<HelloReply>, Status> {
let reply = hello_world::HelloReply {
message: format!(
"{:?}, {}",
[0; 10000], // I have been unable to reproduce with size 1000 here.
request.into_inner().name,
),
};
Ok(Response::new(reply))
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let addr = "[::1]:50051".parse().unwrap();
let greeter = MyGreeter::default();
Server::builder()
.concurrency_limit_per_connection(100)
.add_service(GreeterServer::new(greeter))
.serve(addr)
.await?;
Ok(())
}
And the client code is
use futures::stream::FuturesUnordered;
use futures::TryStreamExt;
use hello_world::greeter_client::GreeterClient;
use hello_world::HelloRequest;
use tracing_subscriber::filter::LevelFilter;
pub mod hello_world {
tonic::include_proto!("helloworld");
}
// 300 -> Hang
// 150 -> Hang
// 100 -> Hang
// 50 -> No hang
const N: usize = 80;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// It seems a lot harder to reproduce with logging. I have been unable to do so with trace logging.
// So consider commenting out.
{
use tracing_subscriber::prelude::*;
let log_filter = tracing_subscriber::filter::Targets::new()
.with_target("h2", LevelFilter::DEBUG)
.with_target("hyper", LevelFilter::DEBUG);
tracing_subscriber::registry()
.with(tracing_subscriber::fmt::layer())
.with(log_filter)
.init();
}
let client = GreeterClient::connect("http://[::1]:50051").await?;
let futs = FuturesUnordered::new();
let mut message = String::with_capacity(N);
for _ in 0..N {
message.push('1');
}
// The goal is to make concurrent requests.
for i in 0..100_000 {
let request = tonic::Request::new(HelloRequest { name: message.clone() });
let mut client = client.clone();
futs.push(async move {
client.say_hello(request).await.map(|r| (r, i))
});
}
futs.try_for_each_concurrent(None, move |(r, i)| async move {
let msg = r.into_inner().message;
let len = msg.len();
eprintln!("{len}, {i}",);
Ok(())
})
.await?;
Ok(())
}
I am attaching a complete Rust project with lock files that can be run as follows
- Start server
cargo run --release --bin server
- Run client
cargo run --release --bin client
Omitting --release
also seems to reproduce it.
I am attaching two things
- the project that can be run example.tar.gz
- a log (of the client) of one of the runs which ended in a hang log.txt.gz