Version
[[package]]
name = "hyper"
version = "0.14.28"
[[package]]
name = "h2"
version = "0.3.24"
[[package]]
name = "tonic"
version = "0.10.2"
Platform
Linux ****** 6.6.13-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jan 20 18:03:28 UTC 2024 x86_64 GNU/Linux
and also
Linux ****** 6.5.11-300.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 8 22:37:57 UTC 2023 x86_64 GNU/Linux
Description
When the client issues a lot of concurrent requests, and the server has a limit on the number of concurrent requests (but no load shedding), both the client and the server eventually seem to just hang, with neither of them doing anything: there is no CPU usage, yet the server, if queried by a different client, still responds to the new requests from that client.
The exact conditions for reproducing this are hard to nail down, but the amount of data being transferred seems to matter: if I make the transferred messages very small, I cannot reproduce the hang.
During these attempts I also once witnessed this panic:
thread 'tokio-runtime-worker' panicked at 'assertion failed: self.window_size.0 >= sz as i32', /home/abizjak/.cargo/registry/src/index.crates.io-6f17d22bba15001f/h2-0.3.24/src/proto/streams/flow_control.rs:181:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
but unfortunately I was not running with RUST_BACKTRACE, and I have been unable to reproduce it again.
The server code is this:
use tonic::{transport::Server, Request, Response, Status};

use hello_world::greeter_server::{Greeter, GreeterServer};
use hello_world::{HelloReply, HelloRequest};

pub mod hello_world {
    tonic::include_proto!("helloworld");
}

#[derive(Default)]
pub struct MyGreeter {}

#[tonic::async_trait]
impl Greeter for MyGreeter {
    async fn say_hello(
        &self,
        request: Request<HelloRequest>,
    ) -> Result<Response<HelloReply>, Status> {
        let reply = hello_world::HelloReply {
            message: format!(
                "{:?}, {}",
                [0; 10000], // I have been unable to reproduce with size 1000 here.
                request.into_inner().name,
            ),
        };
        Ok(Response::new(reply))
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "[::1]:50051".parse().unwrap();
    let greeter = MyGreeter::default();

    Server::builder()
        .concurrency_limit_per_connection(100)
        .add_service(GreeterServer::new(greeter))
        .serve(addr)
        .await?;

    Ok(())
}
And the client code is:
use futures::stream::FuturesUnordered;
use futures::TryStreamExt;
use hello_world::greeter_client::GreeterClient;
use hello_world::HelloRequest;
use tracing_subscriber::filter::LevelFilter;

pub mod hello_world {
    tonic::include_proto!("helloworld");
}

// 300 -> Hang
// 150 -> Hang
// 100 -> Hang
// 50 -> No hang
const N: usize = 80;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // It seems a lot harder to reproduce with logging. I have been unable to do
    // so with trace logging. So consider commenting out.
    {
        use tracing_subscriber::prelude::*;
        let log_filter = tracing_subscriber::filter::Targets::new()
            .with_target("h2", LevelFilter::DEBUG)
            .with_target("hyper", LevelFilter::DEBUG);
        tracing_subscriber::registry()
            .with(tracing_subscriber::fmt::layer())
            .with(log_filter)
            .init();
    }

    let client = GreeterClient::connect("http://[::1]:50051").await?;

    let futs = FuturesUnordered::new();
    let mut message = String::with_capacity(N);
    for _ in 0..N {
        message.push('1');
    }

    // The goal is to make concurrent requests.
    for i in 0..100_000 {
        let request = tonic::Request::new(HelloRequest { name: message.clone() });
        let mut client = client.clone();
        futs.push(async move { client.say_hello(request).await.map(|r| (r, i)) });
    }

    futs.try_for_each_concurrent(None, move |(r, i)| async move {
        let msg = r.into_inner().message;
        let len = msg.len();
        eprintln!("{len}, {i}");
        Ok(())
    })
    .await?;

    Ok(())
}
I am attaching a complete Rust project with lock files that can be run as follows. Omitting --release also seems to reproduce it. I am attaching two things.
FYI, I have reported this to hyper, but I don't really know whether the issue is in hyper, tonic, or h2. It seems unlikely to be a tonic issue; the only reason I'm using tonic is that the problem originally arose in a gRPC server we were testing.
@dswij (on Discord), if I understand correctly, narrowed this issue down to the interaction between the concurrency limit and the maximum-concurrent-streams limit.
Yes. Setting a low concurrency_limit together with a high max_concurrent_streams in tonic will result in a deadlock. Here, it's because the concurrency limit is lower than the default HTTP/2 maximum number of concurrent streams.
Setting concurrency_limit_per_connection >= max_concurrent_streams should work around this issue for the moment, i.e.:
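For reference, a minimal sketch of what that could look like for the server in the reproduction above, assuming tonic's Server builder exposes max_concurrent_streams alongside concurrency_limit_per_connection; the value 100 simply mirrors the limit used in the repro:

// Workaround sketch (illustrative values): keep the HTTP/2 stream limit at or
// below the per-connection concurrency limit, so streams are never admitted
// faster than the concurrency limit allows them to be processed.
Server::builder()
    .concurrency_limit_per_connection(100)
    .max_concurrent_streams(100u32)
    .add_service(GreeterServer::new(greeter))
    .serve(addr)
    .await?;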