fix: sqllogic test hangs (cluster mod + clickhouse handler) #9615
I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/
Summary
To keep sqllogic tests from hanging in cluster mode, the clickhouse handler now pulls the result-set stream inside the "query-ctx" thread.
How to reproduce: see the summary of #9576.
While executing queries:
1. Pulling the `SendableDataBlockStream` calls `PipelinePullingExecutor::pull_data` (for `PipelinePullingExecutor`): https://github.com/datafuselabs/databend/blob/3b45a672222b5bc928810dd0b3a64803d18a590d/src/query/service/src/pipelines/executor/pipeline_pulling_executor.rs#L176-L185
   Note that the `receive` here is NOT async; if no data is available, the runtime thread may be trapped in this loop.
2. `FlightExchange` also uses "tokio-runtime-worker" threads to pull data from flight rpc and forward it downstream: https://github.com/datafuselabs/databend/blob/750852820100579243172a507d8d6e455080a768/src/query/service/src/api/rpc/flight_client.rs#L131-L147
3. If the rt threads are trapped in 1, waiting for data from `FlightExchange`, while the async tasks of `FlightExchange` are waiting for an rt thread to drive them, execution hangs.

NOTE: it seems that not all threads named `tokio-runtime-worker` are trapped in 1 while the query hangs, but a) I am not sure whether other ad-hoc tokio runtimes with default thread names were present, and b) I do not know whether tokio's work-stealing scheduler helps here.

To avoid the hang, the clickhouse handler in this PR uses the `query-ctx` thread to pull the data.

Closes #9576
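
For illustration only, here is a minimal std-only sketch of the idea behind the fix: run the blocking pull loop on a dedicated thread named "query-ctx" so that it never occupies a runtime worker. `Block`, `BlockingPuller`, and `drain_on_query_ctx` are hypothetical stand-ins invented for this sketch, not Databend types or the actual code in this PR, and a plain `std::sync::mpsc` channel plays the role of the exchange.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for a data block.
type Block = Vec<u8>;

/// Hypothetical stand-in for a pulling executor whose receive is blocking.
struct BlockingPuller {
    rx: Receiver<Block>,
}

impl BlockingPuller {
    /// Blocks the calling thread until a block arrives.
    /// Returns None once the producer side has been dropped.
    fn pull_data(&mut self) -> Option<Block> {
        self.rx.recv().ok()
    }
}

/// Drains the puller on a dedicated thread named "query-ctx", so the blocking
/// loop does not tie up a thread that other tasks need in order to progress.
/// Returns the thread name (for inspection) together with the pulled blocks.
fn drain_on_query_ctx(
    mut puller: BlockingPuller,
) -> std::io::Result<JoinHandle<(String, Vec<Block>)>> {
    thread::Builder::new()
        .name("query-ctx".to_string())
        .spawn(move || {
            let mut blocks = Vec::new();
            while let Some(block) = puller.pull_data() {
                blocks.push(block);
            }
            let name = thread::current().name().unwrap_or("unnamed").to_string();
            (name, blocks)
        })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let handle = drain_on_query_ctx(BlockingPuller { rx }).expect("failed to spawn thread");

    // A producer thread plays the role of the upstream exchange.
    let producer = thread::spawn(move || {
        for i in 0..3u8 {
            tx.send(vec![i]).expect("receiver dropped");
        }
        // `tx` is dropped here, which ends the stream.
    });

    producer.join().expect("producer panicked");
    let (name, blocks) = handle.join().expect("puller panicked");
    println!("{} pulled {} blocks", name, blocks.len());
}
```

In the real handler the result would be handed back to the async side rather than joined synchronously; the sketch only shows that the blocking loop lives on its own named thread.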