Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query hangs on collecting stream from recursive CTE #9680

Closed
korowa opened this issue Mar 18, 2024 · 4 comments · Fixed by #9687
Closed

Query hangs on collecting stream from recursive CTE #9680

korowa opened this issue Mar 18, 2024 · 4 comments · Fixed by #9687
Assignees
Labels
bug Something isn't working

Comments

@korowa
Copy link
Contributor

korowa commented Mar 18, 2024

Describe the bug

Query with recursive CTE hangs while trying to collect stream originated from WorkTableExec -- this might occur when recursive part contains HashJoin/CrossJoin/NestedLoopJoin, and WorkTableExec should be collected in memory as a join build side.

To Reproduce

Reproducer based on this test query:

WITH RECURSIVE "recursive_cte" AS (
    SELECT 1 as "val"
  UNION ALL (
    WITH "sub_cte" AS (
      SELECT 2 as "val"
    )
    SELECT
      2 as "val"
    FROM "recursive_cte"
      CROSS JOIN "sub_cte"
    WHERE "recursive_cte"."val" < 2
  )
)
SELECT * FROM "recursive_cte";

Produces following plan, and is unable to complete, constantly executing CrossJoinExec

physical_plan
RecursiveQueryExec: name=recursive_cte, is_distinct=false
--ProjectionExec: expr=[1 as val]
----PlaceholderRowExec
--ProjectionExec: expr=[2 as val]
----CrossJoinExec
------CoalescePartitionsExec
--------CoalesceBatchesExec: target_batch_size=8182
----------FilterExec: val@0 < 2
------------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
--------------WorkTableExec: name=recursive_cte
------ProjectionExec: expr=[2 as val]
--------PlaceholderRowExec

In the same time, identical query with swapped inputs works as expected

WITH RECURSIVE "recursive_cte" AS (
    SELECT 1 as "val"
  UNION ALL (
    WITH "sub_cte" AS (
      SELECT 2 as "val"
    )
    SELECT
      2 as "val"
    FROM "sub_cte"
      CROSS JOIN "recursive_cte"
    WHERE "recursive_cte"."val" < 2
  )
)
SELECT * FROM "recursive_cte";

Expected behavior

Ideally, recursive CTE should be able to handle collect() stream, created by WorkTableExec.

Another option might be tracking calls to CTE temp table, and throw runtime error in case their number increases some preconfigured threshold value.

Third option is to forbid recursive CTE temp tables to be used as build-side join inputs (not preferrable as it looks like too much unnecessary optimizer(s) modifications, and not reliable -- same issue might occur in case of aggregations/sorts/maybe some other "blocking" operators).

Additional context

No response

@korowa korowa added the bug Something isn't working label Mar 18, 2024
@alamb
Copy link
Contributor

alamb commented Mar 18, 2024

FYI @jonahgao and @matthewgapp

@jonahgao
Copy link
Member

take

@jonahgao
Copy link
Member

I plan to investigate and try to fix this.

@jonahgao
Copy link
Member

It seems that this is caused by OnceAsync, which makes the left data build only once during the entire query. It should be built every iteration.
https://github.com/apache/arrow-datafusion/blob/b0b329ba39403b9e87156d6f9b8c5464dc6d2480/datafusion/physical-plan/src/joins/cross_join.rs#L61-L62

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants