Skip to content

[FEATURE] Speedup Join with option max=n by converting to TopHits Aggregation #4927

@LantaoJin

Description

@LantaoJin

Is your feature request related to a problem?
Is your feature request related to a problem?
Currently, the join without option max=n will generate a SortMergeJoin operator and the Sort can be pushed down to each sides.

But for join with option max=n, the logical plan contains a window function in right side, we cannot push down the Sort to DSL.

What solution would you like?
This issue will pushdown the right side (with option max=n) to a TopHits aggregation which could highly speedup the join processing, even convert the whole join to a HashJoin.

What alternatives have you considered?
A clear and concise description of any alternative solutions or features you've considered.

Do you have any additional context?
#4844

Metadata

Metadata

Assignees

Labels

enhancementNew feature or requestpushdownpushdown related issues

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions