cross joins #2994

Open
universalmind303 opened this issue Oct 3, 2024 · 3 comments
Labels
p1 Important to tackle soon, but preemptable by p0

Comments

@universalmind303
Contributor

Is your feature request related to a problem? Please describe.
I want to perform cross joins using daft

Describe the solution you'd like
df1.join(df2, how='cross')

Describe alternatives you've considered
df1.join(df2, on=lit(1))
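
To make the requested semantics concrete, here is a minimal pure-Python sketch of what a cross join produces. It does not use Daft; `cross_join` and the sample rows are hypothetical and only illustrate the expected output shape (every left row paired with every right row), assuming the two inputs have distinct column names.

```python
from itertools import product


def cross_join(left, right):
    """Return every pairing of a left row with a right row (Cartesian product)."""
    # Rows are plain dicts; merging assumes the two inputs share no column names.
    return [{**l, **r} for l, r in product(left, right)]


left = [{"text": "a"}, {"text": "b"}]
right = [{"name": "a"}, {"name": "c"}, {"name": "d"}]

rows = cross_join(left, right)
print(len(rows))  # 6, i.e. len(left) * len(right)
```

The output size is always the product of the input sizes, which is why an optimizer would want to avoid materializing it when a later filter discards most rows.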

@universalmind303
Contributor Author

universalmind303 commented Oct 3, 2024

additionally,

a cross join followed by a filter comparing columns between the two inputs should be optimized into an inner join

example:

df1.join(df2, how='cross').where(df1['text'] == df2['name'])

this can be optimized to
df1.join(df2, left_on=col('text'), right_on=col('name'), how='inner')
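
The equivalence behind this rewrite can be checked with a small pure-Python sketch. It is not Daft code: `cross_then_filter` and `hash_inner_join` are hypothetical helpers, the rows are made up, and null handling is ignored (Python treats `None == None` as true, which differs from SQL-style join semantics).

```python
from collections import defaultdict
from itertools import product


def cross_then_filter(left, right, left_key, right_key):
    """Build the full Cartesian product, then keep pairs whose keys match."""
    return [
        {**l, **r}
        for l, r in product(left, right)
        if l[left_key] == r[right_key]
    ]


def hash_inner_join(left, right, left_key, right_key):
    """Inner join via a hash index on the right side; no full product is built."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


left = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}, {"id": 3, "text": "a"}]
right = [
    {"name": "a", "score": 10},
    {"name": "c", "score": 20},
    {"name": "a", "score": 30},
]

slow = cross_then_filter(left, right, "text", "name")
fast = hash_inner_join(left, right, "text", "name")
print(slow == fast)
```

Both functions return the same rows, but the first examines `len(left) * len(right)` pairs while the second only touches matching ones, which is the motivation for the optimization.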

@universalmind303 added the p0 (Priority 0 - to be addressed immediately) and p1 (Important to tackle soon, but preemptable by p0) labels and removed the p0 label on Oct 4, 2024
@universalmind303
Contributor Author

For reference, DataFusion has an eliminate_cross_join rule that rewrites cross joins to inner joins where possible:

https://github.com/GlareDB/arrow-datafusion/blob/20b298e9d82e483e28087e595c409a8cc04872f3/datafusion/optimizer/src/eliminate_cross_join.rs#L44
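
As a rough illustration of what such a rule does, here is a toy plan rewrite in Python. This is neither DataFusion's nor Daft's implementation: the `Scan`, `CrossJoin`, `InnerJoin`, and `Filter` node types and the `eliminate_cross_join` function are hypothetical, the filter is restricted to a single column-equals-column predicate, and column names are assumed to be unique across both inputs.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    name: str
    columns: tuple


@dataclass(frozen=True)
class CrossJoin:
    left: "Plan"
    right: "Plan"


@dataclass(frozen=True)
class InnerJoin:
    left: "Plan"
    right: "Plan"
    left_on: str
    right_on: str


@dataclass(frozen=True)
class Filter:
    input: "Plan"
    lhs: str  # predicate is `lhs == rhs`, both column names
    rhs: str


Plan = Union[Scan, CrossJoin, InnerJoin, Filter]


def output_columns(plan):
    """Collect the column names a plan node produces."""
    if isinstance(plan, Scan):
        return set(plan.columns)
    if isinstance(plan, Filter):
        return output_columns(plan.input)
    return output_columns(plan.left) | output_columns(plan.right)


def eliminate_cross_join(plan):
    """Rewrite Filter(CrossJoin) into InnerJoin when the predicate spans both sides."""
    if isinstance(plan, Filter) and isinstance(plan.input, CrossJoin):
        join = plan.input
        left_cols = output_columns(join.left)
        right_cols = output_columns(join.right)
        if plan.lhs in left_cols and plan.rhs in right_cols:
            return InnerJoin(join.left, join.right, plan.lhs, plan.rhs)
        if plan.lhs in right_cols and plan.rhs in left_cols:
            return InnerJoin(join.left, join.right, plan.rhs, plan.lhs)
    # Predicate does not reference both inputs: leave the plan unchanged.
    return plan


df1 = Scan("df1", ("text",))
df2 = Scan("df2", ("name",))
optimized = eliminate_cross_join(Filter(CrossJoin(df1, df2), "text", "name"))
print(optimized)
```

A real rule would also need to handle conjunctions of predicates, nested joins, and residual filters that must stay above the join; this sketch covers only the single-predicate case from the example in this issue.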

@universalmind303
Contributor Author

created #3095 for the optimizer rule.
