Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add subquery representation #134

Merged
merged 9 commits into from
Feb 6, 2022

Conversation

cpcloud
Copy link
Contributor

@cpcloud cpcloud commented Feb 2, 2022

This PR adds support for correlated subqueries to substrait.

Notes:

  1. It'll be easier to review commit by commit, since I had to combine
    expressions and relations together to get the protos to compile.
  2. While there are actually only three kinds of subqueries (scalar, EXISTS and ANY),
    we chose to keep a few of the higher level representations so that information about
    the original intention is preserved.

Closes #119.

cc @pdet @Mytherin, would love your thoughts here!

@cpcloud cpcloud force-pushed the correlated-subqueries branch from 530f4c2 to 28e87f2 Compare February 2, 2022 18:02
@cpcloud cpcloud requested a review from jacques-n February 3, 2022 18:53
Copy link
Contributor

@jacques-n jacques-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good. Interested to see if others agree.

proto/substrait/relations.proto Outdated Show resolved Hide resolved
proto/substrait/relations.proto Outdated Show resolved Hide resolved
@jacques-n
Copy link
Contributor

This looks good to me but I'd like to get some more eyes on it.

@jacques-n
Copy link
Contributor

I lied... we need to also add the docs part of this.

Once people agree with the shape of this, let's try to update the appropriate parts of the expressions/etc sections.

Mytherin
Mytherin previously approved these changes Feb 5, 2022
Copy link

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as a way of representing unflattened subqueries. The only potential missing feature I see here is support for the ARRAYsubquery, e.g. in Postgres:

select array(select * from generate_series(0,5,1));
--  {0,1,2,3,4,5}

@cpcloud
Copy link
Contributor Author

cpcloud commented Feb 5, 2022

Looks good to me as a way of representing unflattened subqueries. The only potential missing feature I see here is support for the ARRAYsubquery, e.g. in Postgres:

select array(select * from generate_series(0,5,1));
--  {0,1,2,3,4,5}

Thanks!

Yeah, functions that accept subqueries (or sets of rows) would be an interesting follow up.

This would likely also need work in the type system.

@cpcloud
Copy link
Contributor Author

cpcloud commented Feb 6, 2022

I will add some documentation to the spec later today about subqueries.

@cpcloud cpcloud force-pushed the correlated-subqueries branch from 7c476a1 to ae9300e Compare February 6, 2022 16:08
@cpcloud cpcloud changed the title feat: add correlated subquery representation feat: add subquery representation Feb 6, 2022
@cpcloud cpcloud added the enhancement New feature or request label Feb 6, 2022
Copy link
Contributor

@jacques-n jacques-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this!

@jacques-n jacques-n merged commit 3670518 into substrait-io:main Feb 6, 2022
@cpcloud cpcloud deleted the correlated-subqueries branch April 28, 2022 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce the ability to express a correlated subquery
3 participants