Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support some of pipe operators #1759

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

simonvandel
Copy link

Part of #1758

Still missing (ran out of time today, can be done in follow-ups perhaps)

  • join
  • union|intersect|except
  • call
  • tablesample
  • pivot
  • unpivot

I'm a first time contributor - please let me know if there are better places to store the PipeOperator in the AST

Part of apache#1758

Still missing
- join
- union|intersect|except
- call
- tablesample
- pivot
- unpivot
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @simonvandel! The changes look reasonable to me overall, left some comments

#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
#[cfg_attr(feature = "visitor", derive(Visit, VisitMut))]
pub enum PipeOperator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a description and link to the docs https://cloud.google.com/bigquery/docs/pipe-syntax-guide#basic_syntax

Select {
exprs: Vec<SelectItem>,
},
Extend {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to include a short example doc string for the variants and where possible links to the documentation, like here for extended for example https://cloud.google.com/bigquery/docs/pipe-syntax-guide#extend-pipe-operator

Comment on lines +10381 to +10386
// AGGREGATE <agg_expr> [[AS] alias], ...
//
// and
//
// AGGREGATE [<agg_expr> [[AS] alias], ...]
// GROUP BY <grouping_expr> [AS alias], ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so I think these comments would be better placed on in the doc strings for each variant (left similar comment at that site)


// Syntax from "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL"
// https://storage.googleapis.com/gweb-research2023-media/pubtools/1004848.pdf
while self.consume_token(&Token::VerticalBarRightAngleBracket) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move the logic to a dedicated function like parse_pipe_operators()?

Comment on lines +10326 to +10327
// Syntax from "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL"
// https://storage.googleapis.com/gweb-research2023-media/pubtools/1004848.pdf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Syntax from "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL"
// https://storage.googleapis.com/gweb-research2023-media/pubtools/1004848.pdf

/// |> limit 1
/// ```
///
/// See "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can link to a concrete syntax like this instead
https://cloud.google.com/bigquery/docs/pipe-syntax-guide#basic_syntax

The paper wouldn't really help as much in the context of someone working on the codebase or trying to use the library

let exprs = self.parse_comma_separated(Parser::parse_order_by_expr)?;
pipe_operators.push(PipeOperator::OrderBy { exprs })
}
_ => {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clause look unhandled, for the loop i imagine instead we can something like?

while self.consume_token(&Token::VerticalBarRightAngleBracket) {
    operators.push(self.parse_pipe_operator()?);
}
fn parse_pipe_operator() {
    if self.parse_keyword(SELECT) {
        // select operator
    } // ...
    else {
       self.expected("...")
    }
}

dialects.verified_stmt("SELECT * FROM users |> ORDER BY id ASC");
dialects.verified_stmt("SELECT * FROM users |> ORDER BY id DESC");
dialects.verified_stmt("SELECT * FROM users |> ORDER BY id DESC, name ASC");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we include a test case with more than one pipe operator? e.g. SELECT * FROM T |> a |> b |> c

@iffyio iffyio marked this pull request as draft March 18, 2025 06:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants