Support RelationSubquery
PPL
#775
Changes from 4 commits
64685fb
27c1faa
93ff020
dfc3eb0
1a4451a
863dfb8
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -247,17 +247,27 @@ mlArg | |
|
||
// clauses | ||
fromClause | ||
: SOURCE EQUAL tableSourceClause | ||
| INDEX EQUAL tableSourceClause | ||
: SOURCE EQUAL tableOrSubqueryClause | ||
| INDEX EQUAL tableOrSubqueryClause | ||
; | ||
|
||
tableOrSubqueryClause | ||
: LT_SQR_PRTHS subSearch RT_SQR_PRTHS (AS alias = qualifiedName)? | ||
| tableSourceClause | ||
; | ||
|
||
// One tableSourceClause will generate one Relation node with/without one alias | ||
// even if the relation contains more than one table sources. | ||
// These table sources in one relation will be read one by one in OpenSearch. | ||
Do we support this?
Yes. The current valid syntax is
Is it valid grammar in spark-sql? If not, does it confuse users?
PPL on OpenSearch supports it; it refers to multiple OpenSearch indices. Should we let the Catalog handle table name resolution? The OpenSearch catalog can resolve the table name as multiple indices properly. For instance,
Yes. It is valid grammar in opensearch-spark. For example
generates a Spark plan with Union.
Oh, that is the key difference: opensearch-spark can't handle it, since "tb1, tb2, tb3" in backticks will be handled as a whole, and a name containing a comma is invalid in Spark.
PPL on OpenSearch supports:
But PPL on Spark supports only the first two. I would suggest marking the third as invalid, since users usually treat the content in backticks as a whole. `accounts, account2` seems more specific to the OpenSearch domain. For the instance you provided above, my suggestion is to treat the content in backticks as a whole. @penghuo
Any different thoughts? I think it's worth opening a meta issue in the sql repo for further discussion if we can't reach alignment here. The context in a closed PR could easily be lost. |
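To make the backtick discussion above concrete, here is a minimal Python sketch of the behaviour suggested in this thread: commas separate table sources only outside backticks, and anything inside backticks is kept as one name. This is an illustration only, not the project's parser; the function name `split_table_sources` is hypothetical.

```python
from typing import List


def split_table_sources(clause: str) -> List[str]:
    """Split a comma-separated table list, keeping backtick-quoted names whole.

    Hypothetical helper for illustration; it is not part of opensearch-spark.
    """
    sources: List[str] = []
    current: List[str] = []
    in_backticks = False
    for ch in clause:
        if ch == "`":
            in_backticks = not in_backticks  # toggle the quoted state
            current.append(ch)
        elif ch == "," and not in_backticks:
            # A comma outside backticks ends the current table source.
            sources.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if in_backticks:
        raise ValueError("unterminated backtick in table source list")
    sources.append("".join(current).strip())
    return sources
```

Under this sketch, `tb1, tb2, tb3` splits into three sources, while the same text wrapped in one pair of backticks stays a single name, which, per the comment above, is not a valid name in Spark because of the commas.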
||
// But it may have different behaviors in different execution backends. | ||
// For example, a Spark UnresolvedRelation node only accepts one data source. | ||
tableSourceClause | ||
: tableSource (COMMA tableSource)* | ||
: tableSource (COMMA tableSource)* (AS alias = qualifiedName)? | ||
A table alias is useful in a query that contains a subquery, for example:
select a, (
  select sum(b)
  from catalog.schema.table1 as t1
  where t1.a = t2.a
) sum_b
from catalog.schema.table2 as t2
Thanks for the detailed review. Can you also add this explanation to the ppl-subquery-command doc?
Yes. I will give more examples in the doc. |
||
; | ||
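As a rough illustration of the comments above (one relation may list several table sources and an optional alias, while a Spark UnresolvedRelation accepts only one data source), the following Python sketch models expanding a multi-source relation into a union of single-source nodes. All class and function names are hypothetical stand-ins, not the classes used by opensearch-spark or Spark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Relation:
    """One tableSourceClause: one or more table names plus an optional alias."""
    sources: List[str]
    alias: Optional[str] = None


@dataclass
class SingleSource:
    """Stand-in for a backend node that accepts exactly one data source."""
    name: str


@dataclass
class UnionAll:
    """Stand-in for a union over several single-source nodes."""
    children: List[SingleSource]


@dataclass
class Aliased:
    """Wraps a plan node with the alias taken from the relation."""
    child: object
    alias: str


def lower_relation(rel: Relation):
    """Turn one multi-source relation into single-source nodes under a union."""
    if not rel.sources:
        raise ValueError("a relation needs at least one table source")
    children = [SingleSource(name) for name in rel.sources]
    # A single source needs no union; several sources are combined.
    plan = children[0] if len(children) == 1 else UnionAll(children)
    return Aliased(plan, rel.alias) if rel.alias else plan
```

With this model, `Relation(["t1", "t2"], alias="t")` lowers to an aliased union of two single-source nodes, and `Relation(["t1"])` lowers to one single-source node.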
|
||
// join | ||
joinCommand | ||
: (joinType) JOIN sideAlias joinHintList? joinCriteria? right = tableSource | ||
: (joinType) JOIN sideAlias joinHintList? joinCriteria? right = tableOrSubqueryClause | ||
; | ||
|
||
joinType | ||
|
@@ -279,13 +289,13 @@ joinCriteria | |
; | ||
|
||
joinHintList | ||
: hintPair (COMMA? hintPair)* | ||
; | ||
: hintPair (COMMA? hintPair)* | ||
; | ||
|
||
hintPair | ||
: leftHintKey = LEFT_HINT DOT ID EQUAL leftHintValue = ident #leftHint | ||
| rightHintKey = RIGHT_HINT DOT ID EQUAL rightHintValue = ident #rightHint | ||
; | ||
: leftHintKey = LEFT_HINT DOT ID EQUAL leftHintValue = ident #leftHint | ||
| rightHintKey = RIGHT_HINT DOT ID EQUAL rightHintValue = ident #rightHint | ||
; | ||
|
||
renameClasue | ||
: orignalField = wcFieldExpression AS renamedField = wcFieldExpression | ||
|
There are two example queries. One of them omits the SEARCH keyword. Keep the SEARCH keyword in this example query.