[SPARK-16475][SQL] Broadcast Hint for SQL Queries #14132
|
cc @rxin and @hvanhovell . |
|
If the direction is right, I can move on adding |
Do we need this because the BRACKETED_COMMENT rule is now expecting at least one character?
Yes. Could you give me some workaround advice?
My advice would be to add the HINT_PREFIX rule ('/*+')
Can you try to add this as a case (| '/**/' -> channel(HIDDEN)) to the BRACKETED_COMMENT rule?
Yep.
Oops. It seems we cannot do that because of channel(HIDDEN). ANTLR reports:
->command in lexer rule BRACKETED_COMMENT must be last element of single outermost alt
Ok nvm....
|
This looks pretty good! |
|
Thank you for the quick review! I'll let you know after updating. |
|
Test build #62082 has finished for PR 14132 at commit
|
|
Oops. Five errors, too. I'll fix these tomorrow. |
What are we trying to support here?
I think you can also do this (it is easier to handle in the AST builder): | hintName=identifier '(' parameters+=identifier parameters+=identifier ')'
It supports 'INDEX(t idx_emp)' style. For example, I included one in the test.
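To make the two-parameter form discussed above concrete, here is a minimal Python sketch that splits a hint such as INDEX(t idx_emp) into a name and a parameter list. It is only an illustration and not the PR's ANTLR grammar: `ParsedHint` and `parse_hint` are hypothetical names, and accepting commas as well as whitespace between parameters is an assumption.

```python
import re
from typing import List, NamedTuple, Optional


class ParsedHint(NamedTuple):
    """Hypothetical container for a parsed hint; not a Spark class."""
    name: str
    parameters: List[str]


# NAME or NAME(p1 p2 ...): one identifier, then an optional parenthesized list.
_HINT = re.compile(r"\s*(\w+)\s*(?:\(([^()]*)\))?\s*$")


def parse_hint(text: str) -> Optional[ParsedHint]:
    """Parse a single hint; return None if the text does not look like one."""
    m = _HINT.match(text)
    if m is None:
        return None
    name, raw = m.group(1), m.group(2)
    # Assumption: parameters may be separated by whitespace and/or commas.
    params = [p for p in re.split(r"[,\s]+", raw.strip()) if p] if raw else []
    return ParsedHint(name, params)


print(parse_hint("MAPJOIN(t)"))
print(parse_hint("INDEX(t idx_emp)"))
```

Keeping every parameter in one flat list mirrors the `parameters+=identifier` idea in the suggested rule, so the consumer does not need a special case for the two-parameter form.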
|
Test build #62162 has finished for PR 14132 at commit
|
|
For HINT_PREFIX, I tried it first but still have some problems, so I couldn't include it in the previous commit. |
|
Thank you as always, @hvanhovell . And sorry for the delay. Since last Saturday I have had to take care of some important personal matters offline, so my time is limited. :( |
|
I'm back. I'll resolve them. |
|
So far, I haven't been able to apply the following two pieces of advice. Just letting you know that I'm still working on them. :)
|
|
Now, only minor |
|
Hi, @hvanhovell . So far, I have tried the following approach for the hint rule:
- : '/*+' hintStatement '*/'
+ : HINT_PREFIX hintStatement '*/'
;
hintStatement
@@ -961,12 +961,12 @@ SIMPLE_COMMENT
: '--' ~[\r\n]* '\r'? '\n'? -> channel(HIDDEN)
;
-BRACKETED_EMPTY_COMMENT
- : '/**/' -> channel(HIDDEN)
+HINT_PREFIX
+ : '/*+'
;
BRACKETED_COMMENT
- : '/*' ~[+] .*? '*/' -> channel(HIDDEN)
+ : '/*' .*? '*/' -> channel(HIDDEN) <--- The original one.
; |
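The effect of the lexer rules quoted in this diff can be checked with a rough Python emulation. The regexes below are stand-ins for the ANTLR rules, not ANTLR itself, and the constant and function names are hypothetical. The sketch shows why the `~[+]` variant of BRACKETED_COMMENT no longer matches `/**/` (the question raised earlier in this thread), and that the original rule also matches text starting with `/*+`.

```python
import re

# Regex stand-ins for the lexer rules in the diff (an emulation only).
ORIGINAL_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)      # '/*' .*? '*/'
NON_HINT_COMMENT = re.compile(r"/\*[^+].*?\*/", re.DOTALL)  # '/*' ~[+] .*? '*/'
EMPTY_COMMENT = re.compile(r"/\*\*/")                       # '/**/'
HINT_PREFIX = re.compile(r"/\*\+")                          # '/*+'


def matched_text(pattern, text):
    """Return the prefix of `text` that `pattern` matches, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


for sample in ["/**/", "/* note */", "/*+ MAPJOIN(t) */"]:
    print(repr(sample))
    print("  original comment rule:", matched_text(ORIGINAL_COMMENT, sample))
    print("  ~[+] comment rule:    ", matched_text(NON_HINT_COMMENT, sample))
    print("  empty comment rule:   ", matched_text(EMPTY_COMMENT, sample))
    print("  hint prefix rule:     ", matched_text(HINT_PREFIX, sample))
```

In this emulation, `~[+]` consumes the second `*` of `/**/`, leaving no `*/` to close the comment, which is consistent with the need for a separate BRACKETED_EMPTY_COMMENT rule. The original rule matches the whole of `/*+ MAPJOIN(t) */` while HINT_PREFIX matches only its first three characters. Whether that overlap is what caused the problems mentioned above is not stated in the thread.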
|
Test build #62196 has finished for PR 14132 at commit
|
|
Test build #62198 has finished for PR 14132 at commit
|
|
Hi, @rxin . |
|
Hi, @hvanhovell . |
|
@dongjoon-hyun sure. It was merely a suggestion to get rid of the |
|
Thank you, @hvanhovell . |
|
PR description is updated. |
…d more description.
|
Oh, the one failure is due to a new MAPJOIN test case in the master branch (which I added yesterday). |
|
Test build #62975 has finished for PR 14132 at commit
|
|
Hi, @cloud-fan . |
|
This PR has grown too large, and scrolling is sometimes too slow. I am closing this one and opening a new one, #14426 .
What changes were proposed in this pull request?
This PR aims to achieve the following two goals in Spark SQL.
1. Generic Hint Syntax
The generic hints are parsed and transformed into concrete hints by SubstituteHints of Analyzer. Unknown hints are removed, too. For example, Hint("MAPJOIN") is transformed into BroadcastJoin, and other hints are currently removed. Unlike Hive, NEWMAPJOIN(t) is allowed, so that new Spark hints can be accepted.
2. Broadcast Hints
The following are recognized. Technically, broadcast hints are matched against UnresolvedRelation to support HiveMetastoreRelation. The database_name.table_name style is not allowed in this PR.
Examples
How was this patch tested?
Pass the Jenkins tests with new testcases.
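The hint substitution described in the PR description can be sketched on a toy plan tree. This is a minimal Python model and not Spark's Scala implementation: every class and function name here (`Relation`, `Join`, `Hint`, `BroadcastMarker`, `substitute_hints`, `mark_broadcast`) is hypothetical, and treating MAPJOIN as the only broadcast hint name is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Join:
    left: "Plan"
    right: "Plan"


@dataclass(frozen=True)
class Hint:
    name: str
    parameters: Tuple[str, ...]
    child: "Plan"


@dataclass(frozen=True)
class BroadcastMarker:
    """Hypothetical stand-in for the concrete broadcast hint node."""
    child: "Plan"


Plan = Union[Relation, Join, Hint, BroadcastMarker]

BROADCAST_HINT_NAMES = {"MAPJOIN"}  # assumption made for this sketch


def mark_broadcast(plan, tables):
    """Wrap every relation whose name is listed in `tables`."""
    if isinstance(plan, Relation):
        return BroadcastMarker(plan) if plan.name in tables else plan
    if isinstance(plan, Join):
        return Join(mark_broadcast(plan.left, tables),
                    mark_broadcast(plan.right, tables))
    return plan


def substitute_hints(plan):
    """Turn known hints into markers and drop unknown hints."""
    if isinstance(plan, Hint):
        child = substitute_hints(plan.child)
        if plan.name.upper() in BROADCAST_HINT_NAMES:
            return mark_broadcast(child, set(plan.parameters))
        return child  # unknown hint: removed, its child is kept
    if isinstance(plan, Join):
        return Join(substitute_hints(plan.left), substitute_hints(plan.right))
    return plan


hinted = Hint("MAPJOIN", ("t",), Join(Relation("t"), Relation("u")))
print(substitute_hints(hinted))
print(substitute_hints(Hint("NEWMAPJOIN", ("t",), Relation("t"))))
```

Matching by plain relation name mirrors the note that hints are matched against unresolved relations. Qualified names in the database_name.table_name style are deliberately not handled, as in the PR description.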