BigQuery Interpreter for Apazhe Zeppelin[ZEPPELIN-1153] #1170
Conversation
String projId = getProperty(PROJECT_ID);
long wTime = Long.parseLong(getProperty(WAIT_TIME));
Iterator<GetQueryResultsResponse> pages = run(sql, projId, wTime);
while (pages.hasNext()) {
Usually we want to handle errors here as well (e.g. bad SQL syntax or network issues) and set the paragraph status to Error by returning InterpreterResult(Code.ERROR) in such cases.
For example, looking at the code below, it seems that pages can be null in case of connectivity issues, which would result in an NPE instead.
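The error handling requested above could look roughly like the following. This is a minimal, self-contained sketch and not the interpreter's actual code: `SafeInterpretSketch`, `Result`, `Code`, and `QueryRunner` are hypothetical stand-ins (for Zeppelin's `InterpreterResult` and for the `run(sql, projId, wTime)` call in the diff), and pages are modelled as plain strings so the example compiles without the Zeppelin or BigQuery libraries.

```java
import java.util.Iterator;

public class SafeInterpretSketch {
    // Hypothetical stand-in for the result codes of Zeppelin's InterpreterResult.
    public enum Code { SUCCESS, ERROR }

    // Hypothetical stand-in for InterpreterResult: a status code plus a message.
    public static final class Result {
        public final Code code;
        public final String message;

        public Result(Code code, String message) {
            this.code = code;
            this.message = message;
        }
    }

    // Hypothetical stand-in for run(sql, projId, wTime); pages are plain strings here.
    public interface QueryRunner {
        Iterator<String> run(String sql) throws Exception;
    }

    public static Result interpret(String sql, QueryRunner runner) {
        Iterator<String> pages;
        try {
            pages = runner.run(sql);
        } catch (Exception e) {
            // Bad SQL or network failure: report it to the user instead of crashing.
            return new Result(Code.ERROR, String.valueOf(e.getMessage()));
        }
        if (pages == null) {
            // Guard against the NPE mentioned in the review comment.
            return new Result(Code.ERROR, "No results returned; check connectivity.");
        }
        StringBuilder out = new StringBuilder();
        while (pages.hasNext()) {
            out.append(pages.next()).append('\n');
        }
        return new Result(Code.SUCCESS, out.toString());
    }
}
```

With this shape, a failing query surfaces its reason in the paragraph output rather than only in the logs, and a null iterator is turned into an error result before the `while` loop is reached.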
|
Great contribution, thank you! A couple of questions on top of the few already raised above:
Feel free to ask here if something is not clear or if you have any further questions!
 * </p>
 *
 */
The lines above belong in bigquery.md, as @bzz said in #1170 (comment). You can refer to the many existing docs here. It would be helpful to users who want to try BigQuery in Zeppelin.
|
Thanks! I will work on incorporating the feedback. |
}
}
}
]
You can add the above property list to bigquery.md as a configuration table like this.
|
Hi,
Thanks! |
<li><a href="{{BASE_PATH}}/interpreter/scalding.html">Scalding</a></li>
<li><a href="{{BASE_PATH}}/interpreter/shell.html">Shell</a></li>
<li><a href="{{BASE_PATH}}/interpreter/spark.html">Spark</a></li>
<li><a href="{{BASE_PATH}}/interpreter/bigquery.html">BigQuery</a></li>
As you can see, the lists are in alphabetical order. You need to put bigquery.html between alluxio.html and cassandra.html.
Fixed this
|
@babupe thank you for taking care. @AhyoungRyu thank you for review. Please let me do another pass on it and get back to you. |
 * Copyright 2016 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
I would strongly advise keeping the original text of the license header here and in other files: "See the NOTICE file distributed with this work for additional information regarding copyright ownership."
So the copyright notice needs to be moved to the root NOTICE file, with a reference to the ./bigquery/ module.
So, should the Google copyright be completely removed here?
Where should I include the Google copyright then? Can you share an example?
|
@babupe great job, I have highlighted a few issues inline in the code above. Two last things:
|
|
https://github.com/GoogleCloudPlatform/gcloud-java/blob/master/TESTING.md#testing-code-that-uses-bigquery |
|
Please clarify regarding the LICENSE file. |
|
+1 for technical docs |
The Interpreter opens a connection with the BigQuery Service using the supplied Google project ID and the compute environment variables.

# Google BigQuery API Javadoc
[API Javadocs](https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/java/latest/)
This is the Javadoc for the artifact
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>v2-rev265-1.21.0</version>
right?
AFAIK it's an open-source library, so would you be so kind as to add a link to its source code here, please? This could help future maintainers keep up with changes, etc.
Got it. These packages are licensed under Apache 2.0. I have asked around to see if the code is publicly available.
Any updates on this one?
|
Made the changes and pushed them. Hope it's good now.
|
Great stuff! Thank you for taking care. Will merge this contribution to master and branch-0.6, if there is no further discussion. |
Final updates before merging bigquery interpreter
|
Awesome! Merged this. |
|
CI build failed due to networking issues which are not related. |
|
While running locally I got:
@babupe do you think it would make sense to expose this through interpreter configuration properties, same as
One more thing: in this case the status of the paragraph is ERROR, but there is no output, only logs. It would be great to be able to notify the user about the reason for the failure.
In any case, we need to document this properly so that people know there are prerequisites before using this interpreter. I will submit the PR with docs soon, but it would be great if you could take a look at error propagation.
|
Added babupe#2
After configuring credentials I got strange behavior: query
Which is a bit of a frustrating experience, since it is
add docs for BigQuery auth outside of GCE
|
Sorry about the delay. Merged it now. Thank you so much for documenting this! |
|
Thank you! Sounds awesome to me. CI failed due to networking issue, again.. |
|
I have now pushed the change to capture BigQuery exceptions from bad statements and show them both in the logs and in the interpreter output. Will look into the credentials.
|
Do you think the external auth could be a blocker for the initial merge? |
|
Got it, makes perfect sense! Thank you for updating the error handling, and the plan to address external auth configuration later sounds good.
The CI failure is still on Travis, which is not able to reach network resources:
Looks great to me, merging to master and branch-0.6 if there is no further discussion!
### What is this PR for?
Google BigQuery is a popular no-ops datawarehouse. This commit will enable Apache Zeppelin users to perform BI and Analytics on their datasets in BigQuery.

### What type of PR is it?
Feature

### Todos
* Make bigquery interpreter appear in the interpreters section in the UI
* Build SQL completion
* Authorization of non-gcp

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1153

### How should this be tested?
copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml
Add org.apache.zeppelin.bigquery.bigQueryInterpreter to property zeppelin.interpreters in zeppelin-site.xml
Start Zeppelin
Add BigQuery Interpreter with your project ID
Create new note with %bsql.sql and run your SQL against public datasets in bigquery.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Babu Prasad Elumalai <babupe@google.com>
Author: babupe <babupe@google.com>
Author: Alexander Bezzubov <bzz@apache.org>

Closes #1170 from babupe/babupe-bigquery and squashes the following commits:

ffed801 [Babu Prasad Elumalai] pushing BQ Exception to logs and Interpreter error output
d3c2316 [babupe] Merge pull request #2 from bzz/babupe-add-auth-docs
64525b8 [Alexander Bezzubov] Fix typos in docs
03a777f [Alexander Bezzubov] add docs for BigQuery auth outside of GCE
fcab6b7 [babupe] Merge pull request #1 from bzz/babupe-final
6a95333 [Alexander Bezzubov] Rename Apach2.0 license for google's code to adhere naming conventions
7d4f40b [Alexander Bezzubov] Add exidentaly removed licenses due to merge conflict
3be1912 [Babu Prasad Elumalai] New changes
41e076e [Babu Prasad Elumalai] Fixed formatting with readme file
97874a4 [Babu Prasad Elumalai] Pushing cropped screenshots
64affbb [babupe] Added cropped interpreter screenshot
4a1d29c [Babu Prasad Elumalai] Removed unnecessary dependencies in pom.xml
e520b7b [Babu Prasad Elumalai] Exclude constants.json file for rat plugin since its static config file
69cb724 [Babu Prasad Elumalai] Fixed license header and added manual unit test documentation
bbf26cc [Babu Prasad Elumalai] Added path and specific wording
4a3153f [Babu Prasad Elumalai] removed bad package from import
d0c8e01 [Babu Prasad Elumalai] Added technical description to bigquery.md
b6d181c [Babu Prasad Elumalai] Trying to add screenshot in README
569757f [Babu Prasad Elumalai] Incorporated feedback
764385c [Babu Prasad Elumalai] Interpreter modification, License, doc changes
d85abd2 [Babu Prasad Elumalai] Modified code and license
17f6d89 [Babu Prasad Elumalai] ZEPPELIN-1153 comments committed
8fa647b [Babu Prasad Elumalai] BigQuery Interpreter for Apazhe Zeppelin
22e3487 [babupe] Update LICENSE
e88b017 [babupe] Created a new license file
d90e10f [babupe] Removed BigQuery from notice
aa52553 [Babu Prasad Elumalai] Merge branch 'master' of https://github.com/apache/zeppelin
ae096d2 [Babu Prasad Elumalai] License changes
20962d2 [Babu Prasad Elumalai] Pushing license changes
3d5f8e7 [Babu Prasad Elumalai] Modified license header
5a2e674 [Babu Prasad Elumalai] Added license info for Jackson library and added BQ API source
4db74c1 [Babu Prasad Elumalai] Adding license stuff
31c373f [Babu Prasad Elumalai] Fixed formatting with readme file
287744c [Babu Prasad Elumalai] Merge branch 'babupe-bigquery' of https://github.com/babupe/zeppelin into babupe-bigquery
f318b20 [Babu Prasad Elumalai] Pushing cropped screenshots
17fd4e8 [babupe] Added cropped interpreter screenshot
f872aa0 [Babu Prasad Elumalai] Removed unnecessary dependencies in pom.xml
5983e36 [Babu Prasad Elumalai] Exclude constants.json file for rat plugin since its static config file
11e88dc [Babu Prasad Elumalai] Replaced license header with formatting
4b82abd [Babu Prasad Elumalai] Fixed license header and added manual unit test documentation
87f5efe [Babu Prasad Elumalai] Added path and specific wording
6132d78 [Babu Prasad Elumalai] Fixing License and skipping failing tests
2254a49 [Babu Prasad Elumalai] removed bad package from import
73e3f6d [Babu Prasad Elumalai] Added technical description to bigquery.md
089820b [Babu Prasad Elumalai] Trying to add screenshot in README
a00b48e [Babu Prasad Elumalai] Incorporated feedback
17846f1 [Babu Prasad Elumalai] Interpreter modification, License, doc changes
50c41fc [Babu Prasad Elumalai] Modified code and license
75d8ee6 [Babu Prasad Elumalai] ZEPPELIN-1153 comments committed
2a2bedc [Babu Prasad Elumalai] BigQuery Interpreter for Apazhe Zeppelin

(cherry picked from commit 57c264d)
Signed-off-by: Alexander Bezzubov <bzz@apache.org>


What is this PR for?
Google BigQuery is a popular no-ops data warehouse. This commit enables Apache Zeppelin users to perform BI and analytics on their datasets in BigQuery.
What type of PR is it?
Feature
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1153
How should this be tested?
Copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml
Add org.apache.zeppelin.bigquery.bigQueryInterpreter to the property zeppelin.interpreters in zeppelin-site.xml
Start Zeppelin
Add the BigQuery Interpreter with your project ID
Create a new note with %bsql.sql and run your SQL against public datasets in BigQuery.
Screenshots (if appropriate)
Questions: