ZEPPELIN-1115: Python - interpreter for SQL over DataFrame #1164

bzz · 2016-07-11T14:46:38Z

What is this PR for?

Add a new interpreter to the Python group, %python.sql, to support SQL queries over DataFrames

What type of PR is it?

Improvement

TODOs

- add new interpreter %python.sql
- add test
- exclude Python-dependent tests from CI
  - PythonInterpreterWithPythonInstalledTest
  - PythonPandasSqlInterpreterTest
  - run manually by mvn -Dpython.test.exclude='' test -pl python -am
- add docs for %python.sql
- make %python.sql fail gracefully in case there is no Pandas or PandaSQL installed
- after [ZEPPELIN-605] Add support for Scala 2.11 #747 is merged - rebase and remove -Dpython.test.exclude='' from both profiles
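
To illustrate the "fail gracefully" item, here is a minimal Python sketch of a dependency check that returns a readable error instead of a stack trace. It is not the code from this PR: `missing_modules`, `run_sql`, and the error text are hypothetical names chosen for the example, and the module names `pandas` and `pandasql` are assumed to be the importable names of the two dependencies.

```python
import importlib.util

# Assumed importable module names for the two dependencies named in the TODO.
REQUIRED_MODULES = ("pandas", "pandasql")


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def run_sql(sql, runner, required=REQUIRED_MODULES):
    """Call runner(sql) only if every required module is installed.

    Otherwise return a readable error message (hypothetical wording).
    """
    missing = missing_modules(required)
    if missing:
        return "ERROR: cannot run SQL, missing Python package(s): " + ", ".join(missing)
    return runner(sql)


if __name__ == "__main__":
    # The lambda stands in for whatever actually executes the query.
    print(run_sql("SELECT 1", lambda query: "would run: " + query))
```

The check happens before the query runs, so a user without the packages sees which ones to install rather than an import error.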

What is the Jira issue?

ZEPPELIN-1115

How should this be tested?

mvn -Dpython.test.exclude='' test -pl python -am should pass. Alternatively, test manually:

Given a DataFrame, e.g.

%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")

query it with SQL like

%python.sql
SELECT * FROM rates LIMIT 10
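
Outside Zeppelin, the idea of querying a DataFrame by its variable name can be approximated as below. This is only a sketch under stated assumptions: it uses pandas with an in-memory SQLite database rather than the interpreter's own mechanism, `query_dataframe` is a hypothetical helper, and the inline CSV is made-up sample data standing in for bank.csv.

```python
import io
import sqlite3

import pandas as pd

# Made-up, semicolon-separated sample rows standing in for bank.csv.
SAMPLE_CSV = (
    "age;job;balance\n"
    "30;unemployed;1787\n"
    "33;services;4789\n"
    "35;management;1350\n"
)

rates = pd.read_csv(io.StringIO(SAMPLE_CSV), sep=";")


def query_dataframe(sql, frames):
    """Run sql against DataFrames registered by name in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    try:
        # Copy each DataFrame into a table named after its key.
        for name, frame in frames.items():
            frame.to_sql(name, conn, index=False)
        return pd.read_sql_query(sql, conn)
    finally:
        conn.close()


result = query_dataframe("SELECT * FROM rates LIMIT 2", {"rates": rates})
print(result)
```

The table name in the query matches the key under which the DataFrame is registered, which mirrors how `rates` is referenced in the %python.sql paragraph above.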

Screenshots (if appropriate)

Questions:

Do the license files need an update? No, no dependencies were included in the source or binary release
Are there breaking changes for older versions? No
Does this need documentation? Yes

felixcheung · 2016-07-12T23:23:40Z

docs/interpreter/python.md


 ## Pandas integration
-[Zeppelin Display System]({{BASE_PATH}}/displaysystem/basicdisplaysystem.html#table) provides simple API to visualize data in Pandas DataFrames, same as in Matplotlib.
+Apace Zeppelin [Table Display System]({{BASE_PATH}}/displaysystem/basicdisplaysystem.html#table) provides build-in data visualization capabilities. Python interpreter leverages it to visualize Pandas DataFrames though similar `z.show()` API, same as with [Matplotlib integration](#matplotlib-integration).


Thank you for proof-reading! Late-night commits are bad...

You mean built-in?
And how about adding this link http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html to Pandas DataFrames? It would be helpful to users, I think :)

(Great work indeed! 👍 )

AhyoungRyu · 2016-07-13T03:00:46Z

docs/interpreter/python.md

+
 ## Technical description

 For in-depth technical details on current implementation plese reffer [python/README.md](https://github.com/apache/zeppelin/blob/master/python/README.md).


There is a typo. plese reffer -> please refer to

bzz · 2016-07-14T00:54:05Z

Documentation review addressed in e432961

bzz · 2016-07-14T05:17:30Z

feedback on graceful failure addressed in a378226

khalidhuseynov · 2016-07-14T05:46:56Z

Thanks for the improvement, LGTM

bzz · 2016-07-14T06:08:28Z

Thank you guys for prompt reviews!

I have added one minor TODO item to clean up the test profiles on CI; will merge after #747

bzz · 2016-07-15T08:08:02Z

Done, merging after CI ♻️ if there is no further discussion

### What is this PR for?
Add new interpreter to Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependant tests, excluded from CI
  * PythonInterpreterWithPythonInstalledTest
  * PythonPandasSqlInterpreterTest
  * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after apache#747 is merged - rebase and remove `-Dpython.test.exclude=''` from both profiles

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass or manually run
- Given the DataFrame i.e
```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
```
- SQL query it like
```
%python.sql
SELECT * FROM rates LIMIT 10
```

### Screenshots (if appropriate)
![screen shot 2016-07-11 at 23 56 04](https://cloud.githubusercontent.com/assets/5582506/16735171/1ebb9354-47c3-11e6-9354-6364e9374a20.png)

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes apache#1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs ⚡
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames

bzz changed the title ~~ZEPPELIN-1115:~~ ZEPPELIN-1115: Python - new interpreter for SQL over DataFrame Jul 12, 2016

bzz changed the title ~~ZEPPELIN-1115: Python - new interpreter for SQL over DataFrame~~ ZEPPELIN-1115: Python - interpreter for SQL over DataFrame Jul 12, 2016

felixcheung reviewed Jul 12, 2016

bzz mentioned this pull request Jul 12, 2016

BigQuery Interpreter for Apazhe Zeppelin[ZEPPELIN-1153] #1170

Closed

bzz force-pushed the ZEPPELIN-1115/python/add-sql-for-dataframes branch from d20c678 to 886949b Compare July 13, 2016 00:42

AhyoungRyu reviewed Jul 13, 2016

bzz force-pushed the ZEPPELIN-1115/python/add-sql-for-dataframes branch from 11da87c to a378226 Compare July 14, 2016 06:05

bzz added 9 commits July 15, 2016 17:05

Add draft implementation of %python.sql for DataFrames

11ba490

Add %python.sql to interpreter menue

f6ca1eb

Complete implementation of the PythonPandasSqlInterpreter

76bbb44

Make test for PythonPandasSqlInterpreter usable

e931dc4

Add docs for %python.sql feature

72884c8

Update Python Matplotlib notebook example

5fe46fc

Remove third-party dependant test from CI

158ba6a

Fix typos in docs ⚡

aca2bdf

Fail SQL gracefully if no python dependencies installed

0f2f852

bzz force-pushed the ZEPPELIN-1115/python/add-sql-for-dataframes branch from a378226 to 0f2f852 Compare July 15, 2016 08:07

asfgit closed this in d8b54cf Jul 15, 2016

bzz deleted the ZEPPELIN-1115/python/add-sql-for-dataframes branch July 15, 2016 09:38

4 participants

