[SPARK-16380][EXAMPLES] Update SQL examples and programming guide for Python language binding #14317
Conversation
Test build #62724 has finished for PR 14317 at commit.

Test build #62725 has finished for PR 14317 at commit.
@JoshRosen Would you mind having a look at this? Thanks!
Merging in master/2.0.
[SPARK-16380][EXAMPLES] Update SQL examples and programming guide for Python language binding

This PR is based on PR #14098 authored by wangmiao1981.

## What changes were proposed in this pull request?

This PR replaces the original Python Spark SQL example file with the following three files:

- `sql/basic.py`: demonstrates basic Spark SQL features.
- `sql/datasource.py`: demonstrates various Spark SQL data sources.
- `sql/hive.py`: demonstrates Spark SQL Hive interaction.

This PR also removes hard-coded Python example snippets in the SQL programming guide by extracting snippets from the above files using the `include_example` Liquid template tag.

## How was this patch tested?

Manually tested.

Author: wm624@hotmail.com <wm624@hotmail.com>
Author: Cheng Lian <lian@databricks.com>

Closes #14317 from liancheng/py-examples-update.

(cherry picked from commit 53b2456)
Signed-off-by: Reynold Xin <rxin@databricks.com>
The entry point into all functionality in Spark is the [`SparkSession`](api/python/pyspark.sql.html#pyspark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder`:

```diff
-{% include_example init_session python/sql.py %}
+{% include_example init_session python/sql/basic.py %}
```
The file name is not consistent with the Scala and Java versions, which are SparkSQLExample.scala and SparkSQLExample.java. The Hive and data source example file names are not consistent either.
For Scala and Java, the convention is that the file name matches the (major) class defined in the file, while a camel-case file name doesn't conform to Python code conventions. You may check other PySpark file names in the repo as a reference.
```python
# +-------+

# Select everybody, but increment the age by 1
df.select(df['name'], df['age'] + 1).show()
```
Do you want to use `col('...')`? I have tested it and it works.
Yeah, I know I brought up this issue, but it is still in question... Although `df['...']` has a potential issue with self-joins, it is the way pandas DataFrames work. Considering we've tried to work around various self-join corner cases within Catalyst, I now tend to preserve it as is. Maybe we'll deprecate this syntax later.
This PR is based on PR #14098 authored by @wangmiao1981.

## What changes were proposed in this pull request?

This PR replaces the original Python Spark SQL example file with the following three files:

- `sql/basic.py`: demonstrates basic Spark SQL features.
- `sql/datasource.py`: demonstrates various Spark SQL data sources.
- `sql/hive.py`: demonstrates Spark SQL Hive interaction.

This PR also removes hard-coded Python example snippets in the SQL programming guide by extracting snippets from the above files using the `include_example` Liquid template tag.

## How was this patch tested?

Manually tested.
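The `include_example` Liquid tag mentioned above extracts labeled snippets delimited by marker comments in the example files (Spark's convention is `$example on:<label>$` / `$example off:<label>$`). A rough, illustrative pure-Python sketch of that extraction step — the helper below is not Spark's actual implementation:

```python
import re

def extract_example(source, label):
    """Return the text between the on/off markers for `label` (illustrative only)."""
    pattern = re.compile(
        r"#\s*\$example on:{0}\$\n(.*?)#\s*\$example off:{0}\$".format(re.escape(label)),
        re.S,
    )
    match = pattern.search(source)
    return match.group(1) if match else ""

sample = """\
# $example on:init_session$
spark = SparkSession.builder.getOrCreate()
# $example off:init_session$
"""

print(extract_example(sample, "init_session"))
# → spark = SparkSession.builder.getOrCreate()
```

Labeling snippets this way keeps the programming guide and the runnable example files from drifting apart, which is the main motivation for this PR.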