[SPARK-42570][CONNECT][PYTHON] Fix DataFrameReader to use the default source #40166

ueshin · 2023-02-25T00:14:21Z

What changes were proposed in this pull request?

Fixes DataFrameReader to use the default source.

Why are the changes needed?

spark.read.load(path)

should work and use the default source without specifying the format.

Does this PR introduce any user-facing change?

The format doesn't need to be specified.

How was this patch tested?

Enabled related tests.

ueshin · 2023-02-25T00:15:42Z

python/pyspark/sql/tests/test_readwriter.py

-        actual = self.spark.read.load(path=tmpPath)
-        self.assertEqual(sorted(df.collect()), sorted(actual.collect()))
-        self.spark.sql("SET spark.sql.sources.default=" + defaultDataSourceName)
+        try:


The changes in this file make sure the cleanup is done properly.
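The `try:` added in the diff points at a try/finally around the test body, so the `spark.sql.sources.default` setting is restored even when an assertion fails. A minimal sketch of that pattern follows. It uses a plain dict as a stand-in for the session configuration; the dict, the helper `default_source`, and the `"json"` override are hypothetical and not taken from the patch.

```python
from contextlib import contextmanager

KEY = "spark.sql.sources.default"

# Hypothetical stand-in for the session conf. The real test restores the value
# with self.spark.sql("SET spark.sql.sources.default=" + defaultDataSourceName).
conf = {KEY: "parquet"}


@contextmanager
def default_source(conf, name):
    """Temporarily override the default source and restore it on exit."""
    previous = conf[KEY]
    conf[KEY] = name
    try:
        yield conf
    finally:
        # Runs even if the body raises, so later tests see the old value.
        conf[KEY] = previous


def run_failing_body():
    """Simulate a test whose assertion fails while the override is active."""
    try:
        with default_source(conf, "json"):
            raise AssertionError("simulated test failure")
    except AssertionError:
        pass
    return conf[KEY]
```

Without the `finally`, a failing assertion would skip the restoring `SET` and leak the overridden default source into later tests.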

HyukjinKwon

Nice, thanks Takuya

amaliujia · 2023-02-25T00:36:30Z

LGTM

What is the default source BTW?

connector/connect/common/src/main/protobuf/spark/connect/relations.proto

ueshin · 2023-02-25T00:49:18Z

> What is the default source BTW?

If format is not set, the value from SQL conf 'spark.sql.sources.default' will be used.
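The fallback described above can be sketched as a small lookup. This is a simplified model, not Spark's actual code: `resolve_format` and the dict-based `sql_conf` are hypothetical names, and `"parquet"` is the value Spark ships for `spark.sql.sources.default` unless it is overridden.

```python
DEFAULT_SOURCE_KEY = "spark.sql.sources.default"


def resolve_format(fmt, sql_conf):
    """Return the explicit format if one was set, else the configured default."""
    if fmt is not None:
        return fmt
    # No format given: fall back to the SQL conf value.
    return sql_conf[DEFAULT_SOURCE_KEY]


# Hypothetical stand-in for the session's SQL conf.
sql_conf = {DEFAULT_SOURCE_KEY: "parquet"}

explicit = resolve_format("json", sql_conf)  # format was set by the caller
fallback = resolve_format(None, sql_conf)    # format omitted, as in spark.read.load(path)
```

Under this model, `spark.read.load(path)` corresponds to the `fallback` case, while `spark.read.format("json").load(path)` corresponds to `explicit`.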

hvanhovell · 2023-02-25T18:13:43Z

Merging.

… source

### What changes were proposed in this pull request?

Fixes `DataFrameReader` to use the default source.

### Why are the changes needed?

```py
spark.read.load(path)
```

should work and use the default source without specifying the format.

### Does this PR introduce _any_ user-facing change?

The `format` doesn't need to be specified.

### How was this patch tested?

Enabled related tests.

Closes #40166 from ueshin/issues/SPARK-42570/reader.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit ad35f35)
Signed-off-by: Herman van Hovell <herman@databricks.com>

Fix DataFrameReader to use the default source.

e077354

ueshin requested review from HyukjinKwon and zhengruifeng February 25, 2023 00:14

github-actions bot added CONNECT CORE PYTHON SQL labels Feb 25, 2023

ueshin commented Feb 25, 2023

HyukjinKwon approved these changes Feb 25, 2023

amaliujia reviewed Feb 25, 2023

Fix.

58534c3

Fix.

5497332

hvanhovell closed this in ad35f35 Feb 25, 2023
