@zhengruifeng
Contributor

What changes were proposed in this pull request?

Standardize the JoinType string in the Spark Connect Python client.

Make it consistent with PySpark, which normalizes the string as follows:

def apply(typ: String): JoinType = typ.toLowerCase(Locale.ROOT).replace("_", "") match {
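
For readers on the Python side, the normalization in the line above (lower-case the string, then drop underscores before matching) can be sketched roughly as follows. This is only an illustration, not the code in this patch: `normalize_join_type`, `_ALIASES`, and the canonical labels are hypothetical names, and the alias groups are inferred from the supported join types listed in the error message.

```python
# Illustrative sketch only: `normalize_join_type`, `_ALIASES`, and the
# canonical labels are hypothetical, not the Spark Connect client's API.
_ALIASES = {
    "inner": "inner",
    "outer": "full_outer",
    "full": "full_outer",
    "fullouter": "full_outer",
    "leftouter": "left_outer",
    "left": "left_outer",
    "rightouter": "right_outer",
    "right": "right_outer",
    "leftsemi": "left_semi",
    "semi": "left_semi",
    "leftanti": "left_anti",
    "anti": "left_anti",
    "cross": "cross",
}


def normalize_join_type(how: str) -> str:
    """Lower-case the string and drop underscores before looking it up."""
    key = how.lower().replace("_", "")
    try:
        return _ALIASES[key]
    except KeyError:
        # Unknown strings are rejected rather than silently defaulted.
        raise ValueError(f"Unsupported join type: {how}") from None
```

With this approach, `normalize_join_type("left_outer")`, `normalize_join_type("LeftOuter")`, and `normalize_join_type("left")` all resolve to the same label, so the underscore and non-underscore spellings no longer need separate handling.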

Why are the changes needed?

>>> df = spark.range(1)
>>> df2 = spark.range(2)
>>> df.join(df2, how="left_outer")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xinrong.meng/spark/python/pyspark/sql/connect/dataframe.py", line 438, in join
    plan.Join(left=self._plan, right=other._plan, on=on, how=how),
  File "/Users/xinrong.meng/spark/python/pyspark/sql/connect/plan.py", line 730, in __init__
    raise NotImplementedError(
NotImplementedError:
Unsupported join type: left_outer. Supported join types include:
"inner", "outer", "full", "fullouter", "full_outer",
"leftouter", "left", "left_outer", "rightouter",
"right", "right_outer", "leftsemi", "left_semi",
"semi", "leftanti", "left_anti", "anti", "cross",

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Updated unit tests.

init
@zhengruifeng
Contributor Author

cc @xinrong-meng @HyukjinKwon

@HyukjinKwon
Member

Merged to master and branch-3.4.

HyukjinKwon pushed a commit that referenced this pull request Feb 8, 2023
… JoinType string

### What changes were proposed in this pull request?
 standardize the JoinType string

be consistent with PySpark https://github.com/apache/spark/blob/05c0fa573881b49d8ead9a5e16071190e5841e1b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala#L25

### Why are the changes needed?
```
>>> df = spark.range(1)
>>> df2 = spark.range(2)
>>> df.join(df2, how="left_outer")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xinrong.meng/spark/python/pyspark/sql/connect/dataframe.py", line 438, in join
    plan.Join(left=self._plan, right=other._plan, on=on, how=how),
  File "/Users/xinrong.meng/spark/python/pyspark/sql/connect/plan.py", line 730, in __init__
    raise NotImplementedError(
NotImplementedError:
Unsupported join type: left_outer. Supported join types include:
"inner", "outer", "full", "fullouter", "full_outer",
"leftouter", "left", "left_outer", "rightouter",
"right", "right_outer", "leftsemi", "left_semi",
"semi", "leftanti", "left_anti", "anti", "cross",
```

### Does this PR introduce _any_ user-facing change?
yes

### How was this patch tested?
updated UT

Closes #39938 from zhengruifeng/connect_join_types.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit f24ce65)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@zhengruifeng zhengruifeng deleted the connect_join_types branch February 8, 2023 08:30
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
… JoinType string
