-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-24709][SQL][2.4] map basestring to str for python 3 #22858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
BTW the |
|
Test build #98126 has finished for PR 22858 at commit
|
|
Wenchen, this is because if sys.version >= '3':
basestring = strIs missing. Python 3 does not have |
python/pyspark/sql/functions.py
Outdated
| [Row(json=u'struct<a:bigint>')] | ||
| """ | ||
| if isinstance(json, basestring): | ||
| if isinstance(json, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is we will not support unicode in Python 2 ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we apply it to 2.4? I'm not aware of the background, why we did not put
if sys.version >= '3':
basestring = str
in 2.4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea we should. They are put only when it's needed because there are so many cases like that (for instance, imap in Python 2 and map in Python 3)
Looks that's added in another PR in master branch only.
felixcheung
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed with @HyukjinKwon - this instead #22858 (comment)
|
@HyukjinKwon thanks for the information! Shall we replace |
|
Yup, I think strictly we should change. Looks there are two occurrences at Another problem at PySpark is, inconsistent type comparison like Another problem is, some types like >>> isinstance(True, int)
TrueI was nervous about the cases above and didn't fix those changes so far. |
|
Test build #98151 has finished for PR 22858 at commit
|
|
Merged to branch-2.4. |
## What changes were proposed in this pull request? after backport #22775 to 2.4, the 2.4 sbt Jenkins QA job is broken, see https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7/147/console This PR adds `if sys.version >= '3': basestring = str` which onlly exists in master. ## How was this patch tested? existing test Closes #22858 from cloud-fan/python. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: hyukjinkwon <gurwls223@apache.org>
|
@cloud-fan, thanks for doing this backport! |
|
Oops, mind fixing PR title too? |
|
title updated |
## What changes were proposed in this pull request? after backport apache#22775 to 2.4, the 2.4 sbt Jenkins QA job is broken, see https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7/147/console This PR adds `if sys.version >= '3': basestring = str` which onlly exists in master. ## How was this patch tested? existing test Closes apache#22858 from cloud-fan/python. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: hyukjinkwon <gurwls223@apache.org>
What changes were proposed in this pull request?
after backport #22775 to 2.4, the 2.4 sbt Jenkins QA job is broken, see https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7/147/console
This PR adds
if sys.version >= '3': basestring = strwhich onlly exists in master.How was this patch tested?
existing test