Skip to content

Conversation

@shaneknapp
Copy link
Contributor

@shaneknapp shaneknapp commented Oct 30, 2019

What changes were proposed in this pull request?

remove python2.7 tests and test infra for 3.0+

Why are the changes needed?

because python2.7 is finally going the way of the dodo.

Does this PR introduce any user-facing change?

newp.

How was this patch tested?

the build system will test this

@shaneknapp shaneknapp changed the title [WIP][SPARK-29672][PYSPARK] remove py27 support for jenkins tests [WIP][SPARK-29672][PYSPARK] DO NOT MERGE -- remove py27 support for jenkins tests Oct 30, 2019
@SparkQA
Copy link

SparkQA commented Oct 30, 2019

Test build #112965 has finished for PR 26330 at commit 92f7268.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 30, 2019

Test build #112966 has finished for PR 26330 at commit 7c0fe46.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Thank you for making this PR, @shaneknapp .
From the Jenkins log, PIP installation test seems to use Python 3.5 still. Could you take a look at that?

Testing pip installation with python 3.5
Using /tmp/tmp.MBOPq46hzq for virtualenv

@shaneknapp
Copy link
Contributor Author

Thank you for making this PR, @shaneknapp .
From the Jenkins log, PIP installation test seems to use Python 3.5 still. Could you take a look at that?

Testing pip installation with python 3.5
Using /tmp/tmp.MBOPq46hzq for virtualenv

that's actually in dev/run-pip-tests... i guess i can hack on that as it's kinda in the scope of this PR.

@shaneknapp
Copy link
Contributor Author

that's actually in dev/run-pip-tests... i guess i can hack on that as it's kinda in the scope of this PR.

btw i've never really looked too closely at run-pip-tests, so i may have a few extra commits as i mess around w/it trying to get it to use 3.6, and strip out the python2 stuff.

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #112972 has finished for PR 26330 at commit f25c4b3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #112968 has finished for PR 26330 at commit 226c1fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp
Copy link
Contributor Author

test this please

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #112977 has finished for PR 26330 at commit f25c4b3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp
Copy link
Contributor Author

okie dokie, i'm confident that this change won't break anything, but i'll wait on merging until i get the word.

@HyukjinKwon @mengxr @srowen

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK to me, though I don't know this script well.

@shaneknapp
Copy link
Contributor Author

shaneknapp commented Oct 31, 2019

Seems OK to me, though I don't know this script well.

this is @holdenk 's contribution... should have added her to the review earlier.

also, if you look at the console output (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112977/console), the following block of code in dev/run-pip-tests causes some really ugly log4j errors that i would LOVE to clean up (but i'm not entirely sure how easy that might be):

    echo "Run basic sanity check on pip installed version with spark-submit"
    spark-submit "$FWDIR"/dev/pip-sanity-check.py
    echo "Run basic sanity check with import based"
    python "$FWDIR"/dev/pip-sanity-check.py
    echo "Run the tests for context.py"
    python "$FWDIR"/python/pyspark/context.py
Run basic sanity check on pip installed version with spark-submit
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: target/unit-tests.log (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
<SNIP>
Successfully ran pip sanity check

@holdenk
Copy link
Contributor

holdenk commented Oct 31, 2019

The code change LGTM provided it runs in Jenkins
As for the log4j errors clean up I think we can address that in a follow up. My guess is we need to set an env var and create a file but I think it's ok to move forward with this first.

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #113043 has finished for PR 26330 at commit b57acd9.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp
Copy link
Contributor Author

test this please

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #113050 has finished for PR 26330 at commit b57acd9.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp
Copy link
Contributor Author

Test build #113050 has finished for PR 26330 at commit b57acd9.

  • This patch fails PySpark unit tests.

i'll revert my pypy3 change and let the test run again... this is officially out of scope of this pr. :)

This reverts commit b57acd9.
@HyukjinKwon
Copy link
Member

Actually, @shaneknapp, I think we should remove pypy 2.5 too, which is compatible with Python 2 ... does it make things complicated?

@HyukjinKwon
Copy link
Member

HyukjinKwon commented Nov 1, 2019

Ah, tests fails with:

py4j.protocol.Py4JError: org.apache.spark.SparkConf.__name__ does not exist in the JVM

Okay, this one I think I can try to investigate it in a separate PR. I will make a followup after this PR.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general. There is one side issue #26330 (comment)

@shaneknapp
Copy link
Contributor Author

Actually, @shaneknapp, I think we should remove pypy 2.5 too, which is compatible with Python 2 ... does it make things complicated?

nope, it makes things easier. :)

@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #113059 has finished for PR 26330 at commit 0b37b71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Actually, my initial thought is that we drop everything right after Spark 3.0 branch is cut out (including upgrading R version to 3.4 as well to address intermittent CRAN check failure) which is strictly correct.

However, it's not always able to match the exact timing in practice and those infer/build stuff need all your efforts and time. So TBH I just don't want to say let's hold off at this moment until 3.0.
I will leave it to you and other committers here.

@shaneknapp
Copy link
Contributor Author

Actually, my initial thought is that we drop everything right after Spark 3.0 branch is cut out (including upgrading R version to 3.4 as well to address intermittent CRAN check failure) which is strictly correct.

However, it's not always able to match the exact timing in practice and those infer/build stuff need all your efforts and time. So TBH I just don't want to say let's hold off at this moment until 3.0.
I will leave it to you and other committers here.

imho it's better to get the test/build infra upgraded before new versions are cut... if we still want to 'unofficially' support python2.7/pypy2.5.1 right up until we release 3.0, i can add those back to python/run-tests.py.

in fact, i will do that now "just to see what happens". :)

@BryanCutler
Copy link
Member

My thought is that it's not that practical to test every PR against every Python version we support. We should focus on maybe 2 of the most widely used versions. Since we still need to support these versions, we can setup pre-release tasks to run the full tests and correct any issues that have come up. I think any issues will mostly be minor syntax problems and should be easily corrected.

@shaneknapp
Copy link
Contributor Author

My thought is that it's not that practical to test every PR against every Python version we support. We should focus on maybe 2 of the most widely used versions. Since we still need to support these versions, we can setup pre-release tasks to run the full tests and correct any issues that have come up. I think any issues will mostly be minor syntax problems and should be easily corrected.

so, which two versions would you suggest? 3.6? 2.7? some other minor version of 3?

@SparkQA
Copy link

SparkQA commented Nov 13, 2019

Test build #113722 has finished for PR 26330 at commit b5bada3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shaneknapp shaneknapp self-assigned this Nov 14, 2019
@shaneknapp
Copy link
Contributor Author

shaneknapp commented Nov 14, 2019

ok, i have an idea:

how about i change the scope of this PR (and associated JIRA) to solely update the test/build infrastructure to python3. i will leave the python2.7/pypy2.5.1 test calls in place (see b5bada3) for now.

once we cut the first 3.0 release, we can decide exactly which major/minor versions of python to test against on master and update python/run-tests.py accordingly.

sound reasonable? if so, then this PR is ready to merge as-is.

@BryanCutler
Copy link
Member

so, which two versions would you suggest? 3.6? 2.7? some other minor version of 3?

If we are choosing 2 versions, I think 3.6 and maybe 3.8.

once we cut the first 3.0 release, we can decide exactly which major/minor versions of python to test against on master and update python/run-tests.py accordingly.

That sounds like a good plan if it's not too much hassle for you. I would get a little nervous about some Python 2 issue holding up the release :)

@srowen
Copy link
Member

srowen commented Nov 14, 2019

Seems OK to me if it's forward progress. But we drop it entirely after branch 3.0 is cut?

@shaneknapp
Copy link
Contributor Author

If we are choosing 2 versions, I think 3.6 and maybe 3.8.

that works... it will take a little bit of work to get a 3.8 conda env set up, but i should be able to get that sorted by the end of january 2020.

once we cut the first 3.0 release, we can decide exactly which major/minor versions of python to test against on master and update python/run-tests.py accordingly.

That sounds like a good plan if it's not too much hassle for you. I would get a little nervous about some Python 2 issue holding up the release :)

well, tests are currently passing w/python2... which is great. if we see that python2 tests are going to hold up the 3.0 release, i can just yank the python2.7 (and pypy) executables from python/run-tests.py!

Seems OK to me if it's forward progress. But we drop it entirely after branch 3.0 is cut?

yeah, it's forward progress and i will be overjoyed to drop python2.7/pypy2.5.1 after 3.0.

i'll update the PR title/desc, as well as the JIRA and then merge later today.

@shaneknapp shaneknapp changed the title [SPARK-29672][PYSPARK] remove py27 support for jenkins tests and testing framework [SPARK-29672][PYSPARK] update spark testing framework to use python3 Nov 14, 2019
@shaneknapp shaneknapp deleted the remove-py27-tests branch November 14, 2019 18:21
@shaneknapp
Copy link
Contributor Author

thanks for the reviews everyone! much appreciated... :)

@HyukjinKwon
Copy link
Member

BTW, @shaneknapp quick question. did we change default python executable from Python 2 to Python 3? I was investigating some flaky tests (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113824/testReport/org.apache.spark.sql.streaming.continuous/ContinuousSuite/continuous_mode_with_various_UDFs___Scalar_Pandas_UDF/history/) which uses python executable and wondering if there's any delta recently.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Nov 15, 2019

Hi, @shaneknapp .
Could you update the Jenkins job scripts? The following Jenkins job seems to be broken.

- ZINC_PORT=$(python -S -c "import random; print random.randrange(3030,4030)")
+ ZINC_PORT=$(python -S -c "import random; print(random.randrange(3030,4030))")

@HyukjinKwon
Copy link
Member

Oh, NVM about my comment. It was because #26133

@shaneknapp
Copy link
Contributor Author

Hi, @shaneknapp .
Could you update the Jenkins job scripts? The following Jenkins job seems to be broken.

- ZINC_PORT=$(python -S -c "import random; print random.randrange(3030,4030)")
+ ZINC_PORT=$(python -S -c "import random; print(random.randrange(3030,4030))")

yeah, i missed a couple of those. they're all fixed now.

@dongjoon-hyun
Copy link
Member

Thank you, @shaneknapp !

@AndrewLvov
Copy link

AndrewLvov commented Dec 23, 2019

Hey @HyukjinKwon , did you resolve the error
py4j.protocol.Py4JError: org.apache.spark.SparkConf.__name__ does not exist in the JVM

?
Could you please point me to changes if you did resolve it ?
I"m having the same error when trying to run PySpark using Python 3.

@HyukjinKwon
Copy link
Member

Can you file a JIRA with the reproducible steps?

dongjoon-hyun pushed a commit that referenced this pull request Jul 15, 2021
…encies.sh

### What changes were proposed in this pull request?

This PR is a followup of #26330. There is the last place to fix in `dev/test-dependencies.sh`

### Why are the changes needed?

To stick to Python 3 instead of using Python 2 mistakenly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.

Closes #33368 from HyukjinKwon/change-python-3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Jul 15, 2021
…encies.sh

### What changes were proposed in this pull request?

This PR is a followup of #26330. There is the last place to fix in `dev/test-dependencies.sh`

### Why are the changes needed?

To stick to Python 3 instead of using Python 2 mistakenly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.

Closes #33368 from HyukjinKwon/change-python-3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6bd385f)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Jul 15, 2021
…encies.sh

### What changes were proposed in this pull request?

This PR is a followup of #26330. There is the last place to fix in `dev/test-dependencies.sh`

### Why are the changes needed?

To stick to Python 3 instead of using Python 2 mistakenly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.

Closes #33368 from HyukjinKwon/change-python-3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6bd385f)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Jul 15, 2021
…encies.sh

### What changes were proposed in this pull request?

This PR is a followup of #26330. There is the last place to fix in `dev/test-dependencies.sh`

### Why are the changes needed?

To stick to Python 3 instead of using Python 2 mistakenly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.

Closes #33368 from HyukjinKwon/change-python-3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6bd385f)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
…encies.sh

### What changes were proposed in this pull request?

This PR is a followup of apache#26330. There is the last place to fix in `dev/test-dependencies.sh`

### Why are the changes needed?

To stick to Python 3 instead of using Python 2 mistakenly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually tested.

Closes apache#33368 from HyukjinKwon/change-python-3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6bd385f)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants