[SPARK-41454][PYTHON] Support Python 3.11 #38987

Closed

@dongjoon-hyun dongjoon-hyun commented Dec 8, 2022

What changes were proposed in this pull request?

This PR aims to support Python 3.11.

Why are the changes needed?

Python 3.11 is the newest major release of the Python programming language and contains many new features and optimizations. Python 3.11.1 is the latest version.

Also, Spark is affected by one API removal (deprecated in 3.9 and removed in 3.11). Since this is handled conditionally, there is no regression on older Python versions.

Does this PR introduce any user-facing change?

No. Previously, Python 3.11 was not supported.

How was this patch tested?

Ran the following manually. Note that this was tested without optional dependencies.

$ python/run-tests.py --python-executables python3.11
Will test against the following Python executables: ['python3.11']
Will test the following Python modules: ['pyspark-connect', 'pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-pandas', 'pyspark-pandas-slow', 'pyspark-resource', 'pyspark-sql', 'pyspark-streaming']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.1
Starting test(python3.11): pyspark.ml.tests.test_evaluation (temp output: /Users/dongjoon/APACHE/spark-merge/python/target/ff09022a-f3d3-413b-b15d-261c40d5b048/python3.11__pyspark.ml.tests.test_evaluation__wh9c4y5l.log)
...
Finished test(python3.11): pyspark.sql.streaming.readwriter (88s)
Tests passed in 1138 seconds

...
Skipped tests in pyspark.tests.test_worker with python3.11:
    test_memory_limit (pyspark.tests.test_worker.WorkerMemoryTest.test_memory_limit) ... skipped "Memory limit feature in Python worker is dependent on Python's 'resource' module on Linux; however, not found or not on Linux."

if sys.version_info < (3, 11):
    # different order in different processes and instances
    rnd = random.Random(os.getpid() + id(dirs))
    random.shuffle(dirs, rnd.random)

This was deprecated in 3.9 and removed in 3.11.
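
For context, the removed API is the optional second `random` argument of `random.shuffle`. Below is a minimal sketch of how such a call can be guarded by version. The helper name `shuffle_dirs` and its `seed` parameter are hypothetical and only for illustration, and the `else` branch is an assumed alternative, not necessarily what this patch does on Python 3.11.

```python
import os
import random
import sys


def shuffle_dirs(dirs, seed=None):
    """Shuffle dirs in place and return it (hypothetical helper, not Spark's API)."""
    if seed is None:
        # different order in different processes and instances
        seed = os.getpid() + id(dirs)
    rnd = random.Random(seed)
    if sys.version_info < (3, 11):
        # The two-argument form of random.shuffle only exists before Python 3.11.
        random.shuffle(dirs, rnd.random)
    else:
        # Assumed alternative: Random.shuffle draws from the seeded instance itself.
        rnd.shuffle(dirs)
    return dirs
```

Since `Random.shuffle` is available on every supported Python version, calling `rnd.shuffle(dirs)` unconditionally would also work. The guard above just keeps the old code path untouched on older interpreters.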

@dongjoon-hyun

Thank you, @HyukjinKwon

@dongjoon-hyun

All tests passed. Merged to master for Apache Spark 3.4.0.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-41454 branch December 9, 2022 00:33
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?

This PR aims to support Python 3.11.

### Why are the changes needed?

Python 3.11 is the newest major release of the Python programming language and contains many new features and optimizations. Python 3.11.1 is the latest version.

- 2022-12-03 https://www.python.org/downloads/release/python-3111/

Also, Spark is affected by one API removal (deprecated in 3.9 and removed in 3.11). Since this is handled conditionally, there is no regression on older Python versions.
- https://bugs.python.org/issue40465

### Does this PR introduce _any_ user-facing change?

No. Previously, Python 3.11 was not supported.

### How was this patch tested?

Ran the following manually. Note that this was tested without optional dependencies.
```
$ python/run-tests.py --python-executables python3.11
Will test against the following Python executables: ['python3.11']
Will test the following Python modules: ['pyspark-connect', 'pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-pandas', 'pyspark-pandas-slow', 'pyspark-resource', 'pyspark-sql', 'pyspark-streaming']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.1
Starting test(python3.11): pyspark.ml.tests.test_evaluation (temp output: /Users/dongjoon/APACHE/spark-merge/python/target/ff09022a-f3d3-413b-b15d-261c40d5b048/python3.11__pyspark.ml.tests.test_evaluation__wh9c4y5l.log)
...
Finished test(python3.11): pyspark.sql.streaming.readwriter (88s)
Tests passed in 1138 seconds

...
Skipped tests in pyspark.tests.test_worker with python3.11:
    test_memory_limit (pyspark.tests.test_worker.WorkerMemoryTest.test_memory_limit) ... skipped "Memory limit feature in Python worker is dependent on Python's 'resource' module on Linux; however, not found or not on Linux."
```

Closes apache#38987 from dongjoon-hyun/SPARK-41454.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Axel-Naumann added a commit to Axel-Naumann/root that referenced this pull request Mar 24, 2023
Its Python 3.11 needs the upcoming Spark release, see apache/spark#38987
Axel-Naumann added a commit to root-project/root that referenced this pull request Mar 27, 2023
Its Python 3.11 needs the upcoming Spark release, see apache/spark#38987
omazapa pushed a commit to omazapa/root that referenced this pull request Apr 13, 2023
Its Python 3.11 needs the upcoming Spark release, see apache/spark#38987
@mdhont

mdhont commented Nov 2, 2023

I've created a task for reverting that change for versions below 3.4. I will notify you in this thread with further information.

@dongjoon-hyun

> I've created a task for reverting that change for versions below 3.4. I will notify you in this thread with further information.

Thank you for letting us know, @mdhont. That's great news and will be helpful.

BTW, just FYI, Apache Spark 3.3 will reach End-Of-Support next month (2023-12-15). The Apache Spark community is currently focusing on Apache Spark 3.4.2, 3.5.1, and 4.0.0 (next year).

darabos added a commit to lynxkite/lynxkite that referenced this pull request May 3, 2024
3 participants