[SPARK-41454][PYTHON] Support Python 3.11
### What changes were proposed in this pull request?

This PR aims to support Python 3.11.

### Why are the changes needed?

Python 3.11 is the newest major release of the Python programming language. It contains many new features and optimizations, and Python 3.11.1 is the latest version.

- 2022-12-03 https://www.python.org/downloads/release/python-3111/

Spark is also affected by one API removal (deprecated in 3.9 and removed in 3.11). Since this is handled conditionally, there is no regression on older Python versions.

- https://bugs.python.org/issue40465

### Does this PR introduce _any_ user-facing change?

No. Previously, Python 3.11 was not supported.

### How was this patch tested?

Manually ran the following. Note that this was tested without optional dependencies.

```
$ python/run-tests.py --python-executables python3.11
Will test against the following Python executables: ['python3.11']
Will test the following Python modules: ['pyspark-connect', 'pyspark-core', 'pyspark-ml', 'pyspark-mllib', 'pyspark-pandas', 'pyspark-pandas-slow', 'pyspark-resource', 'pyspark-sql', 'pyspark-streaming']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.1
Starting test(python3.11): pyspark.ml.tests.test_evaluation (temp output: /Users/dongjoon/APACHE/spark-merge/python/target/ff09022a-f3d3-413b-b15d-261c40d5b048/python3.11__pyspark.ml.tests.test_evaluation__wh9c4y5l.log)
...
Finished test(python3.11): pyspark.sql.streaming.readwriter (88s)
Tests passed in 1138 seconds
...
Skipped tests in pyspark.tests.test_worker with python3.11:
    test_memory_limit (pyspark.tests.test_worker.WorkerMemoryTest.test_memory_limit) ... skipped "Memory limit feature in Python worker is dependent on Python's 'resource' module on Linux; however, not found or not on Linux."
```

Closes #38987 from dongjoon-hyun/SPARK-41454.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
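For readers unfamiliar with the pattern, here is a minimal sketch of what "handled conditionally" could look like. It assumes the removed API is the optional `random` argument of `random.shuffle`, which the linked issue appears to concern. The helper name `shuffle_compat` is hypothetical and is not taken from the Spark code base; the actual change in this commit may be structured differently.

```python
import random
import sys


def shuffle_compat(items, rand_fn=None):
    """Shuffle `items` in place and return it (hypothetical helper).

    Assumption: the optional second argument of random.shuffle was
    deprecated in Python 3.9 and removed in Python 3.11, so it is only
    passed on interpreters older than 3.11.
    """
    if rand_fn is not None and sys.version_info < (3, 11):
        # Older interpreters: keep the previous behavior unchanged.
        random.shuffle(items, rand_fn)
    else:
        # Python 3.11+: the custom random function can no longer be passed.
        random.shuffle(items)
    return items
```

Branching on `sys.version_info` keeps the old call path intact on earlier Python versions, which is consistent with the "no regression" claim above.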
b5a9e1f
Is the date of the new PySpark release known? Handling Python 3.11 is a thing! I am looking forward to the release 👏
@Michal-Kolomanski
Spark 3.4 release window
https://spark.apache.org/versioning-policy.html
Yes, I made this PR as a part of Apache Spark 3.4 preparation, @Michal-Kolomanski and @bjornjorgensen.
Thank you @bjornjorgensen and @dongjoon-hyun for the fast response.