-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-30211][INFRA] Use python3 in make-distribution.sh #26844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for preparing |
| # Delete the egg info file if it exists, this can cache older setup files. | ||
| rm -rf pyspark.egg-info || echo "No existing egg info file, skipping deletion" | ||
| python setup.py sdist | ||
| python3 setup.py sdist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that preview2 is the first PySpark distribution packaged by Python3?
|
cc @HyukjinKwon |
|
Hi, @wangyum . This may cause a corrupted Python package. Could you check your python3 on |
|
I didn't run the release script fully. So, please double-check your generated python package with this patch. At 2.2.3 release, I made this kind of mistake by myself. As you see in the following, it works, but looks ugly if there is no package description. So, I want that you can avoid this. If it works in your side, there is no problem. I'm just suspecting |
|
cc @gatorsmile |
|
Test build #115140 has finished for PR 26844 at commit
|
|
Retest this please |
|
@dongjoon-hyun It seems we also need to update our Docker image. May be it should be: diff --git a/dev/create-release/spark-rm/Dockerfile b/dev/create-release/spark-rm/Dockerfile
index cc7da152c7b..eabf3a3b6c8 100644
--- a/dev/create-release/spark-rm/Dockerfile
+++ b/dev/create-release/spark-rm/Dockerfile
@@ -62,14 +62,14 @@ RUN apt-get clean && apt-get update && $APT_INSTALL gnupg ca-certificates && \
curl -sL https://deb.nodesource.com/setup_11.x | bash && \
$APT_INSTALL nodejs && \
# Install needed python packages. Use pip for installing packages (for consistency).
- $APT_INSTALL libpython2.7-dev libpython3-dev python-pip python3-pip && \
- pip install $BASE_PIP_PKGS && \
- pip install $PIP_PKGS && \
+ $APT_INSTALL libpython3-dev python3-pip && \
+ pip3 install $BASE_PIP_PKGS && \
+ pip3 install $PIP_PKGS && \
cd && \
virtualenv -p python3 /opt/p35 && \
. /opt/p35/bin/activate && \
- pip install $BASE_PIP_PKGS && \
- pip install $PIP_PKGS && \
+ pip3 install $BASE_PIP_PKGS && \
+ pip3 install $PIP_PKGS && \
# Install R packages and dependencies used when building.
# R depends on pandoc*, libssl (which are installed above).
$APT_INSTALL r-base r-base-dev && \
or diff --git a/dev/create-release/spark-rm/Dockerfile b/dev/create-release/spark-rm/Dockerfile
index cc7da152c7b..a5af73cc25c 100644
--- a/dev/create-release/spark-rm/Dockerfile
+++ b/dev/create-release/spark-rm/Dockerfile
@@ -61,8 +61,16 @@ RUN apt-get clean && apt-get update && $APT_INSTALL gnupg ca-certificates && \
pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev && \
curl -sL https://deb.nodesource.com/setup_11.x | bash && \
$APT_INSTALL nodejs && \
+ # Change default python to python3.6
+ update-alternatives --install /usr/bin/python python /usr/bin/python2.7 1 && \
+ update-alternatives --install /usr/bin/python python /usr/bin/python3.6 2 && \
+ update-alternatives --set python /usr/bin/python3.6 && \
+ # Change default pip to pip3
+ update-alternatives --install /usr/bin/pip pip /usr/bin/pip 1 && \
+ update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 2 && \
+ update-alternatives --set pip /usr/bin/pip3 && \
# Install needed python packages. Use pip for installing packages (for consistency).
- $APT_INSTALL libpython2.7-dev libpython3-dev python-pip python3-pip && \
+ $APT_INSTALL libpython3-dev python3-pip && \
pip install $BASE_PIP_PKGS && \
pip install $PIP_PKGS && \
cd && \I will verify it later. |
|
Thank you, @wangyum . +1 for that! |
|
Do you want to do that separately? Then, I'll merge this first. |
|
BTW, I have a question. This is used for last |
|
cc @jiangxb1987 since he was the release manager of |
Yes. I'd like to do that separately. |
| ./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes | ||
|
|
||
| This will build Spark distribution along with Python pip and R packages. For more information on usage, run `./dev/make-distribution.sh --help` | ||
| This will build Spark distribution along with Python pip and R packages. (Note that build with Python pip package requires Python 3.6). For more information on usage, run `./dev/make-distribution.sh --help` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ur, shall we revert this line since there is Python 3.7 and 3.8?
Oops, Never mind.
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Merged to master.
I updated the PR description because this doesn't fix Docker issue.
Instead, this will help dev/make-distribution first.
|
It's merged now. Please proceed to the docker issue and ping me if you need my help. |
|
Test build #115145 has finished for PR 26844 at commit
|
|
FYI, |
|
Test build #115154 has finished for PR 26844 at commit
|
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM too
| ./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes | ||
|
|
||
| This will build Spark distribution along with Python pip and R packages. For more information on usage, run `./dev/make-distribution.sh --help` | ||
| This will build Spark distribution along with Python pip and R packages. (Note that build with Python pip package requires Python 3.6). For more information on usage, run `./dev/make-distribution.sh --help` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we can just say "Note that build with Python pip package requires Python 3." without parentheses.
What changes were proposed in this pull request?
This PR switches python to python3 in
make-distribution.sh.Why are the changes needed?
SPARK-29672 changed this
Does this PR introduce any user-facing change?
No
How was this patch tested?
N/A