GH-33697: [CI][Python] Nightly test for PySpark 3.2.0 fail with AttributeError on numpy.bool #33714
Conversation
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
Revision: 8e3ab26 Submitted crossbow builds: ursacomputing/crossbow @ actions-94e388e574
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
Revision: 70430e9 Submitted crossbow builds: ursacomputing/crossbow @ actions-7fd30999ea
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
Revision: 427191d Submitted crossbow builds: ursacomputing/crossbow @ actions-431f284696
# https://github.com/apache/arrow/issues/33697
# numpy version pin should be removed with new apache spark release
# that includes https://github.com/apache/spark/pull/37817
ARG numpy=1.23
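For illustration, here is a minimal sketch of how such a build-arg value typically drives a pinned install step. The variable names and the `latest` sentinel are assumptions for this sketch; the actual wiring in arrow's CI scripts may differ.

```shell
# Sketch: turning an ARG-style value into a pip requirement spec.
# "latest" means no pin; anything else pins the major.minor series.
numpy=1.23   # assumed value of the build arg
if [ "$numpy" = "latest" ]; then
  spec="numpy"
else
  spec="numpy==${numpy}.*"
fi
echo "would run: pip install ${spec}"
```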
We already test Spark master nightly; this is the current testing combination:
{% for python_version, spark_version, test_pyarrow_only in [("3.7", "v3.1.2", "false"),
("3.8", "v3.2.0", "false"),
("3.9", "master", "false")] %}
And the build for spark master is currently passing: https://github.com/ursacomputing/crossbow/actions/runs/3934958561/jobs/6730195747#step:5:10
Maybe we can add the numpy version to the task definition only for 3.2.0 and, if it is different from latest, install the pinned version. I am thinking of something like:
{% for python_version, spark_version, test_pyarrow_only, numpy_version in [("3.7", "v3.1.2", "false", "latest"),
("3.8", "v3.2.0", "false", "1.23"),
("3.9", "master", "false", "latest")] %}
And add the corresponding if to validate whether we have to install numpy or not?
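A hedged sketch of what that conditional could look like, assuming the task template forwards `numpy_version` to the docker build (the flag name and surrounding context are illustrative, not the actual crossbow task definition):

```jinja
{% if numpy_version != "latest" %}
  --build-arg numpy={{ numpy_version }}
{% endif %}
```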
@kiszk you are a Spark committer. I suppose this fix won't get backported to Spark 3.2.0 and we will always have to pin numpy for it? Should we update the tasks for our nightlies to test with Spark 3.3.0 and maybe remove 3.2.0?
…on-spark.dockerfile
Force-pushed from 427191d to d86c6a9
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
Revision: d86c6a9 Submitted crossbow builds: ursacomputing/crossbow @ actions-94d502001c
@github-actions crossbow submit test-conda-python-3.8-spark-v3.2.0
Revision: 7f76899 Submitted crossbow builds: ursacomputing/crossbow @ actions-b7b5dc6d5d
@github-actions crossbow submit test-conda-python--spark-
Revision: 7f76899 Submitted crossbow builds: ursacomputing/crossbow @ actions-8c12467062
@github-actions crossbow submit test-conda-python--spark-
Revision: b1b776d Submitted crossbow builds: ursacomputing/crossbow @ actions-5192a832e2
@github-actions crossbow submit test-conda-python--spark-
Revision: 9743842 Submitted crossbow builds: ursacomputing/crossbow @ actions-d30511e746
@github-actions crossbow submit test-conda-python--spark-
Revision: efdf9fc Submitted crossbow builds: ursacomputing/crossbow @ actions-268559d178
Looking at the logs, I think the line "#7 7.578 /bin/bash: /arrow/ci/scripts/install_numpy.sh: Permission denied" is the problem (the file wasn't copied, so installing numpy using the nonexistent file fails). However, I don't see any difference from what we already do in
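For context, a "Permission denied" from bash when invoking a script path usually means the file exists but lacks the executable bit (a genuinely missing file gives "No such file or directory" instead). The eventual fix applied in this PR is not shown in this excerpt; the snippet below is only a small sketch of the failure pattern and one common remedy, restoring the bit with `chmod +x`.

```shell
# Reproduce the failure pattern: a script without the executable bit
# cannot be invoked directly, which surfaces as "Permission denied".
script=$(mktemp)
printf '#!/bin/sh\necho ok\n' > "$script"
chmod -x "$script"                    # simulate a file copied without +x
"$script" 2>/dev/null || echo "direct invocation fails"
chmod +x "$script"                    # one possible fix: restore the bit
out=$("$script")                      # now runs normally
echo "$out"
rm -f "$script"
```

Invoking the file via `bash /path/to/script.sh` would also sidestep the missing bit, since the permission check then applies to `bash`, not the script file.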
Yeah, I am trying out different things locally but none work 🤷‍♀️ I also asked Raul for help, in case he has any ideas about what the issue could be.
@github-actions crossbow submit test-conda-python--spark-
Revision: b1b3b99 Submitted crossbow builds: ursacomputing/crossbow @ actions-5f44f20302
@github-actions crossbow submit test-conda-python--spark-
Revision: d56f4b7 Submitted crossbow builds: ursacomputing/crossbow @ actions-cce4baca43
@raulcd the fix is working now, thank you!
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Thanks @AlenkaF, I am going to trigger a final run just to validate and will merge once it finishes.
@github-actions crossbow submit test-conda-python--spark-
Revision: 1ebe276 Submitted crossbow builds: ursacomputing/crossbow @ actions-fd4f9f54ae
Benchmark runs are scheduled for baseline = f9a1d19 and contender = 4c1448e. 4c1448e is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Rationale for this change
Fix for the nightly integration test failure with PySpark 3.2.0.
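For background on the failure named in the title: NumPy 1.24 removed the long-deprecated alias `np.bool` (deprecated since 1.20), so older PySpark code that still references it raises `AttributeError` on newer NumPy, which is why the pin to 1.23 is needed until Spark's fix ships. A small sketch of the portable spelling:

```python
import numpy as np

# `np.bool` was removed in NumPy 1.24; the portable spellings are the
# builtin `bool` or the NumPy scalar type `np.bool_`, both of which
# produce the same boolean dtype.
value = np.array([True, False], dtype=np.bool_)
print(value.dtype)  # -> bool
```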
What changes are included in this PR?
NumPy version pin in docker-compose.yml.
Are these changes tested?
Will test on the open PR with the CI.
Are there any user-facing changes?
No.