GH-33697: [CI][Python] Nightly test for PySpark 3.2.0 fail with AttributeError on numpy.bool (#33714)

### Rationale for this change
Fixes the nightly integration test failure with PySpark 3.2.0, which raises an `AttributeError` on `numpy.bool`.

### What changes are included in this PR?
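A NumPy version pin for the PySpark 3.2.0 job, passed through `docker-compose.yml` as a build argument and installed by a new `ci/scripts/install_numpy.sh` script.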

### Are these changes tested?
They will be tested by the CI on the open PR.

### Are there any user-facing changes?
No.
* Closes: #33697

Lead-authored-by: Alenka Frim <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
AlenkaF and kou authored Mar 1, 2023
1 parent f9a1d19 commit 4c1448e
Showing 4 changed files with 44 additions and 4 deletions.
7 changes: 6 additions & 1 deletion ci/docker/conda-python-spark.dockerfile
@@ -23,11 +23,16 @@ FROM ${repo}:${arch}-conda-python-${python}
 ARG jdk=8
 ARG maven=3.5

+ARG numpy=latest
+COPY ci/scripts/install_numpy.sh /arrow/ci/scripts/
+
 RUN mamba install -q -y \
         openjdk=${jdk} \
         maven=${maven} \
         pandas && \
-    mamba clean --all
+    mamba clean --all && \
+    mamba uninstall -q -y numpy && \
+    /arrow/ci/scripts/install_numpy.sh ${numpy}

 # installing specific version of spark
 ARG spark=master
33 changes: 33 additions & 0 deletions ci/scripts/install_numpy.sh
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -e
+
+if [ $# -gt 1 ]; then
+  echo "Usage: $0 <optional numpy version = latest>"
+  exit 1
+fi
+
+numpy=${1:-"latest"}
+
+if [ "${numpy}" = "latest" ]; then
+  pip install numpy
+else
+  pip install numpy==${numpy}
+fi
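
The branching in `install_numpy.sh` can be checked without touching the environment. The following is a minimal dry-run sketch, not part of the commit: `numpy_pip_command` is a hypothetical helper that mirrors the script's argument handling but only prints the `pip` command it would run.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the version selection in install_numpy.sh.
# numpy_pip_command is a hypothetical helper (not in the commit); it prints
# the pip command instead of executing it.
set -e

numpy_pip_command() {
  # Accept at most one argument, as the script does.
  if [ $# -gt 1 ]; then
    echo "Usage: numpy_pip_command <optional numpy version = latest>" >&2
    return 1
  fi

  # Fall back to "latest" when no version is given.
  local numpy=${1:-"latest"}

  if [ "${numpy}" = "latest" ]; then
    echo "pip install numpy"
  else
    echo "pip install numpy==${numpy}"
  fi
}

numpy_pip_command          # prints: pip install numpy
numpy_pip_command latest   # prints: pip install numpy
numpy_pip_command 1.23     # prints: pip install numpy==1.23
```

The `1.23` case corresponds to the pin this commit adds for the PySpark 3.2.0 job. NumPy 1.24 removed the deprecated `numpy.bool` alias, which is consistent with the `AttributeError` named in the commit title.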
7 changes: 4 additions & 3 deletions dev/tasks/tasks.yml
@@ -1589,9 +1589,9 @@ tasks:
       image: conda-python-hdfs
 {% endfor %}

-{% for python_version, spark_version, test_pyarrow_only in [("3.7", "v3.1.2", "false"),
-                                                            ("3.8", "v3.2.0", "false"),
-                                                            ("3.9", "master", "false")] %}
+{% for python_version, spark_version, test_pyarrow_only, numpy_version in [("3.7", "v3.1.2", "false", "latest"),
+                                                                           ("3.8", "v3.2.0", "false", "1.23"),
+                                                                           ("3.9", "master", "false", "latest")] %}
   test-conda-python-{{ python_version }}-spark-{{ spark_version }}:
     ci: github
     template: docker-tests/github.linux.yml
@@ -1600,6 +1600,7 @@ tasks:
         PYTHON: "{{ python_version }}"
         SPARK: "{{ spark_version }}"
         TEST_PYARROW_ONLY: "{{ test_pyarrow_only }}"
+        NUMPY: "{{ numpy_version }}"
       # use the branch-3.0 of spark, so prevent reusing any layers
       flags: --no-leaf-cache
       image: conda-python-spark
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -1788,6 +1788,7 @@ services:
         # be set to ${MAVEN}
         maven: 3.5
         spark: ${SPARK}
+        numpy: ${NUMPY}
     shm_size: *shm-size
     environment:
       <<: *ccache