[RELEASE] cudf v24.10 #16943

Merged
merged 338 commits on Oct 9, 2024
6b0bff4
Disallow cudf.Series to accept column in favor of `._from_column` (#1…
mroeschke Aug 7, 2024
d11d2cf
Merge pull request #16505 from rapidsai/branch-24.08
GPUtester Aug 7, 2024
3fd8783
Add `stream` param to stream compaction APIs (#16295)
JayjeetAtGithub Aug 7, 2024
b933b54
Use tool.scikit-build.cmake.version, set scikit-build-core minimum-ve…
jameslamb Aug 8, 2024
c146eed
Expose `stream` param in transform APIs (#16452)
JayjeetAtGithub Aug 8, 2024
a94512a
Add interop example for `arrow::StringViewArray` to `cudf::column` (#…
JayjeetAtGithub Aug 8, 2024
cc75b05
Change IPv4 convert APIs to support UINT32 instead of INT64 (#16489)
davidwendt Aug 8, 2024
da51cad
Improve update-version.sh (#16506)
bdice Aug 8, 2024
792dd06
Update pre-commit hooks (#16510)
KyleFromNVIDIA Aug 8, 2024
1bbe440
Add keep option to distinct nvbench (#16497)
bdice Aug 8, 2024
2c8de62
enable list to be forced as string in JSON reader. (#16472)
karthikeyann Aug 9, 2024
9ec34ad
Remove a deprecated multibyte_split API (#16501)
davidwendt Aug 9, 2024
8009dc8
Update docs of the TPC-H derived examples (#16423)
JayjeetAtGithub Aug 9, 2024
4446cf0
Update json normalization to take device_buffer (#16520)
karthikeyann Aug 9, 2024
16aa0ea
Allow DataFrame.sort_values(by=) to select an index level (#16519)
mroeschke Aug 9, 2024
4cd87d3
Fix `date_range(start, end, freq)` when end-start is divisible by fre…
mroeschke Aug 9, 2024
45b20d1
Preserve array name in MultiIndex.from_arrays (#16515)
mroeschke Aug 9, 2024
a3dc14f
Disallow indexing by selecting duplicate labels (#16514)
mroeschke Aug 12, 2024
091cb72
Remove deprecated public APIs from libcudf (#16524)
davidwendt Aug 12, 2024
cce00c0
Pass batch size to JSON reader using environment variable (#16502)
shrshi Aug 12, 2024
e5f8dd3
Update the java code to properly deal with lists being returned as st…
revans2 Aug 12, 2024
7178bf2
Rework cudf::io::text::byte_range_info class member functions (#16518)
davidwendt Aug 13, 2024
419fb99
Fix all-empty input column for strings split APIs (#16466)
davidwendt Aug 13, 2024
3a791cb
Remove unneeded pair-iterator benchmark (#16511)
davidwendt Aug 13, 2024
3801f81
Remove hardcoded versions from workflows. (#16540)
bdice Aug 13, 2024
5780c4d
Register `read_parquet` and `read_csv` with dask-expr (#16535)
rjzamora Aug 13, 2024
cf3fabf
Ensure comparisons with pyints and integer series always succeed (#16…
seberg Aug 13, 2024
1f0d0c9
Change cudf::empty_like to not include offsets for empty strings colu…
davidwendt Aug 14, 2024
c20d6b3
Remove unneeded output size parameter from internal count_matches uti…
davidwendt Aug 14, 2024
bf3372b
Switch python version to `3.10` in `cudf.pandas` pandas test scripts …
galipremsagar Aug 14, 2024
d684ae0
Raise NotImplementedError for Series.rename that's not a scalar (#16525)
mroeschke Aug 14, 2024
0253e97
[FEA] Support named aggregations in `df.groupby().agg()` (#16528)
Matt711 Aug 15, 2024
19846b6
Disallow cudf.Index accepting column in favor of ._from_column (#16549)
mroeschke Aug 15, 2024
89863a3
Align public utility function signatures with pandas 2.x (#16565)
mroeschke Aug 15, 2024
2bcb7ec
Fix `.replace(Index, Index)` raising a TypeError (#16513)
mroeschke Aug 15, 2024
ac42bc8
Hide all gtest symbols in cudftestutil (#16546)
robertmaynard Aug 15, 2024
ed31523
Merge branch 'branch-24.08' into branch-24.10-merge-24.08
bdice Aug 15, 2024
6912246
Merge pull request #16571 from bdice/branch-24.10-merge-24.08
AyodeAwe Aug 15, 2024
f4a9b1c
Use more idiomatic cudf APIs in dask_cudf meta generation (#16487)
mroeschke Aug 15, 2024
1e220b7
Return Interval object in pandas compat mode for IntervalIndex reduct…
mroeschke Aug 15, 2024
5084135
Make NumericalColumn.__init__ strict (#16457)
mroeschke Aug 15, 2024
155edde
Make Timedelta/DatetimeColumn.__init__ strict (#16464)
mroeschke Aug 16, 2024
f955dd7
Rewrite remaining Python Arrow interop conversions using the C Data I…
vyasr Aug 16, 2024
1c63e1e
Initial investigation into NumPy proxying in `cudf.pandas` (#16286)
Matt711 Aug 16, 2024
e690d9d
Ensure size is always passed to NumericalColumn (#16576)
mroeschke Aug 16, 2024
e197d72
Replace `NativeFile` dependency in dask-cudf Parquet reader (#16569)
rjzamora Aug 16, 2024
623dfce
[FEA] Add support for `cudf.unique` (#16554)
Matt711 Aug 16, 2024
e16c2f2
Make (Indexed)Frame.__init__ require data (and index) (#16430)
mroeschke Aug 16, 2024
30011c5
Clean up reshaping ops (#16553)
mroeschke Aug 16, 2024
bc8ca9b
Setup pylibcudf package (#16299)
lithomas1 Aug 16, 2024
10cdd5f
Reenable arrow tests (#16556)
vyasr Aug 16, 2024
cb843db
Fix DataFrame reductions with median returning scalar instead of Seri…
mroeschke Aug 16, 2024
fd44adc
Make CategoricalColumn.__init__ strict (#16456)
mroeschke Aug 16, 2024
b63ba70
Add build job for pylibcudf (#16587)
vyasr Aug 17, 2024
dd2c12d
Revert "Make proxy NumPy arrays pass isinstance check in `cudf.pandas…
Matt711 Aug 17, 2024
592342c
Remove invalid column_view usage in string-scalar-to-column function …
davidwendt Aug 19, 2024
1b18cbc
Add `ToCudfBackend` expression to dask-cudf (#16573)
rjzamora Aug 19, 2024
0491778
MAINT: Adapt to numpy hiding flagsobject away (#16593)
seberg Aug 19, 2024
c516fc4
Make ListColumn.__init__ strict (#16465)
mroeschke Aug 19, 2024
074abcc
Add `public` qualifier for some member functions in Java class `Schem…
ttnghia Aug 19, 2024
79a5a97
Remove NativeFile support from cudf Python (#16589)
vyasr Aug 19, 2024
6ccc2c2
standardize and consolidate wheel installations in testing scripts (#…
jameslamb Aug 19, 2024
f2d13c9
make more use of YAML anchors in dependencies.yaml (#16597)
jameslamb Aug 19, 2024
3f6dd14
Make StructColumn.__init__ strict (#16467)
mroeschke Aug 19, 2024
a45af4a
Remove arrow_io_source (#16607)
vyasr Aug 20, 2024
3ac409d
Fix C++ and Cython io types (#16610)
vyasr Aug 20, 2024
2f7d354
bug-fix: cudf/io/json.hpp use after move (#16609)
NicolasDenoyelle Aug 20, 2024
1cccf3e
Replace usages of `thrust::optional` with `std::optional` (#15091)
miscco Aug 20, 2024
555734d
Remove thrust::optional from expression evaluator (#16604)
bdice Aug 20, 2024
b32bc10
do not install cudf in cudf_polars wheel tests (#16612)
jameslamb Aug 20, 2024
e450baf
remove streamz git dependency, standardize build dependency names, co…
jameslamb Aug 20, 2024
28fee97
Enable gtests previously disabled for compute-sanitizer bug (#16581)
davidwendt Aug 20, 2024
58799d6
Add stricter typing and validation to ColumnAccessor (#16602)
mroeschke Aug 20, 2024
8ab553c
Move libcudf reduction google-benchmarks to nvbench (#16564)
davidwendt Aug 21, 2024
6a2f323
Fix function parameters with common dependency modified during their …
ttnghia Aug 21, 2024
bf2ee32
DOC: Refresh pylibcudf guide (#15856)
lithomas1 Aug 21, 2024
6c4905d
Remove legacy Arrow interop APIs (#16590)
vyasr Aug 22, 2024
1fd9675
Fix overflow bug in low-memory JSON reader (#16632)
shrshi Aug 22, 2024
00ff2ee
[FEA] Add filesystem argument to `cudf.read_parquet` (#16577)
rjzamora Aug 22, 2024
81d71fc
update-version.sh fix (#16629)
AyodeAwe Aug 22, 2024
e4e867a
Annotate `ColumnAccessor._data` labels as `Hashable` (#16623)
mroeschke Aug 22, 2024
8b20298
Move pragma once in rolling/jit/operation.hpp. (#16636)
bdice Aug 22, 2024
eaefcb4
Support DecimalDtype meta in dask_cudf (#16634)
mroeschke Aug 22, 2024
83f68c9
Revert "Hide all gtest symbols in cudftestutil (#16546)" (#16644)
robertmaynard Aug 22, 2024
91f304e
Enable testing `cudf.pandas` unit tests for all minor versions of pan…
galipremsagar Aug 23, 2024
8d6b261
adding wheel build for libcudf (#15483)
msarahan Aug 23, 2024
a7ca3af
Add the missing `num_aggregations` axis for `groupby_max_cardinality`…
PointKernel Aug 23, 2024
7bd14a5
Add pylibcudf build dir in build.sh for `clean` (#16648)
galipremsagar Aug 23, 2024
7ca6a8c
fix libcudf wheel publishing, make package-type explicit in wheel pub…
jameslamb Aug 23, 2024
508bdea
Rebuild for & Support NumPy 2 (#16300)
jakirkham Aug 24, 2024
96f2cc5
Remove CUDA whole compilation ODR violations (#16603)
robertmaynard Aug 26, 2024
a250391
Revise `get_reader_filepath_or_buffer` to handle a list of data sourc…
rjzamora Aug 26, 2024
d15d470
Preserve Series name in duplicated method. (#16655)
bdice Aug 26, 2024
f511322
bug-fix: Don't enable the CUDA language if testing was requested when…
cryos Aug 26, 2024
c4591c0
Use non-mangled type names in nvbench output (#16649)
davidwendt Aug 27, 2024
115ddce
Fix integer overflow in indexalator pointer logic (#16643)
davidwendt Aug 27, 2024
efa9770
Drop Python 3.9 support (#16637)
jameslamb Aug 27, 2024
f1cc962
Fix `cudf::rank` not getting enough params (#16666)
JayjeetAtGithub Aug 27, 2024
2d494ed
Add `num_multiprocessors` utility (#16628)
PointKernel Aug 27, 2024
dd585e8
Prune workflows based on changed files (#16642)
KyleFromNVIDIA Aug 27, 2024
6747d2d
Update rapidsai/pre-commit-hooks (#16669)
KyleFromNVIDIA Aug 27, 2024
1a2aad2
Remove arrow dependency (#16640)
vyasr Aug 27, 2024
d0e5cdf
Allow for binops between two differently sized DecimalDtypes (#16638)
mroeschke Aug 27, 2024
88de8dd
Fix interval_range right child non-zero offset (#16651)
mroeschke Aug 27, 2024
e2a15cb
Fix strings::detail::copy_range when target contains nulls (#16626)
davidwendt Aug 27, 2024
d1412e0
Rework strings::slice benchmark to use nvbench (#16563)
davidwendt Aug 27, 2024
60f30d8
Use `make_host_vector` instead of `make_std_vector` to facilitate pin…
vuule Aug 28, 2024
1a96e4c
Fix loc/iloc.__setitem__[:, loc] with non cupy types (#16677)
mroeschke Aug 28, 2024
569939f
Fix slowdown in DataFrame repr in jupyter notebook (#16656)
galipremsagar Aug 28, 2024
5491b39
switch from typing.Callable to collections.abc.Callable (#16670)
jameslamb Aug 28, 2024
c600a65
Update documentation for Dask cuDF (#16671)
rjzamora Aug 28, 2024
872e01e
Fix slowdown in `CategoricalIndex.__repr__` (#16665)
galipremsagar Aug 28, 2024
dba6c1f
Remove build_categorical_column in favor of CategoricalColumn constru…
mroeschke Aug 28, 2024
925530a
Relax Arrow pin (#16681)
vyasr Aug 28, 2024
fbd6114
Support reading matching projected and filter cols from Parquet files…
mhaseeb123 Aug 28, 2024
9e9efcc
Replace raw device_memory_resource pointer in pylibcudf Cython (#16674)
harrism Aug 28, 2024
f6e2355
Handle `ordered` parameter in `CategoricalIndex.__repr__` (#16683)
galipremsagar Aug 29, 2024
f2d153b
Have interval_range use IntervalIndex.from_breaks, remove column_empt…
mroeschke Aug 29, 2024
eca5108
Disable gtests/ERROR_TEST during compute-sanitizer memcheck test (#16…
davidwendt Aug 29, 2024
21d05d7
Move apply_boolean_mask benchmark to nvbench (#16616)
davidwendt Aug 29, 2024
8c7af08
Increase timeouts for couple of tests (#16692)
galipremsagar Aug 29, 2024
53f488b
Add type annotations to Index classes, utilize _from_column more (#16…
mroeschke Aug 30, 2024
8f2d687
Refactor dictionary encoding in PQ writer to migrate to the new `cuco…
mhaseeb123 Aug 30, 2024
f932bf9
Fix Series.to_frame(name=None) setting a None name (#16698)
mroeschke Aug 30, 2024
62a53b3
[FEA] Add third-party library integration testing of cudf.pandas to c…
Matt711 Aug 30, 2024
23fb31e
Add a libcudf/thrust-based TPC-H derived datagen (#16294)
JayjeetAtGithub Aug 30, 2024
5a81a80
[BUG] Add gpu node type to cudf-pandas 3rd-party integration nightly …
Matt711 Aug 30, 2024
2d6758f
Enable batched multi-source reading of JSONL files with large records…
shrshi Aug 30, 2024
c6c720f
Implement exposed null mask APIs in pylibcudf (#15908)
charlesbluca Aug 30, 2024
5e420ff
Use merge base when calculating changed files (#16709)
KyleFromNVIDIA Aug 30, 2024
4ad4b23
remove some unnecessary libcudf nightly builds (#16714)
jameslamb Aug 31, 2024
7605958
Remove java ColumnView.copyWithBooleanColumnAsValidity (#16660)
revans2 Sep 1, 2024
557aabf
Ensure we pass the has_nulls tparam to mixed_join kernels (#16708)
abellina Sep 3, 2024
25779d9
Add boost-devel to Java CI Docker image (#16707)
jlowe Sep 3, 2024
0097b45
Fix typo in column_factories.hpp comment from 'depth 1' to 'depth 2' …
a-hirota Sep 3, 2024
e18b537
Use Series._from_column more consistently to avoid validation (#16716)
mroeschke Sep 3, 2024
a83ac6f
Add return type annotations to MultiIndex (#16696)
mroeschke Sep 3, 2024
fa1486e
Remove ERROR_TEST gtest from libcudf (#16722)
davidwendt Sep 3, 2024
26091a4
Refactor cudf pandas integration tests CI (#16728)
Matt711 Sep 4, 2024
1b6f02d
Multi-file and Parquet-aware prefetching from remote storage (#16657)
rjzamora Sep 4, 2024
ad1369d
CI: Test against old versions of key dependencies (#16570)
seberg Sep 4, 2024
e1ab1e7
Make isinstance check pass for proxy ndarrays (#16601)
Matt711 Sep 5, 2024
949f171
Performance improvement for strings::slice for wide strings (#16574)
davidwendt Sep 5, 2024
0cc059f
Upgrade to nvcomp 4.0.1 (#16076)
vuule Sep 5, 2024
0e86f62
Add performance tips to cudf.pandas FAQ. (#16693)
bdice Sep 5, 2024
715677e
Add libcudf example with large strings (#15983)
davidwendt Sep 5, 2024
7018a33
Add support for Python 3.12, update Kafka dependencies to 2.5.x (#16745)
jameslamb Sep 6, 2024
8d8faef
allow pandas patch version to float in cudf-pandas unit tests (#16763)
jameslamb Sep 6, 2024
f97f61c
Remove xfail from torch-cudf.pandas integration test (#16705)
Matt711 Sep 6, 2024
aa08fdb
[DOC] Remove out of date section from cudf.pandas docs (#16697)
Matt711 Sep 6, 2024
4784067
Check index bounds in compact protocol reader. (#16493)
bdice Sep 7, 2024
26a81b6
Allow read_csv(header=None) to return int column labels in `mode.pand…
mroeschke Sep 9, 2024
150f1b1
Add labeling APIs to pylibcudf (#16761)
mroeschke Sep 9, 2024
92f0197
Simplify the nvCOMP adapter (#16762)
vuule Sep 9, 2024
f21979e
Extend the Parquet writer's dictionary encoding benchmark. (#16591)
mhaseeb123 Sep 10, 2024
afd3a4b
Add libcudf wrappers around current_device_resource functions. (#16679)
harrism Sep 10, 2024
afc9f4f
Add labeling pylibcudf doc pages (#16779)
mroeschke Sep 10, 2024
6dd5689
use libkvikio wheels in wheel builds (#16778)
jameslamb Sep 10, 2024
5192b88
Fix empty cluster handling in tdigest merge (#16675)
jihoonson Sep 10, 2024
c3d323d
Move NDS-H examples into benchmarks (#16663)
JayjeetAtGithub Sep 11, 2024
4cdb1bf
[FEA] Add support for `cudf.NamedAgg` (#16744)
Matt711 Sep 11, 2024
750adca
nvCOMP GZIP integration (#16770)
vuule Sep 11, 2024
9acbaf8
JSON reader validation of values (#15968)
karthikeyann Sep 11, 2024
985f671
Fix slice_strings wide strings logic with multi-byte characters (#16777)
davidwendt Sep 11, 2024
0b32f55
Fix nvbench output for sha512 (#16773)
davidwendt Sep 11, 2024
e063baa
Support reading multiple PQ sources with mismatching nullability for …
mhaseeb123 Sep 11, 2024
1b402df
Recommending `miniforge` for conda install (#16782)
mmccarty Sep 11, 2024
3dbc33a
Revert "Fix empty cluster handling in tdigest merge (#16675)" (#16800)
jihoonson Sep 12, 2024
124d3e3
Migrate dask-cudf README improvements to dask-cudf sphinx docs (#16765)
rjzamora Sep 16, 2024
4033385
Java: Make ColumnVector.fromViewWithContiguousAllocation public (#16784)
jlowe Sep 16, 2024
86861e0
Fix `cov`/`corr` bug in dask-cudf (#16786)
rjzamora Sep 17, 2024
1f40520
Merge branch-24.08 into branch-24.10
bdice Sep 17, 2024
f8d5063
Add ability to set parquet row group max #rows and #bytes in java (#1…
pmattione-nvidia Sep 17, 2024
7285efb
Support drop_first in get_dummies (#16795)
mroeschke Sep 17, 2024
250a73a
Fix pylibcudf imports, branches, and more.
bdice Sep 17, 2024
27c29eb
Use cupy 12.2.0 as oldest dependency pinning on CUDA 12 ARM (#16808)
bdice Sep 17, 2024
23351aa
Word-based nvtext::minhash function (#15368)
davidwendt Sep 17, 2024
e98e109
Support multiple new-line characters in regex APIs (#15961)
davidwendt Sep 17, 2024
a112f68
Add io_type axis with default `PINNED_BUFFER` to nvbench PQ multithre…
mhaseeb123 Sep 17, 2024
4291f26
Clean up cudf dependency in cudf_polars.__init__.
bdice Sep 17, 2024
57ae3e3
Enable cudf.pandas REPL and -c command support (#16428)
bdice Sep 18, 2024
44a9c10
Add a benchmark to study Parquet reader's performance for wide tables…
mhaseeb123 Sep 18, 2024
2a9a8f5
use get-pr-info from nv-gha-runners (#16819)
AyodeAwe Sep 18, 2024
2a3026d
Change the Parquet writer's `default_row_group_size_bytes` from 128MB…
mhaseeb123 Sep 18, 2024
e68f55c
Refactor mixed_semi_join using cuco::static_set (#16230)
srinivasyadav18 Sep 18, 2024
42c5324
Use CI workflow branch 'branch-24.10' again (#16832)
jameslamb Sep 18, 2024
a0c6fc8
Rename the NDS-H benchmark binaries (#16831)
JayjeetAtGithub Sep 19, 2024
30e3946
Whitespace normalization of nested column coerced as string column in…
shrshi Sep 19, 2024
dafb3e7
Generate GPU vs CPU usage metrics per pytest file in pandas testsuite…
galipremsagar Sep 19, 2024
3886c7c
Download pylibcudf wheel when testing polars itself
wence- Sep 19, 2024
9df13d1
No cover for 1.6 IR changes
wence- Sep 19, 2024
8782a1d
Improve aggregation documentation (#16822)
PointKernel Sep 19, 2024
e9b5b53
Add string.repeats API to pylibcudf (#16834)
mroeschke Sep 19, 2024
51c2dd6
Add string.contains APIs to pylibcudf (#16814)
mroeschke Sep 19, 2024
7233da9
Remove `MultiIndex._poplevel` inplace implementation. (#16767)
mroeschke Sep 19, 2024
3a1b718
Merge branch 'branch-24.10' into branch-24.10-merge-24.08
bdice Sep 19, 2024
272a703
Add string.extract APIs to pylibcudf (#16823)
mroeschke Sep 19, 2024
cb38c19
Merge branch 'branch-24.10' into branch-24.10-merge-24.08
bdice Sep 19, 2024
944312d
Fix package name.
bdice Sep 19, 2024
5809bfc
Merge branch 'branch-24.10-merge-24.08' of github.com:bdice/cudf into…
bdice Sep 19, 2024
8e1345f
Intentionally leak thread_local CUDA resources to avoid crash (part 1…
kingcrimsontianyu Sep 19, 2024
d63ca6a
Access Frame attributes instead of ColumnAccessor attributes when ava…
mroeschke Sep 19, 2024
dc57c1b
Revert "Refactor mixed_semi_join using cuco::static_set" (#16855)
mhaseeb123 Sep 20, 2024
2676924
Switch to using native `traceback` (#16851)
galipremsagar Sep 20, 2024
866b642
Merge branch 'branch-24.10' into branch-24.10-merge-24.08
wence- Sep 20, 2024
8a1d652
Fix branch in shared-workflow pointer
wence- Sep 20, 2024
e278018
cmake-format
wence- Sep 20, 2024
434f99b
Pacify ruff
wence- Sep 20, 2024
2fb0186
Add cudf.pandas dependencies.yaml to update-version.sh (#16840)
raydouglass Sep 20, 2024
9834a3a
Update xfailing tests in polars test suite
wence- Sep 20, 2024
f71f53a
JSON tree algorithm code reorg (#16836)
karthikeyann Sep 20, 2024
eeb4bae
Merge pull request #16813 from bdice/branch-24.10-merge-24.08
AyodeAwe Sep 20, 2024
69ab988
Exposed stream-ordering to join API (#16793)
lamarrr Sep 20, 2024
b165210
Add best practices page to Dask cuDF docs (#16821)
rjzamora Sep 20, 2024
ed2f9f6
Add transform APIs to pylibcudf (#16760)
mroeschke Sep 21, 2024
96d2f81
Update labeler for pylibcudf (#16868)
vyasr Sep 21, 2024
9b4c4c7
Exposed stream-ordering to datetime API (#16774)
lamarrr Sep 21, 2024
0870051
Improve Polars docs (#16820)
bdice Sep 23, 2024
6255906
Merge pull request #16873 from rapidsai/branch-24.08
GPUtester Sep 23, 2024
389208c
Ignore numba warning specific to ARM runners (#16872)
galipremsagar Sep 23, 2024
8b12cf4
Update fmt (to 11.0.2) and spdlog (to 1.14.1). (#16806)
jameslamb Sep 23, 2024
6badd6b
Add in support for setting delim when parsing JSON through java (#168…
revans2 Sep 24, 2024
b3518ab
Add in option for Java JSON APIs to do column pruning in CUDF (#16796)
revans2 Sep 24, 2024
f8db575
Update update-version.sh to use packaging lib (#16891)
AyodeAwe Sep 24, 2024
73fa557
Update oldest deps for `pyarrow` & `numpy` (#16883)
galipremsagar Sep 24, 2024
22cefc9
Fix metadata after implicit array conversion from Dask cuDF (#16842)
rjzamora Sep 25, 2024
9316309
Remove unnecessary flag from build.sh (#16879)
vyasr Sep 25, 2024
03c77c2
Add string.findall APIs to pylibcudf (#16825)
mroeschke Sep 25, 2024
dbe5528
[FEA] Add an environment variable to fail on fallback in `cudf.pandas…
Matt711 Sep 25, 2024
75c5c83
Add dask-cudf workaround for missing `rename_axis` support in cudf (#…
rjzamora Sep 25, 2024
4160423
Pin polars for 24.10 and update polars test suite xfail list (#16886)
wence- Sep 25, 2024
ef27082
Build `cudf-polars` with `build.sh` (#16898)
brandon-b-miller Sep 25, 2024
b92d008
Fix DataFrame.drop(columns=cudf.Series/Index, axis=1) (#16712)
mroeschke Sep 25, 2024
d11ec7a
[DOC] Update Pylibcudf doc strings (#16810)
Matt711 Sep 25, 2024
8e78424
Optimization of tdigest merge aggregation. (#16780)
nvdbaranec Sep 25, 2024
f7c5d32
Display deltas for `cudf.pandas` test summary (#16864)
galipremsagar Sep 25, 2024
987fea3
JSON tree algorithms refactor I: CSR data structure for column tree (…
shrshi Sep 25, 2024
ba4afae
Make tests deterministic (#16910)
galipremsagar Sep 25, 2024
e42b91b
Add polars to "all" dependency list. (#16875)
bdice Sep 25, 2024
c1f377a
Migrate ORC reader to pylibcudf (#16042)
lithomas1 Sep 25, 2024
503ce03
Add transpose API to pylibcudf (#16749)
mroeschke Sep 25, 2024
0425963
Add experimental `filesystem="arrow"` support in `dask_cudf.read_parq…
rjzamora Sep 25, 2024
c7f6a22
Add string.attributes APIs to pylibcudf (#16785)
mroeschke Sep 25, 2024
12ee360
[REVIEW] JSON host tree algorithms (#16545)
shrshi Sep 26, 2024
61af769
Add io/timezone APIs to pylibcudf (#16771)
mroeschke Sep 26, 2024
b00a718
Add partitioning APIs to pylibcudf (#16781)
mroeschke Sep 26, 2024
742eaad
Fix links in Dask cuDF documentation (#16929)
rjzamora Sep 26, 2024
f20491d
Parse newline as whitespace character while tokenizing JSONL inputs w…
shrshi Sep 30, 2024
8a9df04
Add license to the pylibcudf wheel (#16976)
raydouglass Oct 2, 2024
319a533
Update Changelog [skip ci]
raydouglass Oct 9, 2024
6 changes: 3 additions & 3 deletions .devcontainer/cuda11.8-conda/devcontainer.json
@@ -5,17 +5,17 @@
 "args": {
 "CUDA": "11.8",
 "PYTHON_PACKAGE_MANAGER": "conda",
-"BASE": "rapidsai/devcontainers:24.08-cpp-cuda11.8-mambaforge-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.10-cpp-cuda11.8-mambaforge-ubuntu22.04"
 }
 },
 "runArgs": [
 "--rm",
 "--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.08-cuda11.8-conda"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda11.8-conda"
 ],
 "hostRequirements": {"gpu": "optional"},
 "features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.8": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
 },
 "overrideFeatureInstallOrder": [
 "ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
6 changes: 3 additions & 3 deletions .devcontainer/cuda11.8-pip/devcontainer.json
@@ -5,17 +5,17 @@
 "args": {
 "CUDA": "11.8",
 "PYTHON_PACKAGE_MANAGER": "pip",
-"BASE": "rapidsai/devcontainers:24.08-cpp-cuda11.8-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.10-cpp-cuda11.8-ubuntu22.04"
 }
 },
 "runArgs": [
 "--rm",
 "--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.08-cuda11.8-pip"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda11.8-pip"
 ],
 "hostRequirements": {"gpu": "optional"},
 "features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.8": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
 },
 "overrideFeatureInstallOrder": [
 "ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
6 changes: 3 additions & 3 deletions .devcontainer/cuda12.5-conda/devcontainer.json
@@ -5,17 +5,17 @@
 "args": {
 "CUDA": "12.5",
 "PYTHON_PACKAGE_MANAGER": "conda",
-"BASE": "rapidsai/devcontainers:24.08-cpp-mambaforge-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.10-cpp-mambaforge-ubuntu22.04"
 }
 },
 "runArgs": [
 "--rm",
 "--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.08-cuda12.5-conda"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda12.5-conda"
 ],
 "hostRequirements": {"gpu": "optional"},
 "features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.8": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
 },
 "overrideFeatureInstallOrder": [
 "ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
6 changes: 3 additions & 3 deletions .devcontainer/cuda12.5-pip/devcontainer.json
@@ -5,17 +5,17 @@
 "args": {
 "CUDA": "12.5",
 "PYTHON_PACKAGE_MANAGER": "pip",
-"BASE": "rapidsai/devcontainers:24.08-cpp-cuda12.5-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.10-cpp-cuda12.5-ubuntu22.04"
 }
 },
 "runArgs": [
 "--rm",
 "--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.08-cuda12.5-pip"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda12.5-pip"
 ],
 "hostRequirements": {"gpu": "optional"},
 "features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.8": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
 },
 "overrideFeatureInstallOrder": [
 "ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
2 changes: 1 addition & 1 deletion .github/labeler.yml
@@ -12,7 +12,7 @@ cudf.polars:
 - 'python/cudf_polars/**'

 pylibcudf:
-- 'python/cudf/cudf/_lib/pylibcudf/**'
+- 'python/pylibcudf/**'

 libcudf:
 - 'cpp/**'
71 changes: 59 additions & 12 deletions .github/workflows/build.yaml
@@ -28,7 +28,7 @@ concurrency:
 jobs:
 cpp-build:
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
@@ -37,7 +37,7 @@ jobs:
 python-build:
 needs: [cpp-build]
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
@@ -46,7 +46,7 @@ jobs:
 upload-conda:
 needs: [cpp-build, python-build]
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
@@ -57,19 +57,63 @@ jobs:
 if: github.ref_type == 'branch'
 needs: python-build
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.10
 with:
 arch: "amd64"
 branch: ${{ inputs.branch }}
 build_type: ${{ inputs.build_type || 'branch' }}
-container_image: "rapidsai/ci-conda:latest"
+container_image: "rapidsai/ci-conda:cuda12.5.1-ubuntu22.04-py3.11"
 date: ${{ inputs.date }}
 node_type: "gpu-v100-latest-1"
 run_script: "ci/build_docs.sh"
 sha: ${{ inputs.sha }}
+wheel-build-libcudf:
+secrets: inherit
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+with:
+# build for every combination of arch and CUDA version, but only for the latest Python
+matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
+build_type: ${{ inputs.build_type || 'branch' }}
+branch: ${{ inputs.branch }}
+sha: ${{ inputs.sha }}
+date: ${{ inputs.date }}
+script: ci/build_wheel_libcudf.sh
+wheel-publish-libcudf:
+needs: wheel-build-libcudf
+secrets: inherit
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+with:
+build_type: ${{ inputs.build_type || 'branch' }}
+branch: ${{ inputs.branch }}
+sha: ${{ inputs.sha }}
+date: ${{ inputs.date }}
+package-name: libcudf
+package-type: cpp
+wheel-build-pylibcudf:
+needs: [wheel-publish-libcudf]
+secrets: inherit
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+with:
+build_type: ${{ inputs.build_type || 'branch' }}
+branch: ${{ inputs.branch }}
+sha: ${{ inputs.sha }}
+date: ${{ inputs.date }}
+script: ci/build_wheel_pylibcudf.sh
+wheel-publish-pylibcudf:
+needs: wheel-build-pylibcudf
+secrets: inherit
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+with:
+build_type: ${{ inputs.build_type || 'branch' }}
+branch: ${{ inputs.branch }}
+sha: ${{ inputs.sha }}
+date: ${{ inputs.date }}
+package-name: pylibcudf
+package-type: python
 wheel-build-cudf:
+needs: wheel-publish-pylibcudf
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
@@ -79,17 +123,18 @@ jobs:
 wheel-publish-cudf:
 needs: wheel-build-cudf
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
 sha: ${{ inputs.sha }}
 date: ${{ inputs.date }}
 package-name: cudf
+package-type: python
 wheel-build-dask-cudf:
 needs: wheel-publish-cudf
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
 with:
 # This selects "ARCH=amd64 + the latest supported Python + CUDA".
 matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
@@ -101,17 +146,18 @@ jobs:
 wheel-publish-dask-cudf:
 needs: wheel-build-dask-cudf
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
 sha: ${{ inputs.sha }}
 date: ${{ inputs.date }}
 package-name: dask_cudf
+package-type: python
 wheel-build-cudf-polars:
-needs: wheel-publish-cudf
+needs: wheel-publish-pylibcudf
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
 with:
 # This selects "ARCH=amd64 + the latest supported Python + CUDA".
 matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
@@ -123,13 +169,14 @@ jobs:
 wheel-publish-cudf-polars:
 needs: wheel-build-cudf-polars
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
 with:
 build_type: ${{ inputs.build_type || 'branch' }}
 branch: ${{ inputs.branch }}
 sha: ${{ inputs.sha }}
 date: ${{ inputs.date }}
 package-name: cudf_polars
+package-type: python
 trigger-pandas-tests:
 if: inputs.build_type == 'nightly'
 needs: wheel-build-cudf
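The `needs:` keys in `build.yaml` chain the wheel jobs so that libcudf is built and published before pylibcudf, and pylibcudf before cudf and cudf-polars. The following is a rough sketch of that ordering, not part of the PR: the `NEEDS` mapping is transcribed by hand from the `needs:` entries in the diff, and `run_order` and `stages` are hypothetical helper names. It uses only the standard-library `graphlib` module, and only approximates how GitHub Actions schedules jobs.

```python
from graphlib import TopologicalSorter

# Job -> jobs it waits on, transcribed from the `needs:` keys in build.yaml.
# wheel-build-libcudf declares no `needs`, so it maps to an empty set.
# Conda jobs, docs-build and trigger-pandas-tests are left out on purpose.
NEEDS = {
    "wheel-build-libcudf": set(),
    "wheel-publish-libcudf": {"wheel-build-libcudf"},
    "wheel-build-pylibcudf": {"wheel-publish-libcudf"},
    "wheel-publish-pylibcudf": {"wheel-build-pylibcudf"},
    "wheel-build-cudf": {"wheel-publish-pylibcudf"},
    "wheel-publish-cudf": {"wheel-build-cudf"},
    "wheel-build-dask-cudf": {"wheel-publish-cudf"},
    "wheel-publish-dask-cudf": {"wheel-build-dask-cudf"},
    "wheel-build-cudf-polars": {"wheel-publish-pylibcudf"},
    "wheel-publish-cudf-polars": {"wheel-build-cudf-polars"},
}


def run_order(needs):
    """Return one valid sequential order that respects every dependency."""
    return list(TopologicalSorter(needs).static_order())


def stages(needs):
    """Group jobs into waves; jobs in the same wave could run in parallel."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(stages(NEEDS), start=1):
        print(number, ", ".join(wave))
```

Under this reading, changing `wheel-build-cudf-polars` to need `wheel-publish-pylibcudf` instead of `wheel-publish-cudf` lets the cudf-polars wheel build in the same wave as the cudf wheel rather than after it.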
5 changes: 3 additions & 2 deletions .github/workflows/pandas-tests.yaml
@@ -17,9 +17,10 @@ jobs:
 pandas-tests:
 # run the Pandas unit tests
 secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.08
+uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.10
 with:
-matrix_filter: map(select(.ARCH == "amd64" and .PY_VER == "3.9" and (.CUDA_VER | startswith("12.5.")) ))
+# This selects "ARCH=amd64 + the latest supported Python + CUDA".
+matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
 build_type: nightly
 branch: ${{ inputs.branch }}
 date: ${{ inputs.date }}
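The two jq `matrix_filter` expressions in these workflows are dense, so here is a rough Python equivalent of what they appear to select. This is only a sketch. The function names and the sample matrix are hypothetical, the real matrix comes from the shared workflows, and tie-breaking between entries with equal keys is not modeled. Versions are compared as tuples of integers, which mirrors `split(".")|map(tonumber)` and avoids the string comparison that would rank "3.9" above "3.10".

```python
from itertools import groupby


def version_key(version):
    """Turn '12.5.1' into (12, 5, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_python_per_arch_and_cuda_major(matrix):
    """Sketch of the wheel-build-libcudf filter.

    One entry per (ARCH, CUDA major version): the one with the newest Python.
    """
    def group(entry):
        return (entry["ARCH"], version_key(entry["CUDA_VER"])[0])

    rows = sorted(matrix, key=group)  # groupby needs sorted input
    return [
        max(members, key=lambda e: version_key(e["PY_VER"]))
        for _, members in groupby(rows, key=group)
    ]


def latest_amd64_per_cuda_major(matrix):
    """Sketch of the 'ARCH=amd64 + latest supported Python + CUDA' filter.

    One amd64 entry per CUDA major version, preferring the newest Python,
    then the newest CUDA.
    """
    def group(entry):
        return version_key(entry["CUDA_VER"])[0]

    rows = sorted((e for e in matrix if e["ARCH"] == "amd64"), key=group)
    return [
        max(members, key=lambda e: (version_key(e["PY_VER"]), version_key(e["CUDA_VER"])))
        for _, members in groupby(rows, key=group)
    ]


# Hypothetical matrix, for illustration only.
SAMPLE_MATRIX = [
    {"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "11.8.0"},
    {"ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0"},
    {"ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.0.1"},
    {"ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.5.1"},
    {"ARCH": "arm64", "PY_VER": "3.10", "CUDA_VER": "11.8.0"},
    {"ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "12.5.1"},
]

if __name__ == "__main__":
    for entry in latest_amd64_per_cuda_major(SAMPLE_MATRIX):
        print(entry)
```

Compared with the previous pandas-tests filter, which pinned `PY_VER == "3.9"` and CUDA 12.5, the new filter does not hard-code a Python version, so it keeps working after Python 3.9 support is dropped.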