Releases: rapidsai/cuml
v24.12.00
🚨 Breaking Changes
- Forward merge Branch 24.10 into 24.12 (#6106) @divyegala
🐛 Bug Fixes
- Fix SSL error. (#6177) @bdice
- Fix scikit-learn version specifier (#6158) @trxcllnt
- Correctly handle missing categorical data in experimental FIL (#6132) @wphicks
- Put a ceiling on cuda-python (#6131) @bdice
- Don't presume pointers are mutually exclusive for device/host. (#6128) @robertmaynard
- cuml SINGLEGPU now tells cuvs to not build with nccl/mg support (#6127) @robertmaynard
- Remove type from pickle header for CumlArray (#6120) @wphicks
- Forward merge Branch 24.10 into 24.12 (#6106) @divyegala
- Fix Dask estimators serialization prior to training (#6065) @viclafargue
🚀 New Features
- Enable HDBSCAN gpu training and cpu inference (#6108) @divyegala
🛠️ Improvements
- Update FIL tests to use XGBoost UBJSON instead of binary (#6153) @hcho3
- Use sparse knn / distances from cuvs (#6143) @benfred
- Ensure MG to have the same number of allreduce calls in mean_stddev for sparse matrix to avoid hanging (#6141) @lijinf2
- Stop excluding cutlass from symbol exclusion check (#6140) @vyasr
- Optimize MG variance calculation for dataset standardization for logistic regression (#6138) @lijinf2
- enforce wheel size limits, README formatting in CI (#6136) @jameslamb
- Experimental command line interface UX (#6135) @dantegd
- add telemetry (#6126) @msarahan
- Make cuVS optional if CUML_ALGORITHMS is set (#6125) @hcho3
- devcontainer: replace VAULT_HOST with AWS_ROLE_ARN (#6118) @jjacobelli
- print sccache stats in builds (#6111) @jameslamb
- fix version in Doxygen docs (#6104) @jameslamb
- make conda installs in CI stricter (#6103) @jameslamb
- Make get_param_names a class method on single GPU estimators to match Scikit-learn closer (#6101) @dantegd
- Prune workflows based on changed files (#6094) @KyleFromNVIDIA
- Update all rmm imports to use pylibrmm/librmm (#6084) @Matt711
- Merge branch 24.10 into branch 24.12 (#6083) @jameslamb
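For the get_param_names change (#6101) listed above, the sketch below illustrates why a class method is convenient: parameter names can be read without constructing an estimator. SketchEstimator and its parameters are hypothetical and are not part of cuML. Reading the names from the constructor signature is an assumption made for illustration, not a description of how cuML implements it.

```python
import inspect


class SketchEstimator:
    """Hypothetical estimator used only for illustration; not a cuML class."""

    def __init__(self, alpha=1.0, max_iter=100):
        self.alpha = alpha
        self.max_iter = max_iter

    @classmethod
    def get_param_names(cls):
        # Read the names from the constructor signature, so the method
        # works on the class itself and needs no instance.
        params = inspect.signature(cls.__init__).parameters
        return [name for name in params if name != "self"]


# Called on the class, with no estimator constructed.
print(SketchEstimator.get_param_names())
```

With an instance method, the same call would first require building an estimator, which can be costly or impossible when required arguments are unknown.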
[NIGHTLY] v25.02.00
🐛 Bug Fixes
- Adjust test_kmeans to avoid false positive failures (#6193) @dantegd
- Adjust margin of logistic regression log_proba pytest to avoid false positive failures (#6188) @dantegd
- Skip flaky test of kernel_density in nightly job of CUDA 12.0.1 (#6184) @dantegd
- Try to reduce network usage in cuML tests. (#6174) @bdice
- cuML dask fixes to unblock CI (#6170) @dantegd
🛠️ Improvements
- Remove sphinx pinning (#6195) @vyasr
- update telemetry actions to fluent-bit friendly style (#6186) @msarahan
- Update version references in workflow (#6172) @AyodeAwe
- gate telemetry dispatch calls on TELEMETRY_ENABLED env var (#6171) @msarahan
- Use estimator tags to improve sparse error handling (#6151) @dantegd
- prefer system install of UCX in pip devcontainers, update outdated RAPIDS references (#6149) @jameslamb
- Improve infrastructure for experimental dispatching of non existing methods in cuML (#6148) @dantegd
- Adapt to rmm logger changes (#6147) @vyasr
- Require approval to run CI on draft PRs (#6145) @bdice
- Add breaking change workflow trigger (#6130) @AyodeAwe
- Switch to native traceback in cuml (#6078) @galipremsagar
v24.10.00
🚨 Breaking Changes
🐛 Bug Fixes
- Fix train_test_split for string columns (#6088) @dantegd
- Stop shadowing free function (#6076) @vyasr
- Set default values for conftest options. (#6067) @bdice
- Add license file to conda packages (#6061) @raydouglass
- Fix np.NAN to np.nan. (#6056) @bdice
- Reenable pytest cuml-dask for CUDA 12.5 wheel CI tests (#6051) @divyegala
- Fix for simplicial_set_embedding (#6043) @viclafargue
- MAINT: Allow for error message to contain np.float32(1.0) (#6030) @seberg
- Stop exporting fill_k kernel as that causes ODR violations (#6021) @robertmaynard
- Avoid cudf column APIs after cudf.Series disallows column inputs (#6019) @mroeschke
- Use HDBSCAN package pin to 0.8.38 (#5906) @divyegala
📖 Documentation
- Update UMAP doc (#6064) @viclafargue
- Update README in experimental FIL (#6052) @hcho3
- add docs for simplicial_set (#6042) @Intron7
🚀 New Features
- TSNE CPU/GPU Interop (#6063) @divyegala
- Enable GPU fit and CPU transform in UMAP (#6032) @divyegala
🛠️ Improvements
- Migrate to use cuVS for vector search (#6085) @benfred
- Support all-zeroes feature vectors for MG sparse logistic regression (#6082) @lijinf2
- Update update-version.sh to use packaging lib (#6081) @AyodeAwe
- Use CI workflow branch 'branch-24.10' again (#6072) @jameslamb
- Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda host dependencies (#6071) @jameslamb
- Update flake8 to 7.1.1. (#6070) @bdice
- Add support for Python 3.12, update to umap-learn==0.5.6 (#6060) @jameslamb
- Fix compiler warning about signed vs unsigned ints (#6053) @hcho3
- Update rapidsai/pre-commit-hooks (#6048) @KyleFromNVIDIA
- Drop Python 3.9 support (#6040) @jameslamb
- Add use_cuda_wheels matrix entry (#6038) @KyleFromNVIDIA
- Switch debug build to RelWithDebInfo (#6033) @rongou
- Remove NumPy <2 pin (#6031) @seberg
- Remove old dask-glm based logistic regression (#6028) @dantegd
- [FEA] UMAP API for building with batched NN Descent (#6022) @jinsolp
- Enabling CPU/GPU interop for SVM, DBSCAN and KMeans (#6020) @viclafargue
- Update pre-commit hooks (#6016) @KyleFromNVIDIA
- Improve update-version.sh (#6014) @bdice
- Use tool.scikit-build.cmake.version, set scikit-build-core minimum-version (#6012) @jameslamb
- Merge branch-24.08 into branch-24.10 (#5981) @jameslamb
- Use CUDA math wheels (#5966) @KyleFromNVIDIA
v24.08.00
🐛 Bug Fixes
- Fixes for encoders/transformers for cudf.pandas (#5990) @dantegd
- BUG: remove sample parameter from pca call to mean (#5980) @mfoerste4
- Fix segfault and other errors in ForestInference.load_from_sklearn (#5973) @hcho3
- Rename .devcontainers for CUDA 12.5 (#5967) @jakirkham
- [MNT] Small NumPy 2 related fixes (#5954) @seberg
- CI Fix: use ld_preload to avoid libgomp issue on ARM jobs (#5949) @dantegd
- Fix for benchmark runner to handle parameter sweeps of multiple data types (#5938) @dantegd
- Avoid extra memory copy when using cp.concatenate in cuml.dask kmeans (#5937) @dantegd
- Assign correct labels_ in cuml.dask.kmeans (#5931) @dantegd
- Fix nightly jobs by updating hypothesis strategies to account for sklearn change (#5925) @dantegd
- Fix for SVC fit_proba not using class weights (#5912) @pablotanner
- Fix cudf.pandas failure on test_convert_input_dtype (#5885) @dantegd
- Fix cudf.pandas failure on test_convert_matrix_order_cuml_array (#5882) @dantegd
- Simplify cuml array (#5166) @wence-
🚀 New Features
- [FEA] Enable UMAP to build knn graph using NN Descent (#5910) @jinsolp
- Allow estimators to accept any dtype (#5888) @dantegd
🛠️ Improvements
- Add support for XGBoost UBJSON in FIL (#6009) @hcho3
- split up CUDA-suffixed dependencies in dependencies.yaml (#5974) @jameslamb
- Use workflow branch 24.08 again (#5970) @KyleFromNVIDIA
- Bump Treelite to 4.3.0 (#5968) @hcho3
- reduce memory_footprint for sparse PCA transform (#5964) @Intron7
- Build and test with CUDA 12.5.1 (#5963) @KyleFromNVIDIA
- Support int64 index type in MG sparse LogisticRegression (#5962) @lijinf2
- Add CUDA_STATIC_MATH_LIBRARIES (#5959) @KyleFromNVIDIA
- skip CMake 3.30.0 (#5956) @jameslamb
- Make ci/run_cuml_dask_pytests.sh environment-agnostic again (#5950) @trxcllnt
- Use verify-alpha-spec hook (#5948) @KyleFromNVIDIA
- nest cuml one level deeper in python (#5944) @msarahan
- resolve dependency-file-generator warning, other rapids-build-backend followup (#5928) @jameslamb
- Adopt CI/packaging codeowners (#5923) @bdice
- Remove text builds of documentation (#5921) @vyasr
- Fix conflict of forward-merge #5905 of branch-24.06 into branch-24.08 (#5911) @dantegd
- Bump Treelite to 4.2.1 (#5908) @hcho3
- remove unnecessary 'setuptools' dependency (#5901) @jameslamb
- [FEA] PCA Initialization for TSNE (#5897) @aamijar
- Use rapids-build-backend (#5804) @KyleFromNVIDIA
v24.06.01
🐛 Bug Fixes
- [HOTFIX] Fix import of sklearn by using cpu_only_import (#5914) @dantegd
- Fix label binarize for binary class (#5900) @jinsolp
- Fix RandomForestClassifier return type (#5896) @jinsolp
- Fix nightly CI: remove deprecated creation of columns by using explicit dtype (#5880) @dantegd
- Fix DBSCAN allocates rbc index even if deactivated (#5859) @mfoerste4
- Remove gtest from dependencies.yaml (#5854) @robertmaynard
- Support expression-based Dask Dataframe API (#5835) @rjzamora
- Mark all kernels with internal linkage (#5764) @robertmaynard
- Fix build.sh clean command (#5730) @csadorf
📖 Documentation
- Update the developer's guide with new copyright hook (#5848) @KyleFromNVIDIA
🚀 New Features
- Always use a static gtest and gbench (#5847) @robertmaynard
🛠️ Improvements
- [HOTFIX] Add compatibility of imports with multiple Scikit-learn versions (#5922) @dantegd
- Support double precision in MNMG Logistic Regression (#5898) @lijinf2
- Reduce and rename cudf.pandas integrations jobs (#5890) @dantegd
- Fix building cuml with CCCL main (#5886) @trxcllnt
- Add optional CI job for integration tests with cudf.pandas (#5881) @dantegd
- Enable pytest failures on FutureWarnings/DeprecationWarnings (#5877) @mroeschke
- Remove return in test_lbfgs (#5875) @mroeschke
- Avoid dask_cudf.core imports (#5874) @bdice
- Support CPU object for train_test_split (#5873) @isVoid
- Only use functions in the limited API (#5871) @vyasr
- Replace deprecated distutils.version with packaging.version (#5868) @mroeschke
- Adjust deprecated cupy.sparse usage (#5867) @mroeschke
- Fix numpy 2.0 deprecations (#5866) @mroeschke
- Fix deprecated positional arg usage (#5865) @mroeschke
- Use int instead of float in random.randint (#5864) @mroeschke
- Migrate to {{ stdlib("c") }} (#5863) @hcho3
- Avoid deprecated API in notebook (#5862) @rjzamora
- Add dedicated handling for cudf.pandas wrapped Numpy arrays (#5861) @betatim
- Prepend devcontainer name with the username (#5860) @trxcllnt
- add --rm and --name to devcontainer run args (#5857) @trxcllnt
- Update pip devcontainers to UCX v1.15.0 (#5856) @trxcllnt
- Replace rmm::mr::device_memory_resource* with rmm::device_async_resource_ref (#5853) @harrism
- Update scikit-learn to 1.4 (#5851) @betatim
- Prevent undefined behavior when passing handle from Treelite to cuML FIL (#5849) @hcho3
- Adds missing files to update-version.sh (#5830) @AyodeAwe
- Enable all tests for arm arch (#5824) @galipremsagar
- Address PytestReturnNotNoneWarning in cuml tests (#5819) @mroeschke
- Handle binary classifier with all-0 labels (#5810) @hcho3
- Use pytest_cases.fixture to fix warnings. (#5798) @bdice
- Enable Dask tests with UCX-Py/UCXX in CI (#5697) @pentschev
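The random.randint item above (#5864) follows a standard-library change: recent Python versions deprecated, and later stopped performing, the automatic conversion of float arguments to integers in random.randint. A minimal sketch of the pattern, assuming a hypothetical helper pick_index that is not part of cuML:

```python
import random


def pick_index(n_items):
    # Hypothetical helper: n_items may arrive as a float (e.g. 10.0),
    # so convert it explicitly before calling randint.
    upper = int(n_items) - 1
    return random.randint(0, upper)  # bounds are inclusive on both ends


random.seed(0)
print(pick_index(10.0))
```

Converting at the call site keeps the code working across Python versions regardless of whether the caller passes an int or an integral float.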
v24.06.00
🐛 Bug Fixes
- [HOTFIX] Fix import of sklearn by using cpu_only_import (#5914) @dantegd
- Fix label binarize for binary class (#5900) @jinsolp
- Fix RandomForestClassifier return type (#5896) @jinsolp
- Fix nightly CI: remove deprecated creation of columns by using explicit dtype (#5880) @dantegd
- Fix DBSCAN allocates rbc index even if deactivated (#5859) @mfoerste4
- Remove gtest from dependencies.yaml (#5854) @robertmaynard
- Support expression-based Dask Dataframe API (#5835) @rjzamora
- Mark all kernels with internal linkage (#5764) @robertmaynard
- Fix build.sh clean command (#5730) @csadorf
📖 Documentation
- Update the developer's guide with new copyright hook (#5848) @KyleFromNVIDIA
🚀 New Features
- Always use a static gtest and gbench (#5847) @robertmaynard
🛠️ Improvements
- Support double precision in MNMG Logistic Regression (#5898) @lijinf2
- Reduce and rename cudf.pandas integrations jobs (#5890) @dantegd
- Fix building cuml with CCCL main (#5886) @trxcllnt
- Add optional CI job for integration tests with cudf.pandas (#5881) @dantegd
- Enable pytest failures on FutureWarnings/DeprecationWarnings (#5877) @mroeschke
- Remove return in test_lbfgs (#5875) @mroeschke
- Avoid dask_cudf.core imports (#5874) @bdice
- Support CPU object for train_test_split (#5873) @isVoid
- Only use functions in the limited API (#5871) @vyasr
- Replace deprecated distutils.version with packaging.version (#5868) @mroeschke
- Adjust deprecated cupy.sparse usage (#5867) @mroeschke
- Fix numpy 2.0 deprecations (#5866) @mroeschke
- Fix deprecated positional arg usage (#5865) @mroeschke
- Use int instead of float in random.randint (#5864) @mroeschke
- Migrate to {{ stdlib("c") }} (#5863) @hcho3
- Avoid deprecated API in notebook (#5862) @rjzamora
- Add dedicated handling for cudf.pandas wrapped Numpy arrays (#5861) @betatim
- Prepend devcontainer name with the username (#5860) @trxcllnt
- add --rm and --name to devcontainer run args (#5857) @trxcllnt
- Update pip devcontainers to UCX v1.15.0 (#5856) @trxcllnt
- Replace rmm::mr::device_memory_resource* with rmm::device_async_resource_ref (#5853) @harrism
- Update scikit-learn to 1.4 (#5851) @betatim
- Prevent undefined behavior when passing handle from Treelite to cuML FIL (#5849) @hcho3
- Adds missing files to update-version.sh (#5830) @AyodeAwe
- Enable all tests for arm arch (#5824) @galipremsagar
- Address PytestReturnNotNoneWarning in cuml tests (#5819) @mroeschke
- Handle binary classifier with all-0 labels (#5810) @hcho3
- Use pytest_cases.fixture to fix warnings. (#5798) @bdice
- Enable Dask tests with UCX-Py/UCXX in CI (#5697) @pentschev
v24.04.00
🐛 Bug Fixes
- Update pre-commit-hooks to v0.0.3 (#5816) @KyleFromNVIDIA
- Correct and adjust tolerances of mnmg logreg pytests (#5812) @dantegd
- Remove use of cudf.core.column.full. (#5794) @bdice
- Suppress all HealthChecks on test_split_datasets. (#5791) @bdice
- Suppress a hypothesis HealthCheck on test_split_datasets that fails in nightly CI. (#5790) @bdice
- [BUG] Fix MAX_THREADS_PER_SM on sm 89. (#5785) @trivialfis
- fix device to host copy not sync stream in logistic regression mg (#5766) @lijinf2
- Use cudf.Index instead of cudf.GenericIndex. (#5738) @bdice
- update RAPIDS dependencies to 24.4, refactor dependencies.yaml (#5726) @jameslamb
🚀 New Features
- Support CUDA 12.2 (#5711) @jameslamb
🛠️ Improvements
- Use conda env create --yes instead of --force (#5822) @bdice
- Bump Treelite to 4.1.2 (#5814) @hcho3
- Support standardization for sparse vectors in logistic regression MG (#5806) @lijinf2
- Update script input name (#5802) @AyodeAwe
- Add upper bound to prevent usage of NumPy 2 (#5797) @bdice
- Enable pytest failures on warnings from cudf (#5796) @mroeschke
- Use public cudf APIs where possible (#5795) @mroeschke
- Remove hard-coding of RAPIDS version where possible (#5793) @KyleFromNVIDIA
- Switch pytest-xdist algorithm to worksteal (#5792) @bdice
- Automate C++ include file grouping and ordering using clang-format (#5787) @harrism
- Add support for Python 3.11, require NumPy 1.23+ (#5786) @jameslamb
- [ENH] Let cuDF handle input types for label encoder. (#5783) @trivialfis
- Install test dependencies at the same time as cuml packages. (#5781) @bdice
- Update devcontainers to CUDA Toolkit 12.2 (#5778) @trxcllnt
- target branch-24.04 for GitHub Actions workflows (#5776) @jameslamb
- Add environment-agnostic scripts for running ctests and pytests (#5761) @trxcllnt
- Pandas 2.x support (#5758) @dantegd
- Update ops-bot.yaml (#5752) @AyodeAwe
- Forward-merge branch-24.02 to branch-24.04 (#5735) @bdice
- Replace local copyright check with pre-commit-hooks verify-copyright (#5732) @KyleFromNVIDIA
- DBSCAN utilize rbc eps_neighbors (#5728) @mfoerste4
v24.02.00
🚨 Breaking Changes
🐛 Bug Fixes
- [Hotfix] Fix FIL gtest (#5755) @hcho3
- Exclude tests from builds (#5754) @vyasr
- Fix ctest directory to ensure tests are executed (#5753) @bdice
- Synchronize stream in SVC memory test (#5729) @wphicks
- Fix shared-workflows repo name (#5723) @raydouglass
- Fix cupy dependency in pyproject.toml (#5705) @vyasr
- Only cufft offers a static_nocallback version of the library (#5703) @robertmaynard
🛠️ Improvements
- [Hotfix] Update GPUTreeSHAP to fix ARM build (#5747) @hcho3
- Disable HistGradientBoosting support for now (#5744) @hcho3
- Disable hnswlib feature in RAFT; pin pytest (#5733) @hcho3
- [LogisticRegressionMG] Support standardization with no data modification (#5724) @lijinf2
- Remove usages of rapids-env-update (#5716) @KyleFromNVIDIA
- Remove extraneous SKBUILD_BUILD_OPTIONS (#5714) @vyasr
- refactor CUDA versions in dependencies.yaml (#5712) @jameslamb
- Update to CCCL 2.2.0. (#5702) @bdice
- Migrate to Treelite 4.0 (#5701) @hcho3
- Use cuda::proclaim_return_type on device lambdas. (#5696) @bdice
- move _process_generic to base_return_types, avoid circular import (#5695) @dcolinmorgan
- Switch to scikit-build-core (#5693) @vyasr
- Fix all deprecated function calls in TUs where warnings are errors (#5692) @vyasr
- Remove CUML_BUILD_WHEELS and standardize Python builds (#5689) @vyasr
- Forward-merge branch-23.12 to branch-24.02 (#5657) @bdice
- Add cuML devcontainers (#5568) @trxcllnt
v23.12.00
🚨 Breaking Changes
🐛 Bug Fixes
- Update actions/labeler to v4 (#5686) @raydouglass
- updated docs around make_column_transformer change from .preprocessing to .compose (#5680) @taureandyernv
- Skip dask pytest NN hang in CUDA 11.4 CI (#5665) @dantegd
- Avoid hard import of sklearn in base module. (#5663) @csadorf
- CI: Pin clang-tidy to 15.0.7. (#5661) @csadorf
- Adjust assumption regarding valid cudf.Series dimensional input. (#5654) @csadorf
- Flatten cupy array before feeding to cudf.Series (#5651) @vyasr
- CI: Fix expected ValueError and dask-glm incompatibility (#5644) @csadorf
- Use drop_duplicates instead of unique for cudf's pandas compatibility mode (#5639) @vyasr
- Temporarily avoid pydata-sphinx-theme version 0.14.2. (#5629) @csadorf
- Fix type hint in split function. (#5625) @trivialfis
- Fix trying to get pointer to None in svm/linear.pyx (#5615) @yosider
- Reduce parallelism to avoid OOMs in wheel tests (#5611) @vyasr
📖 Documentation
- Update interoperability docs (#5633) @beckernick
- Update instructions for creating a conda build environment (#5628) @csadorf
🚀 New Features
- Basic implementation of OrdinalEncoder. (#5646) @trivialfis
🛠️ Improvements
- Build concurrency for nightly and merge triggers (#5658) @bdice
- [LogisticRegressionMG][FEA] Support training when dataset contains only one class (#5655) @lijinf2
- Use new rapids-dask-dependency metapackage for managing dask versions (#5649) @galipremsagar
- Simplify some logic in LabelEncoder (#5648) @vyasr
- Increase Nanny close timeout in LocalCUDACluster tests (#5636) @pentschev
- [LogisticRegressionMG] Support sparse vectors (#5632) @lijinf2
- Add rich HTML representation to estimators (#5630) @betatim
- Unpin dask and distributed for 23.12 development (#5627) @galipremsagar
- Update shared-action-workflows references (#5621) @AyodeAwe
- Use branch-23.12 workflows. (#5618) @bdice
- Update rapids-cmake functions to non-deprecated signatures (#5616) @robertmaynard
- Allow nightly dependencies and set up consistent nightly versions for conda and pip packages (#5607) @vyasr
- Forward-merge branch-23.10 to branch-23.12 (#5596) @bdice
- Build CUDA 12.0 ARM conda packages. (#5595) @bdice
- Enable multiclass svm for sparse input (#5588) @mfoerste4
v23.10.00
🚨 Breaking Changes
- add sample_weight parameter to dbscan.fit (#5574) @mfoerste4
- Update to Cython 3.0.0 (#5506) @vyasr
🐛 Bug Fixes
- Fix accidental unsafe cupy import (#5613) @dantegd
- Fixes for CPU package (#5599) @dantegd
- Fixes for timeouts in tests (#5598) @dantegd
🚀 New Features
- Enable cuml-cpu nightly (#5585) @dantegd
- add sample_weight parameter to dbscan.fit (#5574) @mfoerste4
🛠️ Improvements
- cuml-cpu notebook, docs and cluster models (#5597) @dantegd
- Pin dask and distributed for 23.10 release (#5592) @galipremsagar
- Add changes for early experimental support for dataframe interchange protocol API (#5591) @dantegd
- [FEA] Support L1 regularization and ElasticNet in MNMG Dask LogisticRegression (#5587) @lijinf2
- Update image names (#5586) @AyodeAwe
- Update to clang 16.0.6. (#5583) @bdice
- Upgrade to Treelite 3.9.1 (#5581) @hcho3
- Update to doxygen 1.9.1. (#5580) @bdice
- [REVIEW] Adding a few of datasets for benchmarking (#5573) @vinaydes
- Allow cuML MNMG estimators to be serialized (#5571) @viclafargue
- [FEA] Support multiple classes in multi-node-multi-gpu logistic regression, from C++, Cython, to Dask Python class (#5565) @lijinf2
- Use copy-pr-bot (#5563) @ajschmidt8
- Unblock CI for branch-23.10 (#5561) @csadorf
- Fix CPU-only build for new FIL (#5559) @hcho3
- [FEA] Support no regularization in MNMG LogisticRegression (#5558) @lijinf2
- Unpin dask and distributed for 23.10 development (#5557) @galipremsagar
- Branch 23.10 merge 23.08 (#5547) @vyasr
- Use Python builtins to prep benchmark tmp_dir (#5537) @jakirkham
- Branch 23.10 merge 23.08 (#5522) @vyasr
- Update to Cython 3.0.0 (#5506) @vyasr