YanxuanLiu released this 19 Sep 07:54
Release notes:
- Removed MAXINT limit on number of non-zero inputs per GPU for sparse logistic regression.
- Added IVF-PQ and CAGRA to the suite of supported approximate nearest neighbor algorithms.
- Extended benchmarking scripts to be compatible with Databricks runtime 13.3 with the spark-rapids plugin, and with runtimes 14.3 and 15.4 without the plugin.
- Included an experimental CLI for no-import-statement-change acceleration of pyspark.ml applications.
- Fixed a slowdown for inputs with a large number of columns when type conversion is required.
- Updated RAPIDS dependencies to 24.08.
- Known issues to be fixed in the next release:
  - For sparse logistic regression fit, a low-level C++/CUDA exception is raised if a partition has no non-zero data.
  - Array type inputs with int dtypes are not converted to float, leading to errors in some algorithms (e.g. CAGRA ANN).
  - In IVF-PQ based CAGRA, the intermediate graph degree must be <= 128 or a low-level C++ exception is raised.
  - The test_sparse_int64 test requires 256GB of host memory to run, not the 128GB stated in the comments.
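
Until the CAGRA graph degree issue is fixed, a caller-side check can turn the low-level C++ exception into an early, readable error. The following is a minimal sketch and not part of spark-rapids-ml: the helper name `check_cagra_params`, the dictionary keys `build_algo` and `intermediate_graph_degree`, and the assumption that `ivf_pq` is the default build algorithm are all illustrative and should be checked against the parameters you actually pass.

```python
# Hypothetical pre-flight check; this helper is NOT part of spark-rapids-ml.
# The key names below are assumptions about how the CAGRA build parameters
# are spelled in the caller's parameter dictionary.

# Limit taken from the known issue in these release notes.
MAX_IVF_PQ_INTERMEDIATE_GRAPH_DEGREE = 128


def check_cagra_params(algo_params):
    """Return algo_params unchanged, or raise ValueError if the IVF-PQ based
    CAGRA build would exceed the intermediate graph degree limit."""
    # Assumption: ivf_pq is the build algorithm when none is specified.
    build_algo = algo_params.get("build_algo", "ivf_pq")
    degree = algo_params.get("intermediate_graph_degree")

    if (
        build_algo == "ivf_pq"
        and degree is not None
        and degree > MAX_IVF_PQ_INTERMEDIATE_GRAPH_DEGREE
    ):
        raise ValueError(
            f"intermediate_graph_degree={degree} exceeds "
            f"{MAX_IVF_PQ_INTERMEDIATE_GRAPH_DEGREE} for the ivf_pq build"
        )
    return algo_params
```

Calling `check_cagra_params(params)` before handing `params` to the ANN estimator fails fast on the driver rather than inside a GPU task. Parameter sets that use a different build algorithm pass through unchanged.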
The pip package is available at https://pypi.org/project/spark-rapids-ml/24.08.0/