merge with upstream #27

daxiongshu · 2021-05-19T08:51:18Z

No description provided.

@ajschmidt8

Prepare Changelog for Automation (#3570) This PR prepares the changelog to be automatically updated during releases. The contents of the pre-release body linked in this PR will be copied into CHANGELOG.md at release time. Authors: - AJ Schmidt (@ajschmidt8) Approvers: - Dillon Cullinan (@dillon-cullinan) URL: #3570

@tfeher

closes #947 If the input data for SVM is not normalized correctly, convergence can be very slow, and the solver can even fail to converge. This PR detects such cases and prints a debug message with suggestions on how to fix the problem. Such problems were reported in #947, #1664, #2857, #3233. The reporting threshold is set so that the message is printed in those cases. I have tested several properly normalized cases to confirm that the message is not shown. Still, the threshold for printing the message does not have a proper theoretical justification, and false positives might occur. Therefore only a debug message is shown instead of a warning. Authors: - Tamas Bela Feher (@tfeher) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3562
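For readers who hit this debug message, the following is a minimal sketch of per-feature standardization in plain Python. It is an illustration only: `standardize` is a hypothetical helper, not a cuML function, and the exact suggestions printed by the debug message are not reproduced here.

```python
import math


def standardize(rows):
    """Scale each feature column to zero mean and unit variance.

    Hypothetical helper for illustration; not part of cuML.
    """
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        # Fall back to 1.0 for constant columns to avoid division by zero.
        stds.append(math.sqrt(var) or 1.0)
    return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]


# Two features on very different scales end up comparable after scaling.
X = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
X_scaled = standardize(X)
```

After scaling, both columns have the same range, which is the kind of input the SVM solver is assumed to handle well.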

@lowener

Closes #3319. This PR will replace the distance type from ML::MetricType to raft::distance::DistanceType. Since Raft DistanceType makes the distinction between the expanded and non-expanded distances in the name, I changed the C++ API to remove the boolean parameter `expanded` which becomes useless. Authors: - Micka (@lowener) Approvers: - Corey J. Nolet (@cjnolet) URL: #3389

@cjnolet

Authors: - Corey J. Nolet (@cjnolet) Approvers: - Divye Gala (@divyegala) - Dante Gama Dessavre (@dantegd) - John Zedlewski (@JohnZed) URL: #3579

@viclafargue

Answers #3459 Authors: - Victor Lafargue (@viclafargue) - Dante Gama Dessavre (@dantegd) Approvers: - AJ Schmidt (@ajschmidt8) - Dante Gama Dessavre (@dantegd) URL: #3509

@dantegd

`dask` and `distributed` are changing their default branch name from `master` to `main`. This will break our dev environments and CI, so this PR updates the required files. `distributed` has already merged the PR that makes the change, and `dask` will probably do the same very soon, so a PR that updates both seems to be the best approach. Authors: - Dante Gama Dessavre (@dantegd) Approvers: - William Hicks (@wphicks) - AJ Schmidt (@ajschmidt8) - @jakirkham - Dillon Cullinan (@dillon-cullinan) URL: #3593

@levsnv

Probabilities are bounded to [0.0, 1.0]. Also, we generally care more about large probabilities, which are `O(1/n_classes)`. The largest relative probability errors are usually caused by a small ground truth probability (e.g. 1e-3), as opposed to a large absolute error. Hence, relative probability error is not the best metric; absolute probability error is more relevant. Moreover, absolute probability error is more stable, as relative errors have a long tail. When training or even inferring on many rows, the chance of getting a ground truth probability of around 1e-3 or 1e-4 grows. In some cases, there is no reasonable and reliable threshold. Last, if the number of predicted probabilities (clipped values) per input row grows, so does the long tail of relative probability errors, due to less undersampling. This unfairly compares binary classification with regression, and multiclass classification with binary classification. The changes below are based on collecting absolute errors under `--run_unit`, `--run_quality` and `--run_stress`. These thresholds are violated at most a couple of times per million samples, in most cases never. Authors: - @levsnv Approvers: - John Zedlewski (@JohnZed) - Andy Adinets (@canonizer) URL: #3582
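To make the argument concrete, here is a small sketch comparing the two metrics. The numbers and the helper names `abs_error` and `rel_error` are made up for illustration and are not taken from the FIL tests.

```python
def abs_error(pred, truth):
    """Absolute probability error."""
    return abs(pred - truth)


def rel_error(pred, truth):
    """Relative probability error; grows without bound as truth approaches 0."""
    return abs(pred - truth) / abs(truth)


# Same absolute deviation (5e-4), very different ground-truth magnitudes.
cases = [(0.5005, 0.5), (0.0015, 0.001)]
for pred, truth in cases:
    print(f"truth={truth}: abs={abs_error(pred, truth):.1e}, rel={rel_error(pred, truth):.1e}")
```

Both cases have the same absolute error, but the relative error of the second case is hundreds of times larger purely because the ground truth is small, which is why a single relative threshold is hard to choose.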

@wphicks

Provide flag to enable ccache for building C, C++, and CUDA code Authors: - William Hicks (@wphicks) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3566

@lowener

Just a small fix of duplicates in KMeans doc. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3595

@levsnv

Authors: - @levsnv Approvers: - Andy Adinets (@canonizer) - John Zedlewski (@JohnZed) URL: #2894

@wphicks

Provide forward declarations where possible to reduce unnecessary includes Eliminate unneeded or redundant includes Authors: - William Hicks (@wphicks) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3564

@ajschmidt8

The tag used for pre-releases was recently changed, so this PR updates the link in the changelog. Authors: - AJ Schmidt (@ajschmidt8) Approvers: - Jordan Jacobelli (@Ethyling) URL: #3601

@hcho3

* Add link to https://docs.rapids.ai/notices/rdn0002/, which lays out instructions for building cuML with GCC 7.5
* Update GCC requirement to 7.5 or later

Closes #3604

Authors:
- Philip Hyunsu Cho (@hcho3)

Approvers:
- Dante Gama Dessavre (@dantegd)

URL: #3605

@lowener

Closes #3598. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3600

@lowener

Closes #3576. This PR is a fix for CUDA arrays that don't have the `strides` property or that set this property to `None`. Since it seems to be compliant with CUDA array interface v2 we should be able to support it. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3594
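As an illustration of the case being fixed, the sketch below derives byte strides for an interface dictionary whose `strides` entry is missing or `None`, which the CUDA array interface defines as a C-contiguous layout. `effective_strides` is a hypothetical helper, not the function changed in this PR, and the `typestr` parsing assumes the simple `"<f4"`-style form.

```python
def effective_strides(interface):
    """Return byte strides for a __cuda_array_interface__-style dict.

    When `strides` is absent or None, the array is C-contiguous, so the
    strides are derived from `shape` and the element size.
    Hypothetical helper for illustration only.
    """
    strides = interface.get("strides")
    if strides is not None:
        return tuple(strides)
    # typestr looks like "<f4": byte order, type kind, item size in bytes.
    itemsize = int(interface["typestr"][2:])
    result = []
    step = itemsize
    # Walk dimensions from last to first; the last axis is the fastest moving.
    for dim in reversed(interface["shape"]):
        result.append(step)
        step *= dim
    return tuple(reversed(result))


no_strides = {"shape": (3, 4), "typestr": "<f4"}
none_strides = {"shape": (3, 4), "typestr": "<f4", "strides": None}
print(effective_strides(no_strides), effective_strides(none_strides))
```

Treating the missing key and the explicit `None` identically keeps the calling code free of special cases.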

@mike-wendt

- Updates the stale GHA to enable more operations per run to account for the large number of issues in this repo
- Prevents `inactive-30d` labels from being applied to issues/PRs that have an `inactive-90d` label

Authors:
- Mike Wendt (@mike-wendt)

Approvers:
- Ray Douglass (@raydouglass)

URL: #3613

@mike-wendt

Reverts #3613 The changes made to the number of operations resulted in using all available GH API calls across the org which prevents other GHAs from running in other repos. This reverts the change until a better solution can be determined on how to proceed Authors: - Mike Wendt (@mike-wendt) Approvers: - Ray Douglass (@raydouglass) URL: #3614

@hcho3

Closes #3347. Make the `predict()` and `predict_proba()` functions of RF match those in the scikit-learn RF.

* Eliminate the parameter `output_class`. Instead, `predict()` will always produce class predictions, and `predict_proba()` will always produce probability predictions. (This applies to binary and multi-class classifiers. Regressors will only have `predict()`.)
* Remove the `threshold` parameter from `predict_proba()`.

Authors:
- Philip Hyunsu Cho (@hcho3)

Approvers:
- John Zedlewski (@JohnZed)

URL: #3609
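The split between the two methods can be pictured with a toy classifier. This is a sketch under stated assumptions: `ToyForestClassifier` is a hypothetical stand-in, not cuML's `RandomForestClassifier`, and deriving the class as the argmax of the averaged probabilities follows the scikit-learn convention rather than anything confirmed by this PR.

```python
class ToyForestClassifier:
    """Hypothetical stand-in illustrating the split API; not cuML's class."""

    def __init__(self, trees):
        # Each tree is a callable: row -> list of class probabilities.
        self.trees = trees

    def predict_proba(self, X):
        """Always returns probabilities: the average over all trees."""
        n = len(self.trees)
        out = []
        for row in X:
            votes = [tree(row) for tree in self.trees]
            out.append([sum(col) / n for col in zip(*votes)])
        return out

    def predict(self, X):
        """Always returns class indices: argmax of the averaged probabilities."""
        return [max(range(len(p)), key=p.__getitem__) for p in self.predict_proba(X)]


trees = [
    lambda row: [0.9, 0.1] if row[0] < 0.5 else [0.2, 0.8],
    lambda row: [0.7, 0.3] if row[0] < 0.6 else [0.4, 0.6],
]
model = ToyForestClassifier(trees)
X = [[0.1], [0.9]]
print(model.predict(X), model.predict_proba(X))
```

With no `output_class` or `threshold` switches, each method has exactly one return type, which is the point of the change.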

@cjnolet

Closes #3610 Authors: - Corey J. Nolet (@cjnolet) Approvers: - Micka (@lowener) - John Zedlewski (@JohnZed) URL: #3612

@lowener

Closes #1596. The documentation already existed, it was just not listed in the api.rst for Sphinx. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3617

@divyegala

closes #3584 Authors: - Divye Gala (@divyegala) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3619

@viclafargue

Closes #3484 Imports sklearn's Pipeline and GridSearch meta-estimators into cuML namespace for ease-of-use. Authors: - Victor Lafargue (@viclafargue) Approvers: - William Hicks (@wphicks) - John Zedlewski (@JohnZed) URL: #3493

* Removing sparse prims since they've been moved to raft
* Updating copyrights
* Updating raft hash
* Setting libcumprims to 0.18 for now
* Using fused l2 nn from raft
* Fixing style
* Updating copyright
* Using raft hash to make CI build
* Moving cumlprims conda recipe back to minor_version
* Updating style
* Updating raft hash to point to my branch until raft pr is merged
* Removing tests that are no longer needed
* Updating raft hash to branch-0.19
* Updating raft hash
* Updating nccl version
* Updating includes
* Removing files from bad merge

@vinaydes

…e computations (#3586) Before this PR, when the new/experimental backend was used for training, the temporary memory needed by the old backend was also allocated. This PR fixes the issue: the temporary memory is now allocated conditionally. This PR also changes the computation of quantiles for the new backend. The old way of computing quantiles could leave out the last few samples due to incorrect quantile thresholds. The impact on accuracy is still to be evaluated thoroughly. Authors: - Vinay Deshpande (@vinaydes) Approvers: - Thejaswi. N. S (@teju85) - Philip Hyunsu Cho (@hcho3) URL: #3586

@venkywonka

* This PR partially solves the issue raised [here](#3089 (comment)).
* Removes unused `DecisionTreeParams` struct in `randomforest_shared.pxd`.
* Unifies the different APIs (namely `set_rf_params`, `set_all_rf_params`, `set_rf_class_obj`) into a single point of parameter initialization (as `set_rf_params`) in the C++ layer, and propagates the changes.

Authors:
- Venkat (@venkywonka)
- John Zedlewski (@JohnZed)

Approvers:
- Philip Hyunsu Cho (@hcho3)
- John Zedlewski (@JohnZed)
- Thejaswi. N. S (@teju85)

URL: #3358

@cjnolet

Closes #3518 I've closed the original PR (#3308), which included both SLHC & HDBSCAN, and opened this PR to include only the SLHC changes. This PR contains an implementation of SLHC which is currently split across RAFT & cuML. Once we move the dense pairwise distance primitive over to RAFT, the entire SLHC algorithm can live in RAFT so it can be shared w/ cugraph, and will just be exposed through cuml. If reviewing this PR, please also review the corresponding RAFT PR: rapidsai/raft#140 Authors: - Corey J. Nolet (@cjnolet) Approvers: - Divye Gala (@divyegala) - Dante Gama Dessavre (@dantegd) - Mike Wendt (@mike-wendt) URL: #3545

@hcho3

Closes #3627 Authors: - Philip Hyunsu Cho (@hcho3) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3634

@viclafargue

Answers #2868 Authors: - Victor Lafargue (@viclafargue) Approvers: - Micka (@lowener) - Dante Gama Dessavre (@dantegd) URL: #3533

@hcho3

Closes #3064. Treelite supports ExtraTreeRegressor and ExtraTreeClassifier starting from version 1.0.0, so this is just a matter of exposing the capability to FIL. Also add ExtraTreeRegressor / ExtraTreeClassifier to the FIL test matrix. Authors: - Philip Hyunsu Cho (@hcho3) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3635

@divyegala

closes #3590 Authors: - Divye Gala (@divyegala) Approvers: - William Hicks (@wphicks) - John Zedlewski (@JohnZed) URL: #3642

…3762) This commit fixes the Dockerfile reference to libcumlcomms and other references inside the Python README.md document. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Authors: - Julio Faracco (https://github.com/jcfaracco) Approvers: - Ray Douglass (https://github.com/raydouglass) - Dante Gama Dessavre (https://github.com/dantegd) URL: #3762

…arameter threading (#3800) As we keep adding new API arguments, there are necessary changes to the public function signatures, but also unnecessary changes to every `load*` call to propagate the new option through all code paths. This change will simplify all future changes. Authors: - https://github.com/levsnv Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3800

Following the update to cupy 8.5.0, the bad read in the `cupy.percentile` kernel should no longer be an issue, allowing us to remove the xfail on this test. Close #2933 Authors: - William Hicks (https://github.com/wphicks) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3804

Remove `defaults` channel from conda build Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - AJ Schmidt (https://github.com/ajschmidt8) URL: #3803

Since CuPy 9.0 update, the CuPy module doesn't have the `core` namespace anymore. This PR updates the cuML code accordingly. Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3806

Remove conda defaults channel in builddocs environment Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #3815

Before this, running a call to autoarima.forecast would run into the following issue:

```bash
Traceback (most recent call last):
  File "bla2.py", line 17, in <module>
    model.search(s=12, d=1)
  File "/home/galahad/miniconda3/envs/cumlbench-019/lib/python3.8/site-packages/cuml/internals/api_decorators.py", line 360, in inner
    return func(*args, **kwargs)
  File "cuml/tsa/auto_arima.pyx", line 388, in cuml.tsa.auto_arima.AutoARIMA.search
AttributeError: 'CumlArray' object has no attribute 'reshape'
```

which was not caught since there is no forecast pytest for autoarima, only for arima

Authors:
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Louis Sugy (https://github.com/Nyrio)
- John Zedlewski (https://github.com/JohnZed)

URL: #3811

closes #3792 Authors: - Divye Gala (https://github.com/divyegala) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3808

Closes #3705. This PR is fixing the naive implementation of Hellinger distance for the tests. I also simplified the code of the tests, but the fix is in the raft update. Authors: - Micka (https://github.com/lowener) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #3736
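For reference, a naive Hellinger distance can be written in a few lines. This sketch uses the textbook definition for discrete probability distributions and is not the test code from this PR or the RAFT implementation.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Textbook form: sqrt(0.5 * sum((sqrt(p_i) - sqrt(q_i))**2)).
    """
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


print(hellinger([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(hellinger([1.0, 0.0], [0.0, 1.0]))  # disjoint support
```

The distance is 0 for identical distributions and reaches its maximum of 1 when the supports are disjoint, which makes these two cases convenient sanity checks for any reference implementation.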

…merge-0.19

Merge `branch-0.19` into `branch-0.20`

…#3825) Remove `rapidsai-nightly` conda channel when building main branch Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #3825

Remove progress output on conda packages upload Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - AJ Schmidt (https://github.com/ajschmidt8) URL: #3828

) Match the sklearn train_test_split to accept any input column. It has to be from the input X. Previously only accepted bool. Closes #3623 Authors: - Nanthini (https://github.com/Nanthini10) Approvers: - Devin Robison (https://github.com/drobison00) - Dante Gama Dessavre (https://github.com/dantegd) URL: #3749

@teju85

Closes issue #3832

Related to #3767

cc @teju85 and @venkywonka and @vinaydes who are working on RF

Will be profiling the solution before flipping the PR to ready to review

Quick profiling on a 2070S laptop, average of 10 runs of a simple LinearRegression.fit (that expects data in `F` format), with an `X` matrix of 500 columns and 100000 rows shows:

- Before the fix:

```
common.input_utils.input_to_cuml_array : 0.1795 s
```

- After the fix:

```
common.input_utils.input_to_cuml_array : 0.0632 s
```

Authors:
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- William Hicks (https://github.com/wphicks)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #3835

This PR enables benchmarking cuML with the help of NVTX ranges. The `nvtx_benchmark.py` script produces a simple display of runtime measurements. To produce the measurements, run `nvtx_benchmark.py <command>`, e.g. `python nvtx_benchmark.py "python test.py"`. The command should be given as the first argument (note the quotes).

Currently, the following runtimes will be displayed:

- The `fit`, `transform`, `predict`, `fit_transform`, and `fit_predict` functions of cuML estimators
- Random dataset generator functions such as `make_classification`, `make_regression`, and `make_blobs`
- Utilities such as the `input_to_cuml_array`, `input_to_cupy_array` and `input_to_host_array` functions and some of the methods in the `CumlArray` and `SparseCumlArray` classes
- NVTX ranges from the C++ layer
- cuDF NVTX ranges

Here is an example with the following script:

```
from cuml.datasets import make_blobs
from cuml.manifold import UMAP

X, y = make_blobs(n_samples=1000, n_features=30)
model = UMAP()
model.fit(X)
embeddings = model.transform(X)
```

that produces this profiling result:

```
datasets.make_blobs : 1.3571 s
Utils summary:
common.input_utils.input_to_cuml_array : 0.0002 s
common.CumlArray.__init__ : 0.0000 s
common.CumlArray.to_output : 0.0000 s

manifold.umap.fit [0x7f10eb69d4f0] : 0.6629 s
|> umap::unsupervised::fit : 0.6611 s
|==> umap::knnGraph : 0.4693 s
|==> umap::simplicial_set : 0.0015 s
|==> umap::embedding : 0.1902 s
Utils summary:
common.input_utils.input_to_cuml_array : 0.0015 s
common.CumlArray.__init__ : 0.0001 s
common.CumlArray.zeros : 0.0001 s
common.CumlArray.full : 0.0001 s

manifold.umap.transform [0x7f10eb69d4f0] : 0.0934 s
|> umap::transform : 0.0925 s
|==> umap::knnGraph : 0.0909 s
|==> umap::smooth_knn : 0.0002 s
|==> umap::optimization : 0.0011 s
Utils summary:
common.input_utils.input_to_cuml_array : 0.0005 s
common.CumlArray.__init__ : 0.0001 s
common.CumlArray.zeros : 0.0001 s
common.CumlArray.full : 0.0001 s
common.CumlArray.to_output : 0.0000 s
```

Authors:
- Victor Lafargue (https://github.com/viclafargue)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #3770

Answers #3820. This PR fixes the broadcast feature of the Random Forest estimator. The weights used by the reduction step were generated incorrectly. The correct values are to be deduced, for each chunk to be predicted, from the number of estimators trained by the specific worker holding that chunk. The values wrongly used previously were the number of estimators held by each worker in the order of their construction. Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3833
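The role of the weights in the reduction step can be sketched as a weighted average. This is only an illustration of why the weights must line up with the partial results being combined; `combine_partial_predictions` is a hypothetical helper, and the actual Dask reduction in cuML may be organized differently.

```python
def combine_partial_predictions(partials, n_estimators):
    """Weighted average of partial predictions for one chunk.

    partials[m][i]: prediction of partial model m for row i.
    n_estimators[m]: number of trees behind partial model m, used as its weight.
    Hypothetical helper for illustration; not cuML's reduction code.
    """
    total = sum(n_estimators)
    n_rows = len(partials[0])
    return [
        sum(w * part[i] for w, part in zip(n_estimators, partials)) / total
        for i in range(n_rows)
    ]


# A partial model with 30 trees should count three times as much as one with 10.
partials = [[1.0, 0.0], [0.0, 1.0]]
print(combine_partial_predictions(partials, [30, 10]))
print(combine_partial_predictions(partials, [10, 30]))  # mismatched order flips the result
```

Swapping the weight order changes the combined prediction, which is the kind of error that results when weights are listed in construction order rather than matched to the worker that produced each partial result.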

- Shuffle `stratify` column for consistent shuffle - Add `random_state` for reproducible results in test Closes #3839 Authors: - Nanthini (https://github.com/Nanthini10) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3841

Answers #3813 Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3836

Closes #798. Authors: - Micka (https://github.com/lowener) Approvers: - Victor Lafargue (https://github.com/viclafargue) - Divye Gala (https://github.com/divyegala) - Dante Gama Dessavre (https://github.com/dantegd) URL: #3757

Update the release script to take a parameter with the new version instead of calculating the new version. Authors: - Ray Douglass (https://github.com/raydouglass) Approvers: - Dillon Cullinan (https://github.com/dillon-cullinan) - AJ Schmidt (https://github.com/ajschmidt8) URL: #3852

Related: rapidsai/raft#229

We recently discovered a memory error in the `devArrMatch()` function: https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci-v0.20/job/cuml/job/prb/job/cuml-gpu-test/CUDA=11.0,GPU_LABEL=gpu,OS=ubuntu16.04,PYTHON=3.7/161/console

```
13:09:34 [----------] 44 tests from FilTests/TreeliteDenseFilTest
13:09:34 [ RUN ] FilTests/TreeliteDenseFilTest.Import/0
13:09:34 *** Error in `./test/ml': free(): invalid pointer: 0x00007f632b691fe8 ***
13:09:34 ======= Backtrace: =========
13:09:34 /lib/x86_64-linux-gnu/libc.so.6(+0x777f5)[0x7f632b3447f5]
13:09:34 /lib/x86_64-linux-gnu/libc.so.6(+0x8038a)[0x7f632b34d38a]
13:09:34 /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f632b35158c]
13:09:34 /workspace/ci/artifacts/cuml/cpu/conda_work/cpp/build/libcuml++.so(_ZN8treelite9ModelImplIffED0Ev+0xf5)[0x7f632c556405]
```

which was traced to the `devArrMatch()` function as follows:

```
$ valgrind ./cpp/build/test/ml --gtest_filter=FilTests/TreeliteDenseFilTest.Import/0
==6398== Mismatched free() / delete / delete []
==6398==    at 0x483D1CF: operator delete(void*, unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==6398==    by 0x209287: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (in /home/ubuntu/cuml/cpp/build/test/ml)
==6398==    by 0x2AC253: testing::AssertionResult raft::devArrMatch<float, raft::CompareApprox<float> >(float const*, float const*, unsigned long, raft::CompareApprox<float>, CUstream_st*) (in /home/ubuntu/cuml/cpp/build/test/ml)
==6398==    by 0x2AC51C: ML::BaseFilTest::compare() (in /home/ubuntu/cuml/cpp/build/test/ml)
==6398==    by 0x4858098D: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48580BE0: testing::Test::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48580F0E: testing::TestInfo::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48581035: testing::TestSuite::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x485815EB: testing::internal::UnitTestImpl::RunAllTests() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48581858: testing::UnitTest::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x4853007E: main (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest_main.so)
==6398== Address 0x232bfa860 is 0 bytes inside a block of size 160,000 alloc'd
==6398==    at 0x483C583: operator new[](unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==6398==    by 0x2ABFF8: testing::AssertionResult raft::devArrMatch<float, raft::CompareApprox<float> >(float const*, float const*, unsigned long, raft::CompareApprox<float>, CUstream_st*) (in /home/ubuntu/cuml/cpp/build/test/ml)
==6398==    by 0x2AC51C: ML::BaseFilTest::compare() (in /home/ubuntu/cuml/cpp/build/test/ml)
==6398==    by 0x4858098D: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48580BE0: testing::Test::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48580F0E: testing::TestInfo::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48581035: testing::TestSuite::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x485815EB: testing::internal::UnitTestImpl::RunAllTests() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x48581858: testing::UnitTest::Run() (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest.so)
==6398==    by 0x4853007E: main (in /home/ubuntu/miniconda3/envs/cuml_dev/lib/libgtest_main.so)
```

**Diagnosis**. The `devArrMatch` functions allocate a temporary buffer using `new[]` and then assign it to a `shared_ptr<T>`. This is not valid because the destructor of `shared_ptr<T>` will invoke `delete`, not `delete[]`. Calling `delete` on a buffer allocated by `new[]` leads to undefined behavior. See https://docs.microsoft.com/en-us/cpp/code-quality/c6278?view=msvc-160.

**Proposed fix**. Use `std::unique_ptr<T[]>` instead to store temporary buffers.

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: #3860

Upgrade to Treelite 1.3.0 to take advantage of the following new features:

* Faster model import for scikit-learn tree models (dmlc/treelite#264). Fixes #3768
* Binary serializer to a file stream (dmlc/treelite#270, dmlc/treelite#273)
* [EXPERIMENTAL] Add GTIL, reference inference backend (dmlc/treelite#274)

Make progress towards #3853

Depends on rapidsai/integration#270

Authors:
- Philip Hyunsu Cho (https://github.com/hcho3)

Approvers:
- William Hicks (https://github.com/wphicks)
- AJ Schmidt (https://github.com/ajschmidt8)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #3855

ajschmidt8 and others added 30 commits March 1, 2021 15:34

Adding haversine to brute force knn (#3579)

a3bfb36

Authors: - Corey J. Nolet (@cjnolet) Approvers: - Divye Gala (@divyegala) - Dante Gama Dessavre (@dantegd) - John Zedlewski (@JohnZed) URL: #3579

Upgrade FAISS to 1.7.x (#3509)

bacb05e

Answers #3459 Authors: - Victor Lafargue (@viclafargue) - Dante Gama Dessavre (@dantegd) Approvers: - AJ Schmidt (@ajschmidt8) - Dante Gama Dessavre (@dantegd) URL: #3509

Provide "--ccache" flag for build.sh (#3566)

ce02064

Provide flag to enable ccache for building C, C++, and CUDA code Authors: - William Hicks (@wphicks) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3566

Fix documentation of KMeans (#3595)

0967a00

Just a small fix of duplicates in KMeans doc. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3595

Add predict_proba() to XGBoost-style models in FIL C++ (#2894)

8b78fa3

Authors: - @levsnv Approvers: - Andy Adinets (@canonizer) - John Zedlewski (@JohnZed) URL: #2894

Eliminate unnecessary includes discovered by cppclean (#3564)

0a7a0fa

Provide forward declarations where possible to reduce unnecessary includes Eliminate unneeded or redundant includes Authors: - William Hicks (@wphicks) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3564

Update Changelog Link (#3601)

8645de5

The tag used for pre-releases was recently changed, so this PR updates the link in the changelog. Authors: - AJ Schmidt (@ajschmidt8) Approvers: - Jordan Jacobelli (@Ethyling) URL: #3601

Update One-Hot Encoder doc (#3600)

4aaec3b

Closes #3598. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3600

Fixing support for empty rows in sparse Jaccard / Cosine (#3612)

d45bd81

Closes #3610 Authors: - Corey J. Nolet (@cjnolet) Approvers: - Micka (@lowener) - John Zedlewski (@JohnZed) URL: #3612

Including log loss metric to the documentation website (#3617)

452f92d

Closes #1596. The documentation already existed, it was just not listed in the api.rst for Sphinx. Authors: - Micka (@lowener) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3617

Silhouette Score make_monotonic for non-monotonic label set (#3619)

96eaf62

closes #3584 Authors: - Divye Gala (@divyegala) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3619

Sklearn meta-estimators into namespace (#3493)

28c3e39

Closes #3484 Imports sklearn's Pipeline and GridSearch meta-estimators into cuML namespace for ease-of-use. Authors: - Victor Lafargue (@viclafargue) Approvers: - William Hicks (@wphicks) - John Zedlewski (@JohnZed) URL: #3493

Update doc, now that FIL supports multi-class classification (#3634)

b388b86

Closes #3627 Authors: - Philip Hyunsu Cho (@hcho3) Approvers: - Dante Gama Dessavre (@dantegd) URL: #3634

Additional distance metrics for ANN (#3533)

53a622b

Answers #2868 Authors: - Victor Lafargue (@viclafargue) Approvers: - Micka (@lowener) - Dante Gama Dessavre (@dantegd) URL: #3533

OOB access in GLM SoftMax (#3642)

af0863d

closes #3590 Authors: - Divye Gala (@divyegala) Approvers: - William Hicks (@wphicks) - John Zedlewski (@JohnZed) URL: #3642

jcfaracco and others added 24 commits April 27, 2021 21:03

ENH Remove conda defaults channel in builddocs environment (#3815)

49ed9a8

Remove conda defaults channel in builddocs environment Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #3815

Exact TSNE fewer log statements (#3808)

9a74d20

closes #3792 Authors: - Divye Gala (https://github.com/divyegala) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3808

Merge remote-tracking branch 'upstream/branch-0.19' into branch-0.20-…

932f41c

…merge-0.19

Merge pull request #3823 from ajschmidt8/branch-0.20-merge-0.19

e125c62

Merge `branch-0.19` into `branch-0.20`

Fix SimpleImputer testing (#3836)

09f6fbe

Answers #3813 Authors: - Victor Lafargue (https://github.com/viclafargue) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) URL: #3836

DOC Update to v21.06.00

909ee29

github-actions bot added CMake conda CUDA/C++ Cython / Python gpuCI labels May 19, 2021

daxiongshu merged commit 6366e9e into daxiongshu:branch-21.06 May 19, 2021
