sync with upstream #18

Merged
merged 1,179 commits into from
Nov 1, 2020
d6ef74e
Style fixes
wphicks Sep 2, 2020
6ab1221
Couple minor updates
cjnolet Sep 2, 2020
9e21f9a
Adding test for changing dtype
cjnolet Sep 2, 2020
6211c8e
FIX skip lightgbm test for version 3 and above temporarily
dantegd Sep 2, 2020
df028f9
DOC Added entry to changelog
dantegd Sep 2, 2020
ec374ed
Adding more tests. Guaranteeing sparse input. Forcing CSR
cjnolet Sep 2, 2020
93f4606
Adding to docs
cjnolet Sep 2, 2020
c3747f0
Bugfix for 2767 - fix rf path trying to sample 0 columns
drobison00 Sep 2, 2020
8f570ee
Update changelog
drobison00 Sep 2, 2020
6574dd4
Adding more documentation and comments based on PR feedback. Also add…
mdemoret-nv Sep 2, 2020
41655e4
Merge branch 'branch-0.16' into fea-improve-cython-build_ext
mdemoret-nv Sep 2, 2020
a62815d
Removing the compiler directives from each file since its set globall…
mdemoret-nv Sep 2, 2020
f110e3d
Merge pull request #2787 from dantegd/016-fix-skip-lgb
dantegd Sep 2, 2020
7bcaed3
Merge branch 'branch-0.16' into bug-redirected-log-sink
wphicks Sep 3, 2020
e341d00
Merge pull request #2785 from trxcllnt/fix/dev-conda-envs
dantegd Sep 3, 2020
571c7c2
adding has_sorted_indices to sparse cuml array
cjnolet Sep 3, 2020
89f6c4f
Add FIL-specific README
JohnZed Aug 31, 2020
a687090
README changes from review
JohnZed Sep 1, 2020
7de6a22
Update README.md
JohnZed Sep 3, 2020
277b075
Review comments
JohnZed Sep 3, 2020
3316718
Add readme
JohnZed Sep 3, 2020
567a852
Merge pull request #2778 from JohnZed/enh-fil-readme
JohnZed Sep 3, 2020
212813d
Merge pull request #2788 from drobison00/cuml_2767
beckernick Sep 3, 2020
9b27aa0
Merge branch 'branch-0.16' of https://github.com/rapidsai/cuml into b…
Salonijain27 Sep 3, 2020
21291e1
Merge branch 'branch-0.16' into json_dump
hcho3 Sep 3, 2020
10e84f7
alias for handle
divyegala Sep 3, 2020
9288a04
more cleanup of docs
divyegala Sep 3, 2020
9cb6cf2
merge upstream
divyegala Sep 3, 2020
376f950
style fix
divyegala Sep 3, 2020
f77235e
Merge branch 'branch-0.16' into fea-016-sparse-cuml-array
cjnolet Sep 3, 2020
1d09757
Making changes
cjnolet Sep 3, 2020
2077850
Adding ability to convert to different sparse formats in output
cjnolet Sep 3, 2020
e7352f4
Fixing flake8 style
cjnolet Sep 3, 2020
4b813c9
Remove tokens of legth 1 from cuML Vectorizers
VibhuJawa Sep 4, 2020
a78500e
added changelog
VibhuJawa Sep 4, 2020
7825d05
fixed comment
VibhuJawa Sep 4, 2020
3a8fc4e
Merge pull request #2796 from VibhuJawa/enh-vectorizer-pre-processing
dantegd Sep 4, 2020
456724f
Add docstrings to logging callbacks
wphicks Sep 4, 2020
7d15f7d
Move callback registration to end of logger.pyx
wphicks Sep 4, 2020
039c933
Expose logger flush method and test it
wphicks Sep 4, 2020
f4e2088
Expose and test flush in Python
wphicks Sep 4, 2020
2758d25
Correctly reset logging callbacks after tests
wphicks Sep 4, 2020
2c6b338
Style fixes
wphicks Sep 4, 2020
7838add
Remove extraneous import
wphicks Sep 4, 2020
d11c62b
only the proba was wrong, and the error is <1% even on this few colum…
levsnv Sep 4, 2020
bdb8aa5
changelog
levsnv Sep 4, 2020
ff3608c
align cuML's spdlog version with RMM's
trxcllnt Sep 5, 2020
fbafbe6
changelog
trxcllnt Sep 5, 2020
672517d
Merge pull request #2800 from trxcllnt/fix/use-rmm-spdlog-version
dantegd Sep 8, 2020
cbf78dd
Merge branch 'branch-0.16' into reenable-lightgbm-test
dantegd Sep 8, 2020
402140a
Merge branch 'branch-0.16' into bug-redirected-log-sink
wphicks Sep 8, 2020
f7ddf28
updating libcumlprims version
divyegala Sep 9, 2020
26de311
Merge branch 'branch-0.16' of https://github.com/rapidsai/cuml into f…
divyegala Sep 9, 2020
80a86b7
Merge pull request #2799 from levsnv/reenable-lightgbm-test
dantegd Sep 9, 2020
65dde3a
Merge remote-tracking branch 'upstream/branch-0.16' into fea-improve-…
mdemoret-nv Sep 9, 2020
81ffd4a
Fixing copyright and removing cython directives from .py and .pxd fil…
mdemoret-nv Sep 9, 2020
0d5de18
KMeans gtest still failing
divyegala Sep 9, 2020
34fa7a7
Removing `# distutils:` from *.pxd files
mdemoret-nv Sep 9, 2020
db52b62
Separating impls from interface layer in kmeans
cjnolet Sep 9, 2020
e943252
namespacing mg kmeans as well as gtests working
divyegala Sep 9, 2020
e58b238
style fixes
divyegala Sep 9, 2020
2de09c6
deleting nccl and ucx pytests
divyegala Sep 9, 2020
a8d75bd
temporary point to my raft branch
divyegala Sep 10, 2020
d84283e
Merge pull request #2638 from mdemoret-nv/fea-improve-cython-build_ext
dantegd Sep 10, 2020
34000d7
FIX Relax Doxygen version required in CMake to coincide with rapids-b…
dantegd Sep 10, 2020
e610720
DOC Added entry to changelog
dantegd Sep 10, 2020
3feb834
merging upstream
divyegala Sep 10, 2020
7aed74a
updating libcumlprims package for testing
divyegala Sep 10, 2020
6195585
Merge branch 'branch-0.16' of https://github.com/rapidsai/cuml into b…
Salonijain27 Sep 10, 2020
5a0990e
update rf mnmg threshold
Salonijain27 Sep 10, 2020
5f28e3a
update CHNAGELOG.md
Salonijain27 Sep 10, 2020
272f79c
typo in testing channel
divyegala Sep 10, 2020
dad67ed
conda channel priority
divyegala Sep 10, 2020
103e334
Fix memory access in non-row_major blob creation
wphicks Sep 10, 2020
9651141
trying to get conda build working
divyegala Sep 11, 2020
72a97b0
Getting tests to pass for now. Creating issue for remaining pieces.
cjnolet Sep 11, 2020
050d677
Merge branch 'fea-016-raft_handle' of github.com:divyegala/cuml into …
cjnolet Sep 11, 2020
4af35e3
Skipping score instead of predict_proba (less deltas for now)
cjnolet Sep 11, 2020
6947f90
add comment to include issue number
Salonijain27 Sep 11, 2020
a027ecc
Update changelog
wphicks Sep 11, 2020
1cc53d6
Merge pull request #2808 from dantegd/016-fix-relax-doxycmake
JohnZed Sep 11, 2020
8b9c0e7
Merge branch 'branch-0.16' into rf-mnmg-threshold
Salonijain27 Sep 11, 2020
ce41e33
Merge pull request #2810 from Salonijain27/rf-mnmg-threshold
JohnZed Sep 11, 2020
c472038
Merge branch 'branch-0.16' into bug-get_mu_sigma-memory
dantegd Sep 11, 2020
87ff7b4
Merge pull request #2813 from wphicks/bug-get_mu_sigma-memory
dantegd Sep 12, 2020
c3cf48d
FIX Fix parsing of singlegpu option in build command
dantegd Sep 12, 2020
552489a
DOC Added entry to changelog
dantegd Sep 12, 2020
a4cc40a
Merge pull request #2818 from dantegd/fix-singlegpu-newpybuild
JohnZed Sep 14, 2020
364fdce
Fixing flake8 style
cjnolet Sep 14, 2020
bd27343
ENH Make data conversions warnings be debug level
dantegd Sep 14, 2020
96cf53b
DOC Added entry to changelog
dantegd Sep 14, 2020
6f008c9
Update CHANGELOG.md
dantegd Sep 14, 2020
6b8f561
removing cuml Exception class
divyegala Sep 14, 2020
6c1bba4
Merge branch 'fea-016-raft_handle' of https://github.com/divyegala/cu…
divyegala Sep 14, 2020
16ec4d3
Merge branch 'branch-0.16' of https://github.com/rapidsai/cuml into f…
divyegala Sep 14, 2020
b3795ce
removing testing label
divyegala Sep 15, 2020
ab0982d
Merge pull request #2747 from divyegala/fea-016-raft_handle
dantegd Sep 15, 2020
47f6e73
Responding to review feedback
cjnolet Sep 15, 2020
33bfca3
Finishing pydocs and fixing flake8
cjnolet Sep 15, 2020
c41af7e
Making sure users who don't have scipy installed don't get a nasty im…
cjnolet Sep 15, 2020
4dca61a
More review feedback
cjnolet Sep 15, 2020
b4b6eb5
Finishing remaining review feedback
cjnolet Sep 15, 2020
370261a
Adding missing capture
cjnolet Sep 15, 2020
c061615
Merge branch 'branch-0.16' into fea-016-sparse-cuml-array
cjnolet Sep 15, 2020
ec20eaf
Using debug instead of warn
cjnolet Sep 15, 2020
66ae577
Merge branch 'branch-0.16' into bug-redirected-log-sink
JohnZed Sep 15, 2020
ee64f4e
Fixing flake8
cjnolet Sep 15, 2020
904c50a
Merge pull request #2824 from dantegd/016-enh-conv-warning
beckernick Sep 15, 2020
5485089
Merge pull request #2784 from cjnolet/fea-016-sparse-cuml-array
cjnolet Sep 16, 2020
bd80fc6
Merge branch 'branch-0.16' into json_dump
hcho3 Sep 16, 2020
9645fa6
fixing stream capture in lambda
divyegala Sep 16, 2020
063eb7e
changelog
divyegala Sep 16, 2020
f28d4d5
skipping svd solver stress tests
divyegala Sep 16, 2020
4fffc2c
Merge pull request #2677 from hcho3/json_dump
JohnZed Sep 16, 2020
52664c3
Merge pull request #2781 from wphicks/bug-redirected-log-sink
JohnZed Sep 16, 2020
3935cb3
fixing pickle test
divyegala Sep 16, 2020
6eef0c7
changelog
divyegala Sep 16, 2020
5f1e306
style fixes
divyegala Sep 16, 2020
8914568
Merge pull request #2831 from divyegala/bug-016-gcc_9
cjnolet Sep 16, 2020
8261b2f
better pytest skip doc
divyegala Sep 17, 2020
d80186b
merge upstream
divyegala Sep 17, 2020
81e85cf
Merge pull request #2832 from divyegala/bug-016-stress_tests
JohnZed Sep 17, 2020
fe7fd99
[REVIEW] add C++ FIL benchmark code (#2152)
levsnv Sep 18, 2020
57e7feb
[REVIEW] Small fixes for mid-release bug squash (#2829)
cjnolet Sep 18, 2020
461cbf2
[REVIEW] KNN index preprocessors were using incorrect n_samples (#2842)
cjnolet Sep 19, 2020
53b65b7
Fix typo in Python doc string for UMAP fit_transform (#2848)
zbjornson Sep 21, 2020
44bbcd7
[REVIEW] Enabling MG Gtests w/ RAFT MPI Comms (#2775)
cjnolet Sep 21, 2020
29245dd
[REVIEW] Updates for RMM being header only (#2855)
dantegd Sep 21, 2020
1ded565
[REVIEW] Clean up paramsPCA (#2850)
zbjornson Sep 22, 2020
cb7d083
This fix forces to use whole dataset when sample bootstrapping is dis…
vinaydes Sep 23, 2020
6a93762
[REVIEW] Resolves deadlock behaviour in Barnes Hut's summarization ke…
drobison00 Sep 23, 2020
ce14638
[REVIEW] Dask LabelEncoder (#2789)
Nanthini10 Sep 23, 2020
a1f9d33
Project Flash script changes (#2792)
raydouglass Sep 24, 2020
a8fe745
[REVIEW]Porter Stemmer (#2476)
VibhuJawa Sep 24, 2020
d55b9a2
[REVIEW] make num_classes significant in FLOAT_SCALAR case (#2849)
levsnv Sep 25, 2020
1871502
[REVIEW] rename leaf_value_t names to reflect new convention for mult…
levsnv Sep 25, 2020
48fed6a
[REVIEW] make the node reorg loop more obvious (#2837)
levsnv Sep 25, 2020
808c1f6
[REVIEW] Fix LabelEncoder for filtered input (#2856)
Nanthini10 Sep 25, 2020
bd65c15
[REVIEW] Retain index in stratified splitting for dataframes (#2805)
Nanthini10 Sep 25, 2020
5300ce4
[REVIEW] Improve Documentation Examples and Source Linking (#2541)
mdemoret-nv Sep 25, 2020
17b55cb
Removing empty marker kernel code (#2873)
venkywonka Sep 26, 2020
fae1cba
[REVIEW] Add float64 warning when loading LightGBM model (#2874)
wphicks Sep 26, 2020
088763c
Bug fix to enable colorful NVTX markers (#2875)
venkywonka Sep 28, 2020
268d905
[REVIEW] Sklearn-based preprocessing (#2645)
viclafargue Sep 30, 2020
7268f27
Prevent import-time numba.cuda jit compiling (#2882)
wphicks Sep 30, 2020
9671c04
[REVIEW] Rng prims and dependencies in RAFT format (#2835)
divyegala Sep 30, 2020
7e72203
[REVIEW] Update ci/local/README.md (#2892)
ajschmidt8 Oct 1, 2020
28dd00e
UPDATE masked label encoder unit test (#2879)
aerdem4 Oct 1, 2020
a6f766d
[REVIEW] TSNE exception for n_components > 2 (#2877)
Nanthini10 Oct 1, 2020
a4bcdf1
[REVIEW] Add timing function to utils (#2871)
Nanthini10 Oct 1, 2020
2675471
[REVIEW] Fix bugs in Auto-ARIMA when s==None (#2880)
Nyrio Oct 1, 2020
ff48614
DOC v0.17 Updates
raydouglass Oct 2, 2020
bc989e2
[REVIEW] improve FIL benchmark stability (#2867)
levsnv Oct 2, 2020
1a72787
Merge pull request #2913 from rapidsai/branch-0.16
GPUtester Oct 2, 2020
5b2d4df
[REVIEW] Update allgatherv types for RAFT compatibility (#2909)
wphicks Oct 2, 2020
ff259eb
Merge pull request #2915 from rapidsai/branch-0.16
GPUtester Oct 2, 2020
207129b
[REVIEW] Adding Support for CuPy 8.x and Fixing Tests (#2910)
mdemoret-nv Oct 5, 2020
239a360
Merge pull request #2919 from rapidsai/branch-0.16
GPUtester Oct 5, 2020
d7e1c46
[REVIEW] support xgboost multi-class models in C/C++ layer in FIL (#2…
levsnv Oct 5, 2020
e3e8cd4
Merge pull request #2921 from rapidsai/branch-0.16
GPUtester Oct 5, 2020
726096a
[REVIEW] Fix for OPG KNN Classifier & Regressor (#2844)
viclafargue Oct 6, 2020
8020bde
[REVIEW] add lightgbm test for multiclass (#2798)
levsnv Oct 6, 2020
73b5fd0
Merge pull request #2923 from rapidsai/branch-0.16
GPUtester Oct 6, 2020
c3c8f4b
[REVIEW] Add tests for XGBoost multi-class classification in python (…
levsnv Oct 6, 2020
43317fe
Merge pull request #2927 from rapidsai/branch-0.16
GPUtester Oct 6, 2020
264a0d2
[REVIEW] Fixing Owner Bug When Slicing CumlArray Objects (#2925)
mdemoret-nv Oct 7, 2020
0679739
Merge pull request #2929 from rapidsai/branch-0.16
GPUtester Oct 7, 2020
3b5c924
[REVIEW] FIX Fix notebook error handling in gpuCI (#2931)
dillon-cullinan Oct 7, 2020
a08a688
Merge pull request #2934 from rapidsai/branch-0.16
GPUtester Oct 7, 2020
ff50133
[REVIEW] Add sklearn GBDT support (without predict_proba) (#2916)
levsnv Oct 8, 2020
61125eb
Merge pull request #2935 from rapidsai/branch-0.16
GPUtester Oct 8, 2020
304aa4b
[REVIEW] xfail for KBinsDiscretizer pytests (#2932)
divyegala Oct 8, 2020
e6bee2b
Merge pull request #2936 from rapidsai/branch-0.16
GPUtester Oct 8, 2020
ff41f49
changing test target for NVTX wrapper test (#2885)
venkywonka Oct 8, 2020
a46024e
Merge pull request #2937 from rapidsai/branch-0.16
GPUtester Oct 8, 2020
04aff8d
[REVIEW] Introduces experimental batched backend for random forest. (…
vinaydes Oct 8, 2020
6c653c3
Merge pull request #2938 from rapidsai/branch-0.16
GPUtester Oct 8, 2020
d6ff833
[REVIEW] pin libfaiss to <=1.6.3 (#2930)
dantegd Oct 8, 2020
55d3535
Merge pull request #2939 from rapidsai/branch-0.16
GPUtester Oct 8, 2020
fc0bcb1
[REVIEW] Moving `matrix/matrix.cuh` to RAFT namespaces (#2902)
divyegala Oct 9, 2020
bcc5bc3
[REVIEW] Correcting labels meta dtype for `cuml.dask.make_classificat…
divyegala Oct 9, 2020
ea9ba3e
Merge pull request #2946 from rapidsai/branch-0.16
GPUtester Oct 9, 2020
151c474
[REVIEW] Install RAFT header files (#2922)
wphicks Oct 10, 2020
001b676
Merge pull request #2948 from rapidsai/branch-0.16
GPUtester Oct 10, 2020
f294908
[REVIEW] Removing unused shuffle_features parameter (#2943)
vinaydes Oct 11, 2020
c1da169
Merge pull request #2949 from rapidsai/branch-0.16
GPUtester Oct 11, 2020
3ea117d
[REVIEW] Updating Estimators Derived from Base for Consistency (#2928)
mdemoret-nv Oct 12, 2020
81984fb
Merge pull request #2958 from rapidsai/branch-0.16
GPUtester Oct 12, 2020
dd50b75
[REVIEW] Adding `cuml.experimental` to the Docs (#2942)
mdemoret-nv Oct 13, 2020
d1708c3
Merge pull request #2964 from rapidsai/branch-0.16
GPUtester Oct 13, 2020
6babcdf
[REVIEW] Fix ols test size for stability (#2957)
dantegd Oct 13, 2020
43b1181
Merge pull request #2971 from rapidsai/branch-0.16
GPUtester Oct 13, 2020
0dc6a72
[REVIEW] removing shuffle_features from RF param names (#2968)
divyegala Oct 14, 2020
4cc0490
Merge pull request #2974 from rapidsai/branch-0.16
GPUtester Oct 14, 2020
fbe6272
[REVIEW] Upgrade Treelite to 0.93 (#2972)
hcho3 Oct 14, 2020
ce82e2e
Merge pull request #2978 from rapidsai/branch-0.16
GPUtester Oct 14, 2020
d2ad378
[REVIEW] Moving linalg basic prims to RAFT namespaces and enable code…
divyegala Oct 14, 2020
c6ba8fb
[REVIEW] Some `stats` prims to RAFT namespaces (#2905)
divyegala Oct 15, 2020
49d0ed2
FIX Move codecov updates from #2903 into `branch-0.16` (#2984)
mike-wendt Oct 15, 2020
2347ef1
Merge pull request #2985 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
619f595
Reduce kneighbors classifier test threshold (#2982)
wphicks Oct 15, 2020
138191c
Merge pull request #2986 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
7b0c09f
[Review] Add additional warnings for float64 models (#2947)
wphicks Oct 15, 2020
402b8d7
[REVIEW] Fix for conftest for singlegpu build (#2955)
dantegd Oct 15, 2020
b869880
Simplify tSNE perplexity search (#2622)
zbjornson Oct 15, 2020
65fbef5
Merge pull request #2987 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
ab6498b
[REVIEW] Fix seeding of KISS99 RNG (#2983)
vinaydes Oct 15, 2020
6bc325d
[REVIEW] Allow data imputation for nan values (#2973)
wphicks Oct 15, 2020
9954f8c
Merge pull request #2992 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
3a16dbd
[FIX] Reduce MNMG kneighbors regressor test threshold (#2990)
viclafargue Oct 15, 2020
b840bfd
Merge pull request #2993 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
fa4d31f
[REVIEW] Moving `linalg` basic math ops (#2904)
divyegala Oct 15, 2020
0dc18e0
[FIX] Notebooks update (#2965)
viclafargue Oct 15, 2020
5aa17ac
[REVIEW] Fixing dask tsvd stress test failure (#2941)
Nanthini10 Oct 15, 2020
95a7305
Merge pull request #2994 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
70d2c64
[REVIEW] Fix un-guarded sklearn import in SVC (#2981)
JohnZed Oct 15, 2020
e8e18f3
Merge pull request #2999 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
48a5416
[REVIEW] Changing ARIMA `get/set_params` to `get/set_fit_params` (#2997)
mdemoret-nv Oct 15, 2020
146d90b
Merge pull request #3001 from rapidsai/branch-0.16
GPUtester Oct 15, 2020
02188d9
ENH gpuCI script updates
dillon-cullinan Oct 16, 2020
ebe3e5b
DOC Changelog update
dillon-cullinan Oct 16, 2020
d3f2efa
FIX Remove ellipses from loggers
dillon-cullinan Oct 16, 2020
49bb9ec
FIX Replace conda build commands with gpuci commands
dillon-cullinan Oct 16, 2020
78b0d51
FIX Fix environment activation
dillon-cullinan Oct 16, 2020
544ec2f
Merge pull request #3010 from dillon-cullinan/enh-gpuci
dillon-cullinan Oct 18, 2020
eb1b950
Merge pull request #3013 from rapidsai/branch-0.16
GPUtester Oct 18, 2020
5beaaae
[REVIEW] Adding `power_t` param to SGD failing pytests (#3012)
divyegala Oct 18, 2020
1295202
[REVIEW] Removing the max_depth restriction for switching to the batc…
vinaydes Oct 19, 2020
afdb6c0
[REVIEW] Moving `linalg` decomp to RAFT namespaces (#2906)
divyegala Oct 19, 2020
2a2698e
Validate number of columns in check_array (#3008)
wphicks Oct 20, 2020
50afa17
[REVIEW] Pin cmake policies to cmake 3.17 version, bump project versi…
Oct 20, 2020
83d072d
[REVIEW] remove Single Process Multi GPU (SPMG) code (fixes #2979) (#…
jameslamb Oct 20, 2020
b943d74
Obey initialize_embeddings parameter in B-H tSNE (#3011)
zbjornson Oct 21, 2020
6ed34e1
Prevent cuML from hanging when cuML RF falls back from experimental b…
hcho3 Oct 21, 2020
6690073
FIX Correct if syntax in upload script (#3034)
mike-wendt Oct 21, 2020
3647695
Merge pull request #3035 from rapidsai/branch-0.16
GPUtester Oct 21, 2020
4151520
[REVIEW] Fixes issues with benchmark codes due to improper initializa…
vinaydes Oct 23, 2020
ae223f1
Suppress automatic GIL acquire to avoid deadlock (#3037)
wphicks Oct 26, 2020
3095587
[REVIEW] Moving some `linalg` and `stats` prims to RAFT (#3044)
divyegala Oct 27, 2020
70302e3
[REVIEW] Bumping xgboost version to match cuml version (#3062)
mdemoret-nv Oct 28, 2020
bb2e608
Update mathjax CDN URL to prevent mixed content warnings [skip-ci] (#…
ajschmidt8 Oct 29, 2020
9e86cc7
[REVIEW] Speed up test_incremental_pca (#3078)
wphicks Oct 29, 2020
8330557
[REVIEW] [BUG] Fusing metrics and score directories in src_prims (#3072)
venkywonka Oct 29, 2020
5fff222
[REVIEW] Speed up test_linear_model (#3075)
wphicks Oct 29, 2020
7c6b142
[REVIEW] Reducing dask coordinate descent test runtime (#3074)
Nanthini10 Oct 29, 2020
3fa3cdf
[REVIEW] Handle C++ exception thrown from FIL predict (#3061)
hcho3 Oct 29, 2020
7dbd33d
[REVIEW] Speeding up test_make_blobs (#3083)
divyegala Oct 30, 2020
2f75228
[REVIEW] Reverting FIL Notebook Testing (#3086)
mdemoret-nv Oct 30, 2020
30644b4
Reducing dask/test_datasets.py pytests from 7.5 mins to ~1 min (#3070)
miroenev Oct 30, 2020
7e6d413
[REVIEW] deleting prims and updating paths (#3067)
divyegala Oct 31, 2020
4 changes: 4 additions & 0 deletions .gitignore
@@ -18,6 +18,7 @@ __pycache__
htmlcov
build/
build_prims/
cmake-build*
cuml.egg-info/
dist/
python/cuml/**/*.cpp
@@ -29,6 +30,9 @@ log
dask-worker-space/
tmp/

## files pickled in notebook when ran during python docstring generation
docs/source/*.model

## eclipse
.project
.cproject
11 changes: 9 additions & 2 deletions BUILD.md
@@ -15,10 +15,17 @@ To install cuML from source, ensure the following dependencies are met:
9. NCCL (>=2.4)
10. UCX [optional] (>= 1.7) - enables point-to-point messaging in the cuML standard communicator. This is necessary for many multi-node multi-GPU cuML algorithms to function.

It is recommended to use conda for environment/package management. If doing so, a convenience environment .yml file is located in `conda/environments/cuml_dev_cudax.y.yml` (replace x.y with your CUDA version). This file contains most of the dependencies mentioned above (notable exceptions are `gcc` and `zlib`). To use it, for example to create an environment named `cuml_dev` for CUDA 10.0 and Python 3.7, you can use the following command:
It is recommended to use conda for environment/package management. If doing so, a convenience environment .yml file is located in `conda/environments/cuml_dev_cudax.y.yml` (replace x.y with your CUDA version). This file contains most of the dependencies mentioned above (notable exceptions are `gcc` and `zlib`). To use it, for example to create an environment named `cuml_dev` for CUDA 10.2 and Python 3.7, you can use the following command:

```bash
conda create -n cuml_dev python=3.7
conda env update -n cuml_dev --file=conda/environments/cuml_dev_cuda10.2.yml
```
conda env create -n cuml_dev python=3.7 --file=conda/environments/cuml_dev_cuda10.0.yml

These conda environments are based on the general RAPIDS meta packages that install common dependencies for RAPIDS projects. To install different versions of packages contained in those meta packages after creating the environment, it is recommended to remove those meta packages (without removing the actual packages contained in the environment) with the following command (having the environment active):

```bash
conda remove --force rapids-build-env rapids-notebook-env rapids-doc-env
```

## Installing from Source:
204 changes: 204 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,145 @@
# cuML 0.17.0 (Date TBD)

## New Features

## Improvements
- PR #3070: Speed up dask/test_datasets tests
- PR #3075: Speed up test_linear_model tests
- PR #3078: Speed up test_incremental_pca tests
- PR #2902: `matrix/matrix.cuh` in RAFT namespacing
- PR #2903: Moving linalg's gemm, gemv, transpose to RAFT namespaces
- PR #2905: `stats` prims `mean_center`, `sum` to RAFT namespaces
- PR #2904: Moving `linalg` basic math ops to RAFT namespaces
- PR #3000: Pin cmake policies to cmake 3.17 version, bump project version to 0.17
- PR #3083: Improving test_make_blobs testing time
- PR #2906: Moving `linalg` decomp to RAFT namespaces
- PR #2996: Removing the max_depth restriction for switching to the batched backend
- PR #3004: Remove Single Process Multi GPU (SPMG) code
- PR #3044: Move leftover `linalg` and `stats` to RAFT namespaces
- PR #3067: Deleting prims moved to RAFT and updating header paths
- PR #3074: Reducing dask coordinate descent test runtime

## Bug Fixes
- PR #3072: Fusing metrics and score directories in src_prims
- PR #3037: Avoid logging deadlock in multi-threaded C code
- PR #2983: Fix seeding of KISS99 RNG
- PR #3011: Fix unused initialize_embeddings parameter in Barnes-Hut t-SNE
- PR #3008: Check number of columns in check_array validator
- PR #3012: Increasing learning rate for SGD log loss and invscaling pytests
- PR #3021: Fix a hang in cuML RF experimental backend
- PR #3039: Update RF and decision tree parameter initializations in benchmark codes
- PR #3061: Handle C++ exception thrown from FIL predict
- PR #3073: Update mathjax CDN URL for documentation
- PR #3062: Bumping xgboost version to match cuml version
- PR #3086: Reverting FIL Notebook Testing

# cuML 0.16.0 (Date TBD)

## New Features
- PR #2922: Install RAFT headers with cuML
- PR #2909: Update allgatherv for compatibility with latest RAFT
- PR #2677: Ability to export RF trees as JSON
- PR #2698: Distributed TF-IDF transformer
- PR #2476: Porter Stemmer
- PR #2789: Dask LabelEncoder
- PR #2152: add FIL C++ benchmark
- PR #2638: Improve cython build with custom `build_ext`
- PR #2866: Support XGBoost-style multiclass models (gradient boosted decision trees) in FIL C++
- PR #2874: Issue warning for degraded accuracy with float64 models in Treelite
- PR #2881: Introduces experimental batched backend for random forest
- PR #2916: Add SKLearn multi-class GBDT model support in FIL

## Improvements
- PR #2947: Add more warnings for accuracy degradation with 64-bit models
- PR #2873: Remove empty marker kernel code for NVTX markers
- PR #2796: Remove tokens of length 1 by default for text vectorizers
- PR #2741: Use rapids build packages in conda environments
- PR #2735: Update seed to random_state in random forest and associated tests
- PR #2739: Use cusparse_wrappers.h from RAFT
- PR #2729: Replace `cupy.sparse` with `cupyx.scipy.sparse`
- PR #2749: Correct docs for python version used in cuml_dev conda environment
- PR #2747: Adopting raft::handle_t and raft::comms::comms_t in cuML
- PR #2762: Fix broken links and provide minor edits to docs
- PR #2723: Support and enable convert_dtype in estimator predict
- PR #2758: Match sklearn's default n_components behavior for PCA
- PR #2770: Fix doxygen version during cmake
- PR #2766: Update default RandomForestRegressor score function to use r2
- PR #2775: Enabling mg gtests w/ raft mpi comms
- PR #2783: Add pytest that will fail when GPU IDs in Dask cluster are not unique
- PR #2784: Add SparseCumlArray container for sparse index/data arrays
- PR #2785: Add in cuML-specific dev conda dependencies
- PR #2778: Add README for FIL
- PR #2799: Reenable lightgbm test with lower (1%) proba accuracy
- PR #2800: Align cuML's spdlog version with RMM's
- PR #2824: Make data conversions warnings be debug level
- PR #2835: Rng prims, utils, and dependencies in RAFT
- PR #2541: Improve Documentation Examples and Source Linking
- PR #2837: Make the FIL node reorder loop more obvious
- PR #2849: make num_classes significant in FLOAT_SCALAR case
- PR #2792: Project flash (new build process) script changes
- PR #2850: Clean up unused params in paramsPCA
- PR #2871: Add timing function to utils
- PR #2863: in FIL, rename leaf_value_t enums to more descriptive
- PR #2867: improve stability of FIL benchmark measurements
- PR #2798: Add python tests for FIL multiclass classification of lightgbm models
- PR #2892: Update ci/local/README.md
- PR #2910: Adding Support for CuPy 8.x
- PR #2914: Add tests for XGBoost multi-class models in FIL
- PR #2622: Simplify tSNE perplexity search
- PR #2930: Pin libfaiss to <=1.6.3
- PR #2928: Updating Estimators Derived from Base for Consistency
- PR #2942: Adding `cuml.experimental` to the Docs
- PR #3010: Improve gpuCI Scripts

## Bug Fixes
- PR #2973: Allow data imputation for nan values
- PR #2982: Adjust kneighbors classifier test threshold to avoid intermittent failure
- PR #2885: Changing test target for NVTX wrapper test
- PR #2882: Allow import on machines without GPUs
- PR #2875: Bug fix to enable colorful NVTX markers
- PR #2744: Supporting larger number of classes in KNeighborsClassifier
- PR #2769: Remove outdated doxygen options for 1.8.20
- PR #2787: Skip lightgbm test for version 3 and above temporarily
- PR #2805: Retain index in stratified splitting for dataframes
- PR #2781: Use Python print to correctly redirect spdlogs when sys.stdout is changed
- PR #2813: Fix memory access in generation of non-row-major random blobs
- PR #2810: Update RF MNMG threshold to prevent sporadic test failure
- PR #2808: Relax Doxygen version required in CMake to coincide with integration repo
- PR #2818: Fix parsing of singlegpu option in build command
- PR #2827: Force use of whole dataset when sample bootstrapping is disabled
- PR #2829: Fixing description for labels in docs and removing row number constraint from PCA xform/inverse_xform
- PR #2832: Updating stress tests that fail with OOM
- PR #2831: Removing repeated capture and parameter in lambda function
- PR #2847: Workaround for TSNE lockup, change caching preference.
- PR #2842: KNN index preprocessors were using incorrect n_samples
- PR #2848: Fix typo in Python docstring for UMAP
- PR #2856: Fix LabelEncoder for filtered input
- PR #2855: Updates for RMM being header only
- PR #2844: Fix for OPG KNN Classifier & Regressor
- PR #2880: Fix bugs in Auto-ARIMA when s==None
- PR #2877: TSNE exception for n_components > 2
- PR #2879: Update unit test for LabelEncoder on filtered input
- PR #2932: Marking KBinsDiscretizer pytests as xfail
- PR #2925: Fixing Owner Bug When Slicing CumlArray Objects
- PR #2931: Fix notebook error handling in gpuCI
- PR #2941: Fixing dask tsvd stress test failure
- PR #2943: Remove unused shuffle_features parameter
- PR #2940: Correcting labels meta dtype for `cuml.dask.make_classification`
- PR #2965: Notebooks update
- PR #2955: Fix for conftest for singlegpu build
- PR #2968: Remove shuffle_features from RF param names
- PR #2957: Fix ols test size for stability
- PR #2972: Upgrade Treelite to 0.93
- PR #2981: Prevent unguarded import of sklearn in SVC
- PR #2984: Fix GPU test scripts gcov error
- PR #2990: Reduce MNMG kneighbors regressor test threshold
- PR #2997: Changing ARIMA `get/set_params` to `get/set_fit_params`

# cuML 0.15.0 (Date TBD)

## New Features
- PR #2581: Added model persistence via joblib in each section of estimator_intro.ipynb
- PR #2554: Hashing Vectorizer and general vectorizer improvements
- PR #2240: Making Dask models pickleable
- PR #2267: CountVectorizer estimator
@@ -12,11 +151,23 @@
- PR #2394: Adding cosine & correlation distance for KNN
- PR #2392: PCA can accept sparse inputs, and sparse prim for computing covariance
- PR #2465: Support pandas 1.0+
- PR #2550: Single GPU Target Encoder
- PR #2519: Precision recall curve using cupy
- PR #2500: Replace UMAP functionality dependency on nvgraph with RAFT Spectral Clustering
- PR #2502: cuML Implementation of `sklearn.metrics.pairwise_distances`
- PR #2520: TfidfVectorizer estimator
- PR #2211: MNMG KNN Classifier & Regressor
- PR #2461: Add KNN Sparse Output Functionality
- PR #2615: Incremental PCA
- PR #2594: Confidence intervals for ARIMA forecasts
- PR #2607: Add support for probability estimates in SVC
- PR #2618: SVM class and sample weights
- PR #2635: Decorator to generate docstrings with autodetection of parameters
- PR #2270: Multi class MNMG RF
- PR #2661: CUDA-11 support for single-gpu code
- PR #2322: Sparse FIL forests with 8-byte nodes
- PR #2675: Update conda recipes to support CUDA 11
- PR #2645: Add experimental, sklearn-based preprocessing

## Improvements
- PR #2336: Eliminate `rmm.device_array` usage
@@ -46,6 +197,7 @@
- PR #2403: Support for input and output type consistency in logistic regression predict_proba
- PR #2473: Add metrics.roc_auc_score to API docs. Additional readability and minor docs bug fixes
- PR #2468: Add `_n_features_in_` attribute to all single GPU estimators that implement fit
- PR #2489: Removing explicit FAISS build and adding dependency on libfaiss conda package
- PR #2480: Moving MNMG glm and solvers to cuml
- PR #2490: Moving MNMG KMeans to cuml
- PR #2483: Moving MNMG KNN to cuml
@@ -55,6 +207,7 @@
- PR #2237: Refactor RF cython code
- PR #2513: Fixing LGTM Analysis Issues
- PR #2099: Raise an error when float64 data is used with dask RF
- PR #2522: Renaming a few arguments in KNeighbors* to be more readable
- PR #2499: Provide access to `cuml.DBSCAN` core samples
- PR #2526: Removing PCA TSQR as a solver due to scalability issues
- PR #2536: Update conda upload versions for new supported CUDA/Python
@@ -69,8 +222,25 @@
- PR #2591: Generate benchmark datasets using `cuml.datasets`
- PR #2548: Fix limitation on number of rows usable with tSNE and refactor memory allocation
- PR #2589: including cuda-11 build fixes into raft
- PR #2599: Add Stratified train_test_split
- PR #2487: Set classes_ attribute during classifier fit
- PR #2605: Reduce memory usage in tSNE
- PR #2611: Adding building doxygen docs to gpu ci
- PR #2631: Enabling use of gtest conda package for build
- PR #2623: Fixing kmeans score() API to be compatible with Scikit-learn
- PR #2629: Add naive_bayes api docs
- PR #2643: 'dense' and 'sparse' values of `storage_type` for FIL
- PR #2691: Generic Base class attribute setter
- PR #2666: Update MBSGD documentation to mention that the model is experimental
- PR #2687: Update xgboost version to 1.2.0dev.rapidsai0.15
- PR #2684: CUDA 11 conda development environment yml and faiss patch
- PR #2648: Replace CNMeM with `rmm::mr::pool_memory_resource`.
- PR #2686: Improve SVM tests
- PR #2692: Changing LBFGS log level
- PR #2705: Add sum operator and base operator overloader functions to cumlarray
- PR #2701: Updating README + Adding ref to UMAP paper
- PR #2721: Update API docs
- PR #2730: Unpin cumlprims in conda recipes for release

## Bug Fixes
- PR #2369: Update RF code to fix set_params memory leak
@@ -94,6 +264,8 @@
- PR #2497: Changes to accommodate cuDF unsigned categorical changes
- PR #2209: Fix FIL benchmark for gpuarray-c input
- PR #2507: Import `treelite.sklearn`
- PR #2521: Fixing invalid smem calculation in KNeighborsClassifier
- PR #2515: Increase tolerance for LogisticRegression test
- PR #2532: Updating doxygen in new MG headers
@@ -105,12 +277,41 @@
- PR #2535: Fix issue with incorrect docker image being used in local build script
- PR #2542: Fix small memory leak in TSNE
- PR #2552: Fixed the length argument of updateDevice calls in RF test
- PR #2565: Fix cell allocation code to avoid loops in quad-tree. Prevent NaNs causing infinite descent
- PR #2563: Update scipy call for arima gradient test
- PR #2569: Fix for cuDF update
- PR #2508: Use keyword parameters in sklearn.datasets.make_* functions
- PR #2587: Attributes for estimators relying on solvers
- PR #2586: Fix SVC decision function data type
- PR #2573: Considering managed memory as device type on checking for KMeans
- PR #2574: Fixing include path in `tsvd_mg.pyx`
- PR #2506: Fix usage of CumlArray attributes on `cuml.common.base.Base`
- PR #2593: Fix inconsistency in train_test_split
- PR #2609: Fix small doxygen issues
- PR #2610: Remove cuDF tolist call
- PR #2613: Removing thresholds from kmeans score tests (SG+MG)
- PR #2616: Small test code fix for pandas dtype tests
- PR #2617: Fix floating point precision error in tSNE
- PR #2625: Update Estimator notebook to resolve errors
- PR #2634: singlegpu build option fixes
- PR #2641: [Breaking] Make `max_depth` in RF compatible with scikit-learn
- PR #2650: Make max_depth behave consistently for max_depth > 14
- PR #2651: AutoARIMA Python bug fix
- PR #2654: Fix for vectorizer concatenations
- PR #2655: Fix C++ RF predict function access of rows/samples array
- PR #2649: Cleanup sphinx doc warnings for 0.15
- PR #2668: Order conversion improvements to account for cupy behavior changes
- PR #2669: Revert PR #2655 "Fixes C++ RF predict function"
- PR #2683: Fix incorrect "Bad CumlArray Use" error messages on test failures
- PR #2695: Fix debug build issue due to incorrect host/device method setup
- PR #2709: Fixing OneHotEncoder Overflow Error
- PR #2710: Fix SVC doc statement about `predict_proba`
- PR #2726: Return correct output type in QN
- PR #2711: Fix Dask RF failure intermittently
- PR #2718: Fix temp directory for py.test
- PR #2719: Set KNeighborsRegressor output dtype according to training target dtype
- PR #2720: Updates to outdated links
- PR #2722: Getting cuML covariance test passing w/ Cupy 7.8 & CUDA 11

# cuML 0.14.0 (03 Jun 2020)

@@ -132,6 +333,7 @@
- PR #2256: Add a `make_arima` generator
- PR #2245: ElasticNet, Lasso and Coordinate Descent MNMG
- PR #2242: Pandas input support with output as NumPy arrays by default
- PR #2551: Add cuML RF multiclass prediction using FIL from python
- PR #1728: Added notebook testing to gpuCI gpu build

## Improvements
@@ -283,6 +485,8 @@
- PR #2295: Fix convert_to_dtype copy even with same dtype
- PR #2305: Fixed race condition in DBScan
- PR #2354: Fix broken links in README
- PR #2619: Explicitly skip raft test folder for pytest 6.0.0
- PR #2788: Set the minimum number of columns that can be sampled to 1 to fix 0 mem allocation error

# cuML 0.13.0 (31 Mar 2020)

6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -24,7 +24,7 @@ into three categories:

### Your first issue

1. Read the project's [README.md](https://github.com/rapidsai/cuml/blob/master/README.md)
1. Read the project's [README.md](https://github.com/rapidsai/cuml/blob/main/README.md)
to learn how to set up the development environment.
2. Find an issue to work on. The best way is to look for the [good first issue](https://github.com/rapidsai/cuml/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
or [help wanted](https://github.com/rapidsai/cuml/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels
@@ -62,12 +62,12 @@ implementation of the issue, ask them in the issue instead of the PR.

The cuML repository has two main branches:

1. `master` branch: it contains the last released version. Only hotfixes are targeted and merged into it.
1. `main` branch: it contains the last released version. Only hotfixes are targeted and merged into it.
2. `branch-x.y`: it is the development branch, which contains the upcoming release. All new features should be based on this branch, and merge/pull requests should target this branch (with the exception of hotfixes).

### Additional details

For every new version `x.y` of cuML there is a corresponding branch called `branch-x.y`, where new feature development starts and where PRs are targeted and merged before the release. The exceptions are 'hotfixes', which address critical issues raised by GitHub users: they target the `master` branch, are merged directly into it, and create a new subversion of the project. When patching an issue that requires a 'hotfix', please state that intent in the PR.
For every new version `x.y` of cuML there is a corresponding branch called `branch-x.y`, where new feature development starts and where PRs are targeted and merged before the release. The exceptions are 'hotfixes', which address critical issues raised by GitHub users: they target the `main` branch, are merged directly into it, and create a new subversion of the project. When patching an issue that requires a 'hotfix', please state that intent in the PR.

For all development, push your changes to a branch (created using the naming instructions below) in your own fork of cuML, then create a pull request when the code is ready.
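
As a rough illustration of the targeting rule described above, the sketch below maps a change to the branch its pull request should target. The function name `pr_base_branch` and its parameters are hypothetical, chosen only for this example; they are not part of cuML or its tooling.

```python
def pr_base_branch(version: str, is_hotfix: bool = False) -> str:
    """Return the branch a pull request should target.

    Hypothetical helper for illustration only; not part of cuML.
    """
    if is_hotfix:
        # Hotfixes are merged directly into the branch holding the last release.
        return "main"
    # All other work targets the development branch of the upcoming release.
    return f"branch-{version}"


print(pr_base_branch("0.16"))                  # feature work for release 0.16
print(pr_base_branch("0.16", is_hotfix=True))  # critical fix for the released version
```

The version string is supplied by the caller, so the same rule applies to any upcoming release `x.y`.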

2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
# From: https://github.com/rapidsai/cudf/blob/master/Dockerfile
# From: https://github.com/rapidsai/cudf/blob/main/Dockerfile
FROM cudf

ENV CONDA_ENV=cudf