[Review] Precision recall curve using cupy #2519

daxiongshu · 2020-07-07T23:18:29Z

In this PR, I will refactor _ranking.py so that basic functions can be reused among existing and upcoming metrics.
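Editorial sketch of what such a reusable routine might look like. This is not the code from the PR: the function names below are hypothetical, the layout loosely follows scikit-learn's `_binary_clf_curve`, and NumPy stands in for CuPy (which exposes the same array calls used here) so that the example runs without a GPU.

```python
import numpy as np  # stand-in for cupy; the calls below exist in both libraries


def binary_clf_curve(y_true, y_score):
    """Hypothetical shared helper: cumulative FP/TP counts at each distinct score."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)

    # Sort samples by decreasing score, as in the diff under review.
    ids = np.argsort(-y_score, kind="stable")
    sorted_score = y_score[ids]
    ones = y_true[ids].astype("float32")  # 1.0 marks a positive sample

    # Keep only the last index of each run of tied scores.
    distinct = np.where(np.diff(sorted_score))[0]
    threshold_idxs = np.r_[distinct, y_true.size - 1]

    tps = np.cumsum(ones)[threshold_idxs]  # true positives at each threshold
    fps = 1 + threshold_idxs - tps         # false positives at each threshold
    return fps, tps, sorted_score[threshold_idxs]


def precision_recall_curve(y_true, y_score):
    """Precision/recall pairs built on the shared helper."""
    fps, tps, thresholds = binary_clf_curve(y_true, y_score)
    precision = tps / (tps + fps)
    recall = tps / tps[-1]
    # Stop at the first threshold reaching full recall, then reverse the order
    # and append the conventional end point (precision=1, recall=0).
    last = np.searchsorted(tps, tps[-1])
    sl = slice(last, None, -1)
    return np.r_[precision[sl], 1.0], np.r_[recall[sl], 0.0], thresholds[sl]


def roc_points(y_true, y_score):
    """A second metric reusing the same helper (assumes both classes are present)."""
    fps, tps, thresholds = binary_clf_curve(y_true, y_score)
    return fps / fps[-1], tps / tps[-1], thresholds
```

Both curve functions share the sort and cumulative-sum step, which is the kind of reuse the description refers to.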

sync with upstream

merge with upstream

GPUtester · 2020-07-07T23:21:22Z

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

dantegd

Just a few comments on a first look

python/cuml/metrics/_ranking.py

dantegd · 2020-07-14T01:46:47Z

+    ids = cp.argsort(-y_score)
+    sorted_score = y_score[ids]
+
+    ones = y_true[ids].astype('float32')  # for calculating true positives


why float32?

I just wanted to convert it to float since later on it is used in the RawKernel which expects float data type. float64 should also be fine.

I think float is probably fine then. Technically we could use cuda_kernel_factory (cuml/python/cuml/common/kernel_utils.py, line 55 in 489a7d8):

def cuda_kernel_factory(nvrtc_kernel_str, dtypes, kernel_name=None):

but it is probably not worth it here. float shouldn't pose any problems, correct?

I just tried several variations and it is really surprising:

# trial 1
ones = y_true[ids]
# trial 2
ones = y_true[ids].astype('float')
# trial 3
ones = y_true[ids].astype('float64')

All of them fail with an assertion error because the results do not match sklearn.
Only ones = y_true[ids].astype('float32') works. Please note that in each trial I changed not only this line but every line that uses astype('float32').
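Editorial note: the thread does not identify the cause. One possible explanation, offered here only as an assumption, is a width mismatch. In Python array libraries the dtype name 'float' means float64, so trials 2 and 3 are the same cast, whereas `float` in CUDA C is 32 bits wide. If the kernel parameter is declared as a 32-bit float pointer, a 64-bit buffer would be read with the wrong element size. The sketch below imitates that reinterpretation with NumPy's `view`; it does not run the PR's kernel.

```python
import numpy as np  # stand-in for cupy so the sketch runs on a CPU

# The dtype name 'float' is an alias for float64, not float32.
assert np.dtype("float") == np.dtype("float64")

labels = np.array([1.0, 0.0, 1.0], dtype="float64")

# What a correct cast to 32-bit floats produces.
as_float32 = labels.astype("float32")

# What reading the float64 bytes as 32-bit floats produces: each 8-byte
# value is split into two unrelated 4-byte values.
reinterpreted = labels.view(np.float32)

print(as_float32)
print(reinterpreted)
```

If this is what happened, casting to float32 everywhere (as in the working version) keeps the buffer layout consistent with the kernel's declared type.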

@dantegd Please let me know if this is acceptable. And if there are other changes to be made. Thank you!

Co-authored-by: Dante Gama Dessavre <dante.gamadessavre@gmail.com>

dantegd · 2020-07-14T16:22:24Z

rerun tests

dantegd · 2020-07-14T20:12:39Z

@daxiongshu could you solve the conflicts? The PR is very close, I think it looks good so far!

daxiongshu · 2020-07-23T18:36:51Z

@dantegd merge conflicts fixed. Thank you!

daxiongshu and others added 3 commits June 24, 2020 01:44

Merge pull request #10 from rapidsai/branch-0.15

404ff66

sync with upstream

Merge pull request #11 from rapidsai/branch-0.15

4b518e3

merge with upstream

first commit

84fbce7

daxiongshu requested a review from a team as a code owner July 7, 2020 23:18

Jiwei Liu added 6 commits July 7, 2020 16:42

common routines

64ee85c

first implementation of precision recall curve

d3060fc

test passed

b4ea22e

refactor with _binary_clf_curve

f57adc8

refactor _binary_roc_auc_score

d66a42c

docstring & flake8

226854a

daxiongshu changed the title ~~[WIP] Precision recall curve using cupy~~ [Review] Precision recall curve using cupy Jul 8, 2020

daxiongshu added the "3 - Ready for Review" and "4 - Waiting on Reviewer" labels Jul 8, 2020

dantegd requested changes Jul 14, 2020

dantegd added the "4 - Waiting on Author" label and removed the "3 - Ready for Review" and "4 - Waiting on Reviewer" labels Jul 14, 2020

daxiongshu and others added 5 commits July 13, 2020 22:00

Update python/cuml/metrics/_ranking.py

5bae56d

Co-authored-by: Dante Gama Dessavre <dante.gamadessavre@gmail.com>

Update python/cuml/metrics/_ranking.py

ae49143

Co-authored-by: Dante Gama Dessavre <dante.gamadessavre@gmail.com>

requested changes

9bc1f8e

resolve merge conflicts

25f795b

Merge branch 'rapidsai-branch-0.15' into fea-precision-recall-curve-cupy

8cc9e31

Jiwei Liu added 2 commits July 23, 2020 07:55

fix conflicts

8f0f74b

Merge branch 'rapidsai-branch-0.15' into fea-precision-recall-curve-cupy

b1d4f81

daxiongshu added the "4 - Waiting on Reviewer" label and removed the "4 - Waiting on Author" label Jul 23, 2020

dantegd approved these changes Jul 23, 2020

dantegd merged commit b6ca7a9 into rapidsai:branch-0.15 Jul 23, 2020

daxiongshu deleted the fea-precision-recall-curve-cupy branch August 31, 2021 21:21
