[Review] Precision recall curve using cupy #2519

Merged

daxiongshu
Contributor

@daxiongshu daxiongshu commented Jul 7, 2020

In this PR, I will refactor _ranking.py so that basic functions can be reused among existing and upcoming metrics.
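For context, a precision-recall curve can be built from a descending sort of the scores followed by cumulative sums. The following is a minimal NumPy sketch under that assumption, not the CuPy code in this PR; the function name `precision_recall_sketch` is hypothetical, and the output is not arranged in the same order or format as sklearn's `precision_recall_curve`.

```python
import numpy as np

def precision_recall_sketch(y_true, y_score):
    """Illustrative precision-recall computation; assumes binary 0/1 labels
    with at least one positive example."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)

    # Sort by descending score (the PR uses cp.argsort(-y_score) for this step).
    ids = np.argsort(-y_score, kind="stable")
    sorted_score = y_score[ids]
    ones = y_true[ids].astype("float32")

    # Cumulative true/false positives when thresholding at each sorted position.
    tps = np.cumsum(ones)
    fps = np.cumsum(1.0 - ones)

    # Keep only the last position of each run of equal scores,
    # so that every distinct score yields exactly one threshold.
    last = np.r_[np.nonzero(np.diff(sorted_score))[0], sorted_score.size - 1]
    tps, fps = tps[last], fps[last]

    precision = tps / (tps + fps)
    recall = tps / tps[-1]
    return precision, recall, sorted_score[last]
```

Because the sort and the cumulative sums are independent of any particular metric, steps like these are natural candidates for shared helpers.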

@daxiongshu daxiongshu requested a review from a team as a code owner July 7, 2020 23:18
@GPUtester
Contributor

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@daxiongshu daxiongshu changed the title [WIP] Precision recall curve using cupy [Review] Precision recall curve using cupy Jul 8, 2020
@daxiongshu daxiongshu added the 3 - Ready for Review and 4 - Waiting on Reviewer labels Jul 8, 2020
Member

@dantegd dantegd left a comment

Just a few comments on a first look

python/cuml/metrics/_ranking.py
ids = cp.argsort(-y_score)
sorted_score = y_score[ids]

ones = y_true[ids].astype('float32') # for calculating true positives
Member

why float32?

Contributor Author

I just wanted to convert it to a float type, since it is later passed to the RawKernel, which expects float data. float64 should also be fine.

Member

I think float is probably fine then. Technically we could use

def cuda_kernel_factory(nvrtc_kernel_str, dtypes, kernel_name=None):
but it's probably not worth it here. float shouldn't pose any problems, correct?
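As a rough illustration of the idea behind a dtype-aware kernel factory, the sketch below specializes a CUDA source string for a given dtype by plain string substitution. Everything in it is hypothetical: `C_TYPE_NAMES`, `KERNEL_TEMPLATE`, and `render_kernel_source` are made-up names, the `scale` kernel is a toy example, and none of it is meant to represent how `cuda_kernel_factory` is actually implemented. It only renders source text and does not compile anything.

```python
import numpy as np

# Hypothetical mapping from NumPy dtypes to C type names (not cuML's table).
C_TYPE_NAMES = {
    np.dtype("float32"): "float",
    np.dtype("float64"): "double",
    np.dtype("int32"): "int",
}

# Hypothetical kernel template; "{0}" is replaced by the C type name.
KERNEL_TEMPLATE = """
extern "C" __global__
void scale(const {0}* x, {0}* out, {0} factor, int n) {{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) out[i] = x[i] * factor;
}}
"""

def render_kernel_source(template, dtype):
    """Return CUDA source specialized for `dtype` (string substitution only)."""
    return template.format(C_TYPE_NAMES[np.dtype(dtype)])

# A float64 input gets a kernel whose pointers are declared as `double*`.
source = render_kernel_source(KERNEL_TEMPLATE, "float64")
```

With this kind of specialization the kernel's pointer type follows the array's dtype, so the caller does not need to cast inputs to one fixed type first.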

Contributor Author

I just tried several variations and it is really surprising:

# trial 1
ones = y_true[ids]

# trial 2
ones = y_true[ids].astype('float')

# trial 3
ones = y_true[ids].astype('float64')

All of them fail with an assertion error because the results do not match sklearn's.
Only ones = y_true[ids].astype('float32') works. Please note that in each trial I changed not only this line but every line that uses astype('float32').
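One plausible explanation, which is not confirmed anywhere in this thread, is a width mismatch: in NumPy (and CuPy follows the same dtype naming) `'float'` is an alias for `float64`, whereas a C `float` in a kernel is 32 bits wide. If the RawKernel declares its arguments as `float*`, any buffer that is not float32 would be read with the wrong layout. The NumPy sketch below mimics that misreading on the host with `view`; it assumes nothing about the actual kernel in this PR.

```python
import numpy as np

# In NumPy, the dtype string 'float' means 64-bit floating point.
assert np.dtype("float") == np.dtype("float64")
assert np.dtype("float64").itemsize == 8
assert np.dtype("float32").itemsize == 4

labels = np.array([1, 0, 1, 1], dtype="float64")

# Reinterpreting the same bytes as 32-bit floats mimics what a kernel
# declared with `float*` would see if it were handed a float64 buffer.
misread = labels.view("float32")

# Twice as many elements, and the leading values no longer match the labels.
print(misread.size, misread[: labels.size])
```

Under that assumption, trials 2 and 3 are really the same case, and trial 1 would fail in a similar way whenever `y_true` is stored as an integer type.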

Contributor Author

@dantegd Please let me know if this is acceptable. And if there are other changes to be made. Thank you!

@dantegd dantegd added the 4 - Waiting on Author label and removed the 3 - Ready for Review and 4 - Waiting on Reviewer labels Jul 14, 2020
daxiongshu and others added 5 commits July 13, 2020 22:00
@dantegd
Member

dantegd commented Jul 14, 2020

rerun tests

@dantegd
Member

dantegd commented Jul 14, 2020

@daxiongshu could you resolve the merge conflicts? The PR is very close; I think it looks good so far!

@daxiongshu
Contributor Author

@dantegd merge conflicts fixed. Thank you!

@daxiongshu daxiongshu added the 4 - Waiting on Reviewer label and removed the 4 - Waiting on Author label Jul 23, 2020
@dantegd dantegd merged commit b6ca7a9 into rapidsai:branch-0.15 Jul 23, 2020
@daxiongshu daxiongshu deleted the fea-precision-recall-curve-cupy branch August 31, 2021 21:21