Re-implement ROC-AUC. #6747
Conversation
Does this PR also reimplement the learning-to-rank task? Can you point us to relevant literature for the new implementation?
Yes, for the AUC metric. Some code was pulled from #6465.
I will do that later; it's a little bit complicated.
Got it. It would be nice to see how the new implementation is more in line with the latest literature on Learning To Rank (LTR). We could even write a short vignette or technical report to document the new LTR implementation. This will be useful for others who inherit this codebase in the future.
I have adopted some heuristics here. I will explain them in detail once I can get all the tests to pass.
Interesting. Once the implementation is complete and you can explain the heuristics, I'd like to volunteer time to write up a short technical report. That way we can hopefully remember all the pertinent details and still be able to maintain the LTR objective and metrics in the future.
Codecov Report

@@           Coverage Diff            @@
##           master    #6747    +/-  ##
=======================================
  Coverage    81.78%   81.78%
=======================================
  Files           13       13
  Lines         3848     3848
=======================================
  Hits          3147     3147
  Misses         701      701
A smaller PR was extracted into #6749.
Awesome to get multiclass and ranking support. Some minor comments. I'm assuming your math is correct.
It might be a good idea to add hypothesis tests comparing the results against sklearn.
What is performance looking like?
I have some test cases comparing with sklearn on binary and multi-class classification. Ranking is not supported by sklearn. I will try to make the data more random with hypothesis.
I added parametrized tests with different sample sizes. scikit-learn throws an error when the dataset is invalid, so the degenerate case is tested separately. I will run some basic benchmarks later.
Ran a small benchmark on HIGGS.
Nice work.
LGTM. Some comments:
auto s_results = common::Span<float>(results);
// `results` packs three consecutive blocks of n_classes floats each:
// the local curve area, the true-positive count, and the AUC per class.
auto local_area = s_results.subspan(0, n_classes);
auto tp = s_results.subspan(n_classes, n_classes);
auto auc = s_results.subspan(2 * n_classes, n_classes);
I am very happy that we have access to the `Span` abstraction. Imagine having to sub-slice an array with raw pointers 😨
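For readers coming to this later, here is a minimal sketch of what the raw-pointer equivalent of the sub-slicing above might look like. The function name and the reset loop are invented for illustration, and it assumes `results` holds `3 * n_classes` floats:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical raw-pointer version of the Span sub-slicing: every block's
// length must be carried by hand, and nothing checks that an offset stays
// in bounds. Assumes results->size() == 3 * n_classes.
void SliceByHand(std::vector<float>* results, std::size_t n_classes) {
  float* local_area = results->data();                  // n_classes floats
  float* tp         = results->data() + n_classes;      // n_classes floats
  float* auc        = results->data() + 2 * n_classes;  // n_classes floats
  for (std::size_t c = 0; c < n_classes; ++c) {
    local_area[c] = tp[c] = auc[c] = 0.0f;  // e.g. zero the three buffers
  }
}
```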
It would be even better if we had a matrix/tensor library backend. ;-) If I were to start a new ML project, that might be the first thing I'd do... (assuming it's not based on logic)
#pragma omp parallel for
for (omp_ulong c = 0; c < n_classes; ++c) {
Reminder for the future: we may want to change this line if we adopt a parallel scan in the CPU binary AUC. Something like a 2D task block would be necessary?
I don't know what the best way is for the CPU. The GPU version is parallelized by the scan operator.
Maybe we can look at TensorFlow and see how they calculate AUC.
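For context on the scan formulation mentioned above: binary ROC-AUC can be computed in one pass over examples sorted by prediction, where the running true/false-positive counts are exactly prefix sums, which is what makes a parallel scan applicable. Below is a minimal sequential sketch, not the PR's code; the function name and the 0.5 fallback for a single-class dataset are assumptions for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative sketch: binary ROC-AUC via a single pass over examples
// sorted by descending prediction. Each distinct-prediction prefix is one
// (FP, TP) point on the ROC curve, so the area accumulates by the
// trapezoid rule; the running counts are the prefix sums a scan computes.
double BinaryAUC(std::vector<float> const& predts,
                 std::vector<float> const& labels) {
  std::vector<std::size_t> idx(predts.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::sort(idx.begin(), idx.end(), [&](std::size_t a, std::size_t b) {
    return predts[a] > predts[b];  // descending by prediction
  });

  double tp = 0, fp = 0, prev_tp = 0, prev_fp = 0, area = 0;
  for (std::size_t i = 0; i < idx.size(); ++i) {
    if (labels[idx[i]] > 0.5f) { ++tp; } else { ++fp; }
    // Close a trapezoid only when the prediction changes, so tied
    // predictions contribute a single diagonal segment.
    if (i + 1 == idx.size() || predts[idx[i + 1]] != predts[idx[i]]) {
      area += (fp - prev_fp) * (tp + prev_tp) / 2.0;
      prev_tp = tp; prev_fp = fp;
    }
  }
  if (tp == 0 || fp == 0) { return 0.5; }  // assumed degenerate convention
  return area / (tp * fp);  // normalize both axes to [0, 1]
}
```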
 */
float GroupRankingAUC(common::Span<float const> predts,
                      common::Span<float const> labels, float w) {
  // On ranking, we just count all pairs.
This will impose `O(n^2)` cost for the AUC computation, where `n` is the number of data points in the evaluation set. Have you considered using a sampling approach, where a sample of pairs is randomly collected? The LTR objective currently uses this approach and offers a parameter called `num_pairsample` to control the sampling.
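To make the cost concrete, counting all pairs amounts to something like the sketch below. This is illustrative only, not the PR's code: the function name is invented, instance weights (the `w` argument above) are omitted, and the 0.5 return for a group with no comparable pairs is an assumed convention:

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: AUC of one query group as the fraction of
// comparable pairs ordered correctly by the predictions, counting tied
// predictions as half. The nested loops are the O(n^2) cost in question.
double PairwiseGroupAUC(std::vector<float> const& predts,
                        std::vector<float> const& labels) {
  double concordant = 0.0, comparable = 0.0;
  for (std::size_t i = 0; i < predts.size(); ++i) {
    for (std::size_t j = i + 1; j < predts.size(); ++j) {
      if (labels[i] == labels[j]) { continue; }  // equal labels: not comparable
      comparable += 1.0;
      if (predts[i] == predts[j]) {
        concordant += 0.5;  // tied prediction counts half
      } else if ((predts[i] > predts[j]) == (labels[i] > labels[j])) {
        concordant += 1.0;  // predictions order the pair correctly
      }
    }
  }
  return comparable > 0.0 ? concordant / comparable : 0.5;  // assumed fallback
}
```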
Actually, I want to remove the random sampling in the objective and replace it with a top-k parameter.
I implemented AUC for ranking mostly for compatibility reasons. I don't think it's a good metric for ranking: averaging between groups destroys all the AUC properties.
@trivialfis Hi, I've been looking through xgboost versions and stumbled upon this discussion. Could you please elaborate on this idea that averaging between groups destroys the AUC properties? Or could you give a link to read about it? Thanks in advance.
The AUC is "area under the curve", which is a normalized ratio with a geometric interpretation. One cannot average ratios directly.
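A quick numeric illustration (added here, not from the original thread): the mean of two ratios is generally not the ratio of the pooled quantities.

$$\frac{1}{2}\left(\frac{1}{2} + \frac{9}{10}\right) = 0.7 \qquad \text{but} \qquad \frac{1 + 9}{2 + 10} = \frac{10}{12} \approx 0.83$$

Averaging per-group AUCs weights every group equally, no matter how many comparable pairs each group contains, so the averaged number loses the usual interpretation of AUC as the probability that a randomly chosen positive is ranked above a randomly chosen negative.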
Thx!
* Binary
* MultiClass
* LTR
* Add documents.

This PR resolves a few issues:

- Define a value for when the dataset is invalid, which can happen if the dataset is empty or contains only positive or only negative samples.
- Define ROC-AUC for multi-class classification.
- Define a weighted average value for the distributed setting (see the sketch after this list).
- Provide a correct implementation for the learning-to-rank task. The previous implementation was just binary classification with averaging across groups, which doesn't measure ordered learning to rank.
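A minimal sketch of the distributed weighted average mentioned above. This is an assumption about the mechanism, not the PR's code: it supposes each worker contributes its local AUC weighted by its local data weight, and the function name and 0.5 fallback are invented for illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: combine per-worker AUCs into one weighted average.
// local_auc[i] and local_weight[i] are the AUC and total data weight
// gathered from worker i (e.g. via an allreduce).
double DistributedAUC(std::vector<double> const& local_auc,
                      std::vector<double> const& local_weight) {
  double weighted_sum = 0.0;
  double total_weight = 0.0;
  for (std::size_t i = 0; i < local_auc.size(); ++i) {
    weighted_sum += local_auc[i] * local_weight[i];
    total_weight += local_weight[i];
  }
  // Assumed convention: fall back to 0.5 when no worker has valid data.
  return total_weight > 0.0 ? weighted_sum / total_weight : 0.5;
}
```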
@hcho3 All style comments are addressed.