Batch NDCG #342

Merged
merged 10 commits into from
Jan 8, 2024

Conversation

donglihe-hub
Contributor

@donglihe-hub donglihe-hub commented Jan 1, 2024

What does this PR do?

The original NDCG metric computes scores one instance at a time, which is inefficient. The new NDCG metric computes scores for a whole batch at once.
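To make the batch idea concrete, here is a minimal sketch of NDCG@k computed for all rows of a batch at once. It is not the code from this PR: it uses NumPy rather than torch, the name `batch_ndcg` is hypothetical, and it assumes binary targets, `top_k` no larger than the number of labels, and a score of 0 for rows with no relevant label.

```python
import numpy as np


def batch_ndcg(preds, target, top_k):
    """NDCG@top_k per row; preds and target have shape (batch, num_labels)."""
    preds = np.asarray(preds, dtype=float)
    target = np.asarray(target, dtype=float)

    # Indices of the top_k highest-scoring labels in each row.
    top_idx = np.argsort(-preds, axis=1, kind="stable")[:, :top_k]
    # Relevance (0 or 1) of those labels, in ranked order.
    gains = np.take_along_axis(target, top_idx, axis=1)

    # Discount 1 / log2(rank + 1) for rank = 1..top_k.
    discount = 1.0 / np.log2(np.arange(2, top_k + 2))
    dcg = (gains * discount).sum(axis=1)

    # Ideal DCG: with binary targets the best ranking puts
    # min(num_relevant, top_k) relevant labels first, so the ideal DCG
    # is a prefix sum of the discounts.
    cum_discount = np.concatenate(([0.0], np.cumsum(discount)))
    n_rel = np.minimum(target.sum(axis=1), top_k).astype(int)
    idcg = cum_discount[n_rel]

    # Assumed convention: rows without any relevant label score 0.
    return np.divide(dcg, idcg, out=np.zeros_like(dcg), where=idcg > 0)


scores = batch_ndcg(
    preds=[[0.9, 0.1, 0.8], [0.1, 0.9, 0.2], [0.5, 0.4, 0.3]],
    target=[[1, 0, 0], [0, 0, 1], [0, 0, 0]],
    top_k=2,
)
print(scores)
```

The only per-row work left is array indexing, so the Python-level loop over instances disappears; that is the source of the speed-up the description refers to.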

Performance Test Settings:

Number of labels = 100

Batch size = 40
Number of batches = 100
Effective number of validation samples = 4000
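The settings above could be reproduced with a small timing harness along these lines. This is a sketch under assumptions, not the script used for this PR: it uses NumPy, random data, a 5% label density, `top_k=5`, and hypothetical helper names (`make_batches`, `ndcg_single`, `time_metric`). It times a per-instance baseline; any batch implementation with the same call signature can be passed to `time_metric` for comparison.

```python
import time

import numpy as np

NUM_LABELS = 100
BATCH_SIZE = 40
NUM_BATCHES = 100


def make_batches(seed=0, density=0.05):
    """Random (preds, target) pairs; density is an assumed label frequency."""
    rng = np.random.default_rng(seed)
    batches = []
    for _ in range(NUM_BATCHES):
        preds = rng.random((BATCH_SIZE, NUM_LABELS))
        target = (rng.random((BATCH_SIZE, NUM_LABELS)) < density).astype(float)
        batches.append((preds, target))
    return batches


def ndcg_single(pred, target, top_k):
    """NDCG@top_k for one instance with a binary target vector."""
    order = np.argsort(-pred, kind="stable")[:top_k]
    discount = 1.0 / np.log2(np.arange(2, top_k + 2))
    dcg = float((target[order] * discount).sum())
    n_rel = int(min(target.sum(), top_k))
    idcg = float(discount[:n_rel].sum())
    return dcg / idcg if idcg > 0 else 0.0


def per_instance_metric(preds, target, top_k=5):
    """Baseline: loop over the batch one instance at a time."""
    return [ndcg_single(p, t, top_k) for p, t in zip(preds, target)]


def time_metric(metric_fn, batches):
    """Return (number of samples scored, elapsed seconds)."""
    start = time.perf_counter()
    num_scored = 0
    for preds, target in batches:
        num_scored += len(metric_fn(preds, target))
    return num_scored, time.perf_counter() - start


batches = make_batches()
num_samples, seconds = time_metric(per_instance_metric, batches)
print(f"{num_samples} samples scored in {seconds:.4f} s")
```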

Results:

[Image: performance test results]

Test CLI & API (bash tests/autotest.sh)

Test APIs used by main.py.

  • Test Pass
    • (Copy and paste the last outputted line here.)
  • Not Applicable (i.e., the PR does not include API changes.)

Check API Document

If any new APIs are added, please check if the description of the APIs is added to API document.

  • API document is updated (linear, nn)
  • Not Applicable (i.e., the PR does not include API changes.)

Test quickstart & API (bash tests/docs/test_changed_document.sh)

If any APIs in quickstarts or tutorials are modified, please run this test to check if the current examples can run correctly after the modified APIs are released.

Contributor

@Sinacam Sinacam left a comment

We should review the use of plurals vs singulars.

libmultilabel/nn/metrics.py (4 outdated, resolved review threads)
@donglihe-hub
Contributor Author

donglihe-hub commented Jan 3, 2024

We should review the use of plurals vs singulars.

Something that has puzzled me for years is why the convention is preds and target, rather than preds and targets or pred and target. I therefore prefer the singular form for all variable names, except for terms whose plural form is already widely accepted.

libmultilabel/nn/metrics.py Outdated Show resolved Hide resolved
@donglihe-hub donglihe-hub changed the title Optimize NDCG Batch NDCG Jan 5, 2024
@@ -45,6 +45,7 @@ class NDCG(Metric):
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ndcg_score.html
Please find the formal definition here:
https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
The target has to be a binary tensor.
Collaborator

add

We do not use NDCG in ?? because of ??

Contributor Author

@donglihe-hub donglihe-hub Jan 6, 2024

I'm not quite sure what should go in place of the ??. Since Li-Chung only mentions this in function-level comments, I will rewrite it and put it under _idcg(), to stay consistent with Li-Chung's changes and avoid confusing anyone.

@cjlin1 cjlin1 merged commit a3f296d into ASUS-AICS:master Jan 8, 2024
1 check passed
@donglihe-hub donglihe-hub deleted the OptimizeNDCG branch January 8, 2024 08:40
3 participants