Add next-basket recommendation evaluation method #545

Merged
lthoang merged 28 commits into PreferredAI:master from next-basket-recommendation on Nov 27, 2023

Conversation

lthoang
Member

@lthoang lthoang commented Nov 9, 2023

Description

Next-basket recommendation takes a user's basket history (e.g., a sequence of previously purchased baskets) as input and predicts the next basket (a set of items).
To support basket data, we must allow items to repeat across a user's baskets.
To support next-basket recommendation, a model should be a NextBasketRecommender with the following functions implemented:

  • fit()
  • score()
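To make the fit()/score() contract concrete, here is a minimal sketch of a frequency-based recommender over basket histories. It is not Cornac's NextBasketRecommender: the class name `ToyNextBasketRecommender`, the method signatures, and the scoring rule are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence


class ToyNextBasketRecommender:
    """Hypothetical stand-in, NOT Cornac's NextBasketRecommender."""

    def __init__(self) -> None:
        self.global_counts: Counter = Counter()
        self.items: List[str] = []

    def fit(self, histories: Dict[str, Sequence[Sequence[str]]]) -> "ToyNextBasketRecommender":
        # Count every occurrence, so an item bought in several baskets
        # (a repeated item) is counted several times.
        for baskets in histories.values():
            for basket in baskets:
                self.global_counts.update(basket)
        self.items = sorted(self.global_counts)
        return self

    def score(self, history_baskets: Sequence[Sequence[str]]) -> Dict[str, float]:
        # Personal purchase counts dominate; the global frequency,
        # scaled to at most 1, only breaks ties between equal counts.
        personal = Counter(item for basket in history_baskets for item in basket)
        total = sum(self.global_counts.values()) or 1
        return {
            item: personal[item] + self.global_counts[item] / total
            for item in self.items
        }


# Toy data: each user maps to a time-ordered list of baskets.
histories = {
    "u1": [["milk", "bread"], ["milk", "eggs"]],
    "u2": [["bread"], ["bread", "tea"]],
}
model = ToyNextBasketRecommender().fit(histories)
scores = model.score(histories["u1"])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The predicted next basket is then the top-k prefix of `ranked`. Note that score() receives the whole basket sequence rather than a single user index, which is the main difference from a conventional user-item recommender.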

TODO:

The conventional metrics for evaluating next-basket recommendation are Recall, NDCG, and PHR. This paper also evaluates performance on repeating and exploring items separately.
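As a rough illustration of these metrics for a single user, the sketch below computes Recall@k, NDCG@k with binary relevance, and a per-user hit indicator. Averaging that indicator over users gives a hit ratio, which is how PHR is commonly defined. The function names are hypothetical and the formulas are the textbook versions; the implementation merged in this PR may differ in detail.

```python
import math
from typing import Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    # Fraction of the true next-basket items found in the top k.
    hits = sum(1 for item in ranked[:k] if item in truth)
    return hits / len(truth) if truth else 0.0


def hit_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    # 1.0 if at least one true item is in the top k, else 0.0.
    return 1.0 if any(item in truth for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked: Sequence[str], truth: Set[str], k: int) -> float:
    # Binary relevance: a hit at 0-based position p contributes 1 / log2(p + 2).
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in truth
    )
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(truth))))
    return dcg / ideal if ideal > 0 else 0.0


def split_repeat_explore(truth: Set[str], history_items: Set[str]) -> Tuple[Set[str], Set[str]]:
    # Repeat items were bought before; explore items are new to the user.
    return truth & history_items, truth - history_items


ranked = ["milk", "bread", "eggs", "tea"]
truth = {"milk", "tea"}
recall = recall_at_k(ranked, truth, k=2)
hit = hit_at_k(ranked, truth, k=2)
ndcg = ndcg_at_k(ranked, truth, k=2)
repeat_items, explore_items = split_repeat_explore(truth, {"milk", "bread", "eggs"})
```

Evaluating repeating and exploring items separately then amounts to calling the same metric functions with `repeat_items` or `explore_items` as the ground truth.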

Related Issues

#543

Checklist:

  • I have added tests.
  • I have updated the documentation accordingly.
  • I have updated README.md (if you are adding a new model).
  • I have updated examples/README.md (if you are adding a new example).
  • I have updated datasets/README.md (if you are adding a new dataset).

@lthoang lthoang force-pushed the next-basket-recommendation branch 2 times, most recently from 0e1a497 to 6671a15 Compare November 9, 2023 14:44
@tqtg
Member

tqtg commented Nov 9, 2023

@lthoang Can we have a short description of the philosophy behind this? And some best practices, if any.

cornac/data/basket_dataset.py (outdated, resolved)
@tqtg
Member

tqtg commented Nov 9, 2023

Overall comments:

  • It seems that we can avoid a lot of redundancy by inheriting existing base classes. This will help future development and maintenance.
  • Please also think about how this fits in with next-item recommendation. Can we have one evaluation method that supports both scenarios, or do we need to separate them?
  • For the GPTop baseline, it would be good to have it in the model list with a link to the paper that describes it, for better reference.

@lthoang @hieuddo

@lthoang lthoang force-pushed the next-basket-recommendation branch 4 times, most recently from f6e51dd to 028e1b1 Compare November 16, 2023 07:00
@lthoang lthoang force-pushed the next-basket-recommendation branch 4 times, most recently from aeac809 to d85f1a7 Compare November 19, 2023 15:25
@lthoang lthoang marked this pull request as ready for review November 20, 2023 10:04
@lthoang lthoang force-pushed the next-basket-recommendation branch from efe4f32 to 6bb0a24 Compare November 20, 2023 12:09
@lthoang lthoang requested a review from tqtg November 20, 2023 12:11
@lthoang lthoang force-pushed the next-basket-recommendation branch from 8f51f5a to dbd59cb Compare November 22, 2023 15:28
@tqtg
Member

tqtg commented Nov 22, 2023

@lthoang One idea: how about maintaining the UIR tuple as it is and keeping basket ids as another array, like timestamps? It would be backward compatible with the current Dataset, and we could use UBIT datasets for other models that are not next-basket recommenders.
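A minimal sketch of this idea, using plain Python lists rather than Cornac's Dataset: the user, item, and rating arrays stay as they are, while basket ids and timestamps are extra arrays aligned by position. The variable names and the `group_baskets` helper are hypothetical, not part of Cornac.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Parallel arrays: position n describes one (user, item, rating) interaction.
users = ["u1", "u1", "u1", "u2", "u1"]
items = ["milk", "bread", "milk", "tea", "eggs"]
ratings = [1.0, 1.0, 1.0, 1.0, 1.0]
# Hypothetical extra arrays, aligned with the UIR arrays above.
basket_ids = ["b1", "b1", "b2", "b3", "b2"]
timestamps = [10, 10, 20, 15, 20]


def group_baskets(
    users: Sequence[str],
    items: Sequence[str],
    basket_ids: Sequence[str],
    timestamps: Sequence[int],
) -> Dict[str, List[List[str]]]:
    """Return each user's baskets ordered by time, keeping repeated items."""
    baskets: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    first_seen: Dict[Tuple[str, str], int] = {}
    for user, item, bid, ts in zip(users, items, basket_ids, timestamps):
        key = (user, bid)
        baskets[key].append(item)
        first_seen[key] = min(ts, first_seen.get(key, ts))
    histories: Dict[str, List[List[str]]] = defaultdict(list)
    # Visit baskets by earliest timestamp so each user's list is chronological.
    for key in sorted(baskets, key=lambda k: first_seen[k]):
        histories[key[0]].append(baskets[key])
    return dict(histories)


# Models that ignore baskets can keep consuming plain UIR triples.
uir_triples = list(zip(users, items, ratings))
# Next-basket models read the same data through the basket view.
histories = group_baskets(users, items, basket_ids, timestamps)
```

Because the basket ids live in a separate array, dropping them leaves exactly the UIR data that existing models already expect, which is the backward compatibility the comment refers to.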

cornac/data/dataset.py (resolved)
cornac/data/reader.py (outdated, resolved)
cornac/metrics/ranking.py (outdated, resolved)
cornac/models/gp_top/recom_gp_top.py (resolved)
@tqtg
Member

tqtg commented Nov 24, 2023

@lthoang Please consider the comments above. In addition, I refactored NextBasketRecommender and NextBasketEvaluation, mainly to simplify things without changing the logic. Below is the result of running the gp_top_tafeng.py example; please double-check that this is what we should expect.

VALIDATION:
...
      | NDCG@10 | NDCG@20 | NDCG@50 | Recall@10 | Recall@20 | Recall@50 | Time (s)
----- + ------- + ------- + ------- + --------- + --------- + --------- + --------
PTop  |  0.0062 |  0.0064 |  0.0067 |    0.0090 |    0.0097 |    0.0109 |  18.2922
GTop  |  0.0228 |  0.0278 |  0.0359 |    0.0367 |    0.0553 |    0.0932 |  88.7690
GPTop |  0.0239 |  0.0290 |  0.0373 |    0.0410 |    0.0602 |    0.0991 |  94.4805

TEST:
...
      | NDCG@10 | NDCG@20 | NDCG@50 | Recall@10 | Recall@20 | Recall@50 | Train (s) | Test (s)
----- + ------- + ------- + ------- + --------- + --------- + --------- + --------- + --------
PTop  |  0.0124 |  0.0133 |  0.0137 |    0.0192 |    0.0224 |    0.0240 |    0.0003 |  24.1674
GTop  |  0.0264 |  0.0326 |  0.0425 |    0.0385 |    0.0597 |    0.1024 |    0.0774 | 114.1030
GPTop |  0.0295 |  0.0363 |  0.0467 |    0.0483 |    0.0717 |    0.1168 |    0.0783 | 122.7305

@tqtg
Member

tqtg commented Nov 25, 2023

@lthoang I made some changes to reuse the code better. We're almost there. Please check if the following result looks fine.

VALIDATION:
...
      | NDCG@10 | NDCG@20 | NDCG@50 | Recall@10 | Recall@20 | Recall@50 | Time (s)
----- + ------- + ------- + ------- + --------- + --------- + --------- + --------
PTop  |  0.0494 |  0.0525 |  0.0572 |    0.0438 |    0.0578 |    0.0712 |   2.6300
GTop  |  0.0492 |  0.0549 |  0.0703 |    0.0398 |    0.0587 |    0.1011 |  10.3374
GPTop |  0.0672 |  0.0767 |  0.0951 |    0.0594 |    0.0908 |    0.1414 |  11.4246

TEST:
...
      | NDCG@10 | NDCG@20 | NDCG@50 | Recall@10 | Recall@20 | Recall@50 | Train (s) | Test (s)
----- + ------- + ------- + ------- + --------- + --------- + --------- + --------- + --------
PTop  |  0.0489 |  0.0528 |  0.0586 |    0.0436 |    0.0589 |    0.0742 |    0.0003 |   6.0762
GTop  |  0.0497 |  0.0561 |  0.0722 |    0.0394 |    0.0596 |    0.1046 |    0.0767 |  25.9023
GPTop |  0.0697 |  0.0788 |  0.0987 |    0.0628 |    0.0915 |    0.1460 |    0.0769 |  28.5510

@lthoang lthoang force-pushed the next-basket-recommendation branch from 27338be to 6444303 Compare November 25, 2023 23:14
@lthoang
Member Author

lthoang commented Nov 25, 2023

The result looks fine. I have updated the example. The new results are as follows:

VALIDATION:
...
      | HitRatio@10 | HitRatio@50 | NDCG@10 | NDCG@50 | Recall@10 | Recall@50 | Time (s)
----- + ----------- + ----------- + ------- + ------- + --------- + --------- + --------
PTop  |      0.3701 |      0.5000 |  0.0836 |  0.0997 |    0.0779 |    0.1290 |   1.7190
GTop  |      0.2606 |      0.5121 |  0.0562 |  0.0791 |    0.0438 |    0.1105 |   6.8573
GPTop |      0.4003 |      0.6722 |  0.0917 |  0.1269 |    0.0861 |    0.1869 |   7.2663

TEST:
...
      | HitRatio@10 | HitRatio@50 | NDCG@10 | NDCG@50 | Recall@10 | Recall@50 | Train (s) | Test (s)
----- + ----------- + ----------- + ------- + ------- + --------- + --------- + --------- + --------
PTop  |      0.3533 |      0.4962 |  0.0794 |  0.0953 |    0.0689 |    0.1211 |    0.0004 |   4.4665
GTop  |      0.2418 |      0.4820 |  0.0540 |  0.0746 |    0.0402 |    0.1013 |    0.0812 |  17.1021
GPTop |      0.3802 |      0.6500 |  0.0857 |  0.1181 |    0.0784 |    0.1740 |    0.0807 |  18.1752

@tqtg
Member

tqtg commented Nov 27, 2023

@lthoang LGTM

@lthoang lthoang merged commit 4af34f2 into PreferredAI:master Nov 27, 2023
12 checks passed
@lthoang lthoang deleted the next-basket-recommendation branch November 27, 2023 08:19