[RMP] KDD Tutorial (Merlin Models—Retrieval, Ranking, and Negative Sampling) #269

karlhigley · 2022-05-04T15:10:55Z

Problem:

We have an accepted KDD tutorial (outline can be found here). However, Merlin Models does not yet include all the components needed to build the outline examples. We also lack the experiment results that would let us tell the desired story, namely that Two-tower is better than MF for the retrieval stage and that a two-stage model (Retrieval + Ranking) scores better than Retrieval only.

Goal:

Implement the missing blocks in Merlin Models to support the KDD tutorial notebooks.
- Negative sampling for ranking
- Make blocks part of Model and not of SequentialBlock models#551
- Add Cond Layer to tensorflow combinators models#552
- Update UniformNegativeSampling to handle targets and add optional control for testing models#583
- [BUG] The negative sampling for ranking gives AUC 0.00 value when we add sampling layer to the Model class models#596
  Note: Marc / Oliver to add the PR here
- Note: Negative sampling refactoring for retrieval is not in scope for the KDD tutorial. We will revisit the refactoring in 22.08. Marc to add a roadmap ticket for this.
- Support batch-predict for top-k recommender
  [BUG] to_top_k_recommender model does not have batch_predict() method models#499 (this is not a blocker for KDD tutorial)
  Note: Batch predict is a P2
- Code for generating local-prediction step of two-stage model
- [FEA] cannot properly save and load the TF Retrieval model models#498 (this is not a blocker for KDD tutorial)
Run hyperparameter optimization (HPO) experiments for retrieval and ranking steps:
- Preprocessing of REES46 commerce dataset
- Script for Retrieval models with Two-tower and MF
- [Task] HPO experiments of retrieval models with REES46 dataset for KDD tutorial models#544
- [x] Script for Ranking models with DLRM and DCN
- [Task] HPO experiments of ranking models with REES46 dataset for KDD tutorial models#545
  Note: 22.07 will focus on retrieval. Gabriel to update the status here. These are related to the integration tests, which are focused on retrieval.
- Compute NDCG for ranking models (group predictions by example id and convert them to TF tensors to use the NDCG metric implementation). We have a custom function for this task.
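
The grouped NDCG computation mentioned in the last item can be illustrated with a minimal, framework-free sketch. This is not the custom function referred to above and does not use the TF metric implementation; the function names `dcg` and `grouped_ndcg`, the binary labels, and the default `k` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def dcg(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def grouped_ndcg(example_ids, scores, labels, k=10):
    """Mean NDCG@k, where predictions are grouped by example id (hypothetical helper)."""
    groups = defaultdict(list)
    for ex_id, score, label in zip(example_ids, scores, labels):
        groups[ex_id].append((score, label))

    ndcgs = []
    for rows in groups.values():
        # Order the labels of this group by descending predicted score.
        ranked = [label for _, label in sorted(rows, key=lambda r: r[0], reverse=True)]
        ideal_dcg = dcg(sorted(ranked, reverse=True), k)
        # Groups without any positive label have no defined NDCG and are skipped.
        if ideal_dcg > 0:
            ndcgs.append(dcg(ranked, k) / ideal_dcg)
    return sum(ndcgs) / len(ndcgs) if ndcgs else 0.0


# Usage: two example ids, one ranked perfectly and one with the positive in second place.
ids = ["a", "a", "a", "b", "b"]
preds = [0.9, 0.2, 0.5, 0.1, 0.8]
targets = [1, 0, 0, 1, 0]
print(grouped_ndcg(ids, preds, targets, k=10))
```

Skipping groups with no positives is one possible convention; counting them as zero instead would lower the reported score.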

For 22.08, we will refactor the negative sampling code for retrieval. This may not be ready in time for the KDD tutorial.
Experiments for ranking will have to be done for the KDD tutorial.

Evaluation pipeline of the 2-stage model
Note: Ronay to discuss with Sara and Gabriel and update these tasks. This would mostly be custom functionality for the KDD tutorial.

Constraints:

REES46 has only positive interactions, so we need to generate negative samples for training the ranking model.
Get HPO results that show the Two-tower model is better than MF; the current experiments show the opposite.
Get results of the ranking models to update the outline accordingly.
Save and load back a two-tower model: [FEA] cannot properly save and load the TF Retrieval model models#498
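
To make the first constraint concrete, here is a minimal sketch of uniform negative sampling over a positive-only interaction log. It is a standalone illustration, not the Merlin Models `UniformNegativeSampling` block; the function name `sample_negatives`, the `(user, item)` tuple layout, and the `n_neg` and `seed` parameters are assumptions.

```python
import random


def sample_negatives(positives, item_catalog, n_neg=4, seed=0):
    """Add up to n_neg uniformly sampled unseen items per positive (hypothetical helper).

    positives: list of (user, item) pairs that were observed.
    item_catalog: list of all item ids.
    Returns a list of (user, item, target) rows with target 1 or 0.
    """
    rng = random.Random(seed)

    # Collect the items each user has interacted with, so they are never sampled as negatives.
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    rows = []
    for user, item in positives:
        rows.append((user, item, 1))
        candidates = [i for i in item_catalog if i not in seen[user]]
        for neg in rng.sample(candidates, min(n_neg, len(candidates))):
            rows.append((user, neg, 0))
    return rows


# Usage: three positives, two sampled negatives for each of them.
log = [("u1", "a"), ("u1", "b"), ("u2", "c")]
for row in sample_negatives(log, ["a", "b", "c", "d", "e"], n_neg=2):
    print(row)
```

Rebuilding the candidate list for every positive is fine for a toy example but would be too slow for a dataset the size of REES46, where sampling would normally happen per batch.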

Starting Point:

Proposal: https://docs.google.com/document/d/1NtuE9dKV3q7Amg6RMJbiFDidP486jW4RACJZ4V8ED80
New content doc: https://docs.google.com/document/d/1mPT0QVjdZKh7Pe4O6hQOMljTfWMBLYoSRCxDUPI2Mhs/edit

The proposed outline is:

* Kickoff: 5 min
* Presentation: 25 min
  * Introduction to two-stage recommender systems
* Hands-on Lab
  * Retrieval model: 50 min
    * Introduction to candidate generation and negative sampling techniques
    * GPU-accelerated ETL with NVTabular
      * Introduction to NVTabular
    * Training a MF model with TensorFlow using Merlin Models
    * Training a Two-Tower model with TensorFlow using Merlin Models
      * Generating more features with NVTabular
  * Break: 10 min
  * Two Stage Recommender Models: 45 min
    * Training ranking models using Merlin Models
  * Model evaluation: 30 min
    * Evaluation of two-stage recommender systems
* Wrap up and Q&A: 15 min


rnyak · 2022-05-24T18:15:09Z

Some context:

How do we train a DLRM model without negative interactions, i.e. without an explicit binary target column?

We would need a candidate retrieval stage to generate the candidates, and then train the ranking model with these candidates. In that case, the ranking objective is a binary classification task and the loss is binary cross-entropy. For that, we need to generate our own binary target column based on the output of the candidate retrieval stage.

Do we have an integrated way to feed the candidates from the retrieval model to ranking model? How would that work?

We will need to implement an additional step that generates candidates, saves the data, and loads it back for the ranking step. This functionality is not currently supported.
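
As a rough sketch of how a binary target column could be derived from retrieval output, the snippet below labels each retrieved candidate by whether the user actually interacted with it. This is only an illustration of the idea, not an existing Merlin Models feature; `label_candidates`, the column names, and the input shapes are assumptions.

```python
def label_candidates(retrieved, interactions):
    """Turn top-k retrieval output into ranking training rows (hypothetical helper).

    retrieved: dict mapping a user id to the list of candidate item ids
               returned by the retrieval stage.
    interactions: set of (user, item) pairs observed in the data.
    """
    rows = []
    for user, candidates in retrieved.items():
        for item in candidates:
            # A candidate is a positive only if the user really interacted with it.
            target = int((user, item) in interactions)
            rows.append({"user_id": user, "item_id": item, "target": target})
    return rows


# Usage: "d" was retrieved for u1 but never interacted with, so it becomes a negative.
retrieved = {"u1": ["a", "d"], "u2": ["c"]}
observed = {("u1", "a"), ("u2", "c")}
training_rows = label_candidates(retrieved, observed)
print(training_rows)
```

The resulting rows would still need to be saved and loaded back for the ranking step, which is the part described above as not yet supported.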

sararb · 2022-06-29T16:27:54Z

Progress updates:

Preprocessing of REES46 for retrieval and the generation of negative samples for the ranking stage are done.
HPO experiments of the retrieval model with the REES46 dataset are in progress: [Task] HPO experiments of retrieval models with REES46 dataset for KDD tutorial models#544
HPO experiments of the ranking model with the REES46 dataset are an open task: [Task] HPO experiments of ranking models with REES46 dataset for KDD tutorial models#545

viswa-nvidia · 2022-07-06T02:37:59Z

@bschifferer, please fill in the problem, scope and starting point sections above. They may currently be listed in the comments section; please provide them above for clarity. Let me know if you need any help.

rnyak · 2022-07-10T18:14:45Z

@viswa-nvidia @karlhigley @sararb for viz. Some updates:

Currently, HPO results for the retrieval model on the Ecom-REES dataset do not strongly justify the claim that the TT model is always better than MF. Thus, we will not proceed with this motivation; instead, we will showcase that with different hyperparameters we can get better results with TT.
We are not blocked by Support matrix updates for 22.07 #499 for now.
We have not run HPO experiments for the ranking model yet; they are waiting on the negative sampling API from Marc. But I will create the pipeline asap and run some HPO based on the manual negative sampling function that we created.
Saving and loading back a TT model is still required, and we still get an error there.

viswa-nvidia · 2022-07-21T21:36:53Z

Pushing this to arbitration. This ticket is not in POR and needs to be refactored. There is another ticket covering the negative sampling work: [RMP] Improve negative sampling for retrieval

viswa-nvidia · 2022-07-25T17:20:22Z

@rnyak, please check off the completed work and close this ticket #269

rnyak · 2022-07-25T17:59:33Z

[BUG] The negative sampling for ranking gives AUC 0.00 value when we add sampling layer to the Model class models#596
Note: Marc / Oliver to add the PR here

Let's keep it open due to NVIDIA-Merlin/models#596

viswa-nvidia · 2022-08-08T16:58:00Z

Closing as the bug is tracked separately

karlhigley changed the title ~~[RMP] KDD Tutorial~~ [RMP] KDD Tutorial (Merlin Models Retrieval, Ranking, and Negative Sampling) May 4, 2022

karlhigley changed the title ~~[RMP] KDD Tutorial (Merlin Models Retrieval, Ranking, and Negative Sampling)~~ [RMP] KDD Tutorial (Merlin Models—Retrieval, Ranking, and Negative Sampling) May 4, 2022

karlhigley assigned rnyak and bschifferer May 4, 2022

karlhigley added roadmap epic labels May 20, 2022

EvenOldridge added this to the Merlin 22.07 milestone Jul 4, 2022

EvenOldridge assigned sararb Jul 4, 2022

oliverholworthy mentioned this issue Jul 6, 2022

Add Cond Layer to tensorflow combinators NVIDIA-Merlin/models#552

Merged

viswa-nvidia modified the milestones: Merlin 22.07, Merlin 22.08 Jul 20, 2022

viswa-nvidia closed this as completed Aug 8, 2022
