[RMP] KDD Tutorial (Merlin Models—Retrieval, Ranking, and Negative Sampling) #269

Closed
9 of 13 tasks
karlhigley opened this issue May 4, 2022 · 8 comments

karlhigley commented May 4, 2022

Problem:

We have an accepted KDD tutorial (outline can be found here), but Merlin Models does not yet include all the components needed to build the examples in the outline. We also lack the experiment results needed to tell the desired story, i.e., that Two-Tower is better than MF for the retrieval stage and that a two-stage model (retrieval + ranking) scores better than retrieval only.

Goal:

For 22.08, we will refactor the negative sampling code for retrieval. This may not land in time for the KDD tutorial.
The ranking experiments will have to be done for the KDD tutorial.

  • Evaluation pipeline of the 2-stage model
  • Note: Ronay to discuss with Sara and Gabriel and update these tasks. This would mostly be custom functionality for the KDD tutorial.
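For illustration, an evaluation of the two-stage setup could compare recall@k on the retrieval order against recall@k after re-ranking the same candidates. The following is a minimal sketch in plain Python, not Merlin Models code: the function names `recall_at_k` and `rerank`, the item ids, and the scores are all hypothetical, and the toy numbers are made up to show the mechanics, not experiment results.

```python
def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of relevant items that appear in the top-k of a ranked list."""
    if not relevant_items:
        return 0.0
    hits = len(set(ranked_items[:k]) & set(relevant_items))
    return hits / len(relevant_items)


def rerank(candidates, ranking_scores):
    """Order retrieved candidates by the ranking model's score, highest first."""
    return sorted(candidates, key=lambda item: ranking_scores[item], reverse=True)


# Hypothetical toy data for a single user (assumption, not from the experiments).
retrieved = ["a", "b", "c", "d", "e"]  # order produced by the retrieval stage
ranking_scores = {"a": 0.1, "b": 0.3, "c": 0.2, "d": 0.9, "e": 0.8}
relevant = ["d", "e"]  # held-out items the user actually interacted with

retrieval_only = recall_at_k(retrieved, relevant, k=2)
two_stage = recall_at_k(rerank(retrieved, ranking_scores), relevant, k=2)
print(retrieval_only, two_stage)
```

In a real pipeline the two metrics would be averaged over all users in a held-out set, and the ranking scores would come from the trained ranking model rather than a hand-written dictionary.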

Constraints:

  • REES46 has only positive interactions, so we need to generate negative samples to train the ranking model.
  • Get HPO results that show the Two-Tower model is better than MF; the current experiments show the opposite.
  • Get results for the ranking models and update the outline accordingly.
  • Save and load back a Two-Tower model: [FEA] cannot properly save and load the TF Retrieval model models#498
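As a rough illustration of the first constraint, the sketch below adds uniformly sampled negatives to a positive-only interaction log. It is plain Python and not the Merlin Models negative sampling API; the function name `sample_negatives`, its parameters, and the toy ids are assumptions made for this example. Uniform sampling is only one possible strategy.

```python
import random


def sample_negatives(positives, all_items, num_neg=1, seed=0):
    """Return (user, item, label) rows: each positive plus sampled negatives.

    positives: list of (user_id, item_id) pairs observed in the data.
    all_items: list of every item id in the catalog.
    num_neg:   negatives to draw per positive (fewer if the pool is smaller).
    """
    rng = random.Random(seed)

    # Collect the items each user has interacted with.
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    rows = []
    for user, item in positives:
        rows.append((user, item, 1))
        # Any item the user has not interacted with is treated as a negative.
        pool = [i for i in all_items if i not in seen[user]]
        for neg in rng.sample(pool, min(num_neg, len(pool))):
            rows.append((user, neg, 0))
    return rows


# Hypothetical toy interactions (assumption, not REES46 data).
positives = [("u1", "a"), ("u1", "b"), ("u2", "c")]
rows = sample_negatives(positives, all_items=["a", "b", "c", "d", "e"], num_neg=2)
```

Treating every unseen item as a negative is a simplification: some of those items may simply not have been shown to the user yet.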

Starting Point:

Proposal: https://docs.google.com/document/d/1NtuE9dKV3q7Amg6RMJbiFDidP486jW4RACJZ4V8ED80
New content doc: https://docs.google.com/document/d/1mPT0QVjdZKh7Pe4O6hQOMljTfWMBLYoSRCxDUPI2Mhs/edit

The proposed outline is:

* Kickoff: 5 min
* Presentation: 25 min
    * Introduction to two-stage recommender systems
* Hands-on Lab
    * Retrieval model: 50 min
        * Introduction to candidate generation and negative sampling techniques
        * GPU-accelerated ETL with NVTabular
            * Introduction to NVTabular
        * Training a MF model with TensorFlow using Merlin Models
        * Training a Two-Tower model with TensorFlow using Merlin Models
            * Generating more features with NVTabular
    * Break: 10 min
    * Two-Stage Recommender Models: 45 min
        * Training ranking models using Merlin Models
    * Model evaluation: 30 min
        * Evaluation of two-stage recommender systems
* Wrap up and Q&A: 15 min
@karlhigley karlhigley changed the title [RMP] KDD Tutorial [RMP] KDD Tutorial (Merlin Models Retrieval, Ranking, and Negative Sampling) May 4, 2022
@karlhigley karlhigley changed the title [RMP] KDD Tutorial (Merlin Models Retrieval, Ranking, and Negative Sampling) [RMP] KDD Tutorial (Merlin Models—Retrieval, Ranking, and Negative Sampling) May 4, 2022

rnyak commented May 24, 2022

Some context:

How do we train a DLRM model without negative interactions, i.e., without an explicit binary target column?

  • We would need a candidate retrieval stage to generate the candidates, and then train the ranking model on those candidates. In that case, the ranking objective is a binary classification task and the loss is binary cross-entropy. For that, we need to generate our own binary target column from the output of the candidate retrieval stage.
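A minimal sketch of that idea, assuming the retrieval stage returns (user, item) candidate pairs: a candidate gets target 1 if it matches an observed interaction and 0 otherwise, and the ranking loss is the usual binary cross-entropy. The helper names `label_candidates` and `binary_cross_entropy`, the toy ids, and the predicted probabilities are hypothetical; this is plain Python, not Merlin Models or TensorFlow code.

```python
import math


def label_candidates(candidates, positives):
    """Attach a binary target to each retrieved (user, item) candidate."""
    positive_set = set(positives)
    return [(u, i, 1 if (u, i) in positive_set else 0) for u, i in candidates]


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Hypothetical retrieval output and observed interactions (assumptions).
candidates = [("u1", "a"), ("u1", "d"), ("u2", "c"), ("u2", "e")]
positives = [("u1", "a"), ("u2", "c")]

labeled = label_candidates(candidates, positives)
labels = [y for _, _, y in labeled]

# Stand-in for the ranking model's predicted click probabilities.
loss = binary_cross_entropy(labels, probs=[0.9, 0.2, 0.8, 0.1])
```

Candidates that were retrieved but never interacted with act as the negatives here, which tends to give harder negatives than sampling uniformly from the catalog.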

Do we have an integrated way to feed the candidates from the retrieval model to the ranking model? How would that work?

  • We will need to implement an additional step that generates the candidates, saves the data, and loads it back for the ranking step. We do not currently support this functionality.
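The hand-off described above could look roughly like the sketch below, which writes retrieved candidates to disk and reads them back as input for the ranking step. Everything here is an assumption for illustration: the function names, the column names, and the CSV format (a Merlin/NVTabular pipeline would more likely use Parquet). It uses only the Python standard library.

```python
import csv
import os
import tempfile


def save_candidates(path, rows):
    """Write (user_id, item_id, retrieval_score) rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["user_id", "item_id", "retrieval_score"])
        writer.writerows(rows)


def load_candidates(path):
    """Read the candidates back as (user_id, item_id, retrieval_score) tuples."""
    with open(path, newline="") as f:
        return [
            (r["user_id"], r["item_id"], float(r["retrieval_score"]))
            for r in csv.DictReader(f)
        ]


# Hypothetical output of the retrieval stage (assumption).
retrieved = [("u1", "a", 0.91), ("u1", "d", 0.40), ("u2", "c", 0.77)]

path = os.path.join(tempfile.mkdtemp(), "candidates.csv")
save_candidates(path, retrieved)
ranking_input = load_candidates(path)
```

Persisting the candidates decouples the two stages, so the ranking model can be trained and tuned repeatedly without re-running retrieval.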


sararb commented Jun 29, 2022

Progress updates:

@EvenOldridge EvenOldridge added this to the Merlin 22.07 milestone Jul 4, 2022
@viswa-nvidia

@bschifferer, please fill in the problem, scope, and starting point sections above. The details may already be in the comments section; please add them above for clarity. Let me know if you need any help.


rnyak commented Jul 10, 2022

@viswa-nvidia @karlhigley @sararb for visibility. Some updates:

  • Currently, the HPO results for the retrieval model on the Ecom-REES dataset do not strongly justify the claim that the TT model is always better than MF. We will therefore drop this motivation and instead show that, with different hyper-parameters, we can get better results with TT.
  • We are not blocked by Support matrix updates for 22.07 #499 for now.
  • We have not yet run HPO experiments for the ranking model; they are waiting on the negative sampling API from Marc. In the meantime, I will create the pipeline ASAP and run some HPO using the manual negative sampling function that we created.
  • Saving and loading back a TT model is still required, and we still get an error there.


viswa-nvidia commented Jul 21, 2022

Pushing this to arbitration. This ticket is not in the POR, and it needs to be refactored. Another ticket covers the negative sampling work: [RMP] Improve negative sampling for retrieval

@viswa-nvidia

@rnyak, please check off the completed work and close this ticket (#269).


rnyak commented Jul 25, 2022

@viswa-nvidia

Closing, as the bug is tracked separately.

6 participants