ML Design Interview

Tackling the ML design interview, with a focus on recommender systems.

Questions are generally open-ended, so you can dive deep into the solution. It is important to design the solution while thinking about data analysis, potential use cases, the users, scalability, limitations, and tradeoffs.

Problem Navigation

Visualize and organize the entire problem and solution space.

  • Define the ultimate goal.
    • Example: "improve user engagement", "improve profits without affecting user enjoyment"
  • Define proxy events, and business metrics based on them, to measure our progress.
    • Example: "sessions per week", "session length", "click-through rate", "revenue" (for ads), "daily active users", etc. (two of these are sketched right after this list)

Training Data

Identify methods to collect training data. Analyze the constraints/risks associated with each proposed method.

  • Clarifying questions:
    • What data do we already have?
      • User metadata: demographic data (gender, age-group, location)
      • Ad/post metadata: ad/post contents, tags/topics
      • User engagement data: ad/post clicks/likes, ad/post impressions (with view-time), users' own posts, etc.
    • What is the raw data that can be observed?
      • Example: stream of click events, stream of impression events, user-data in a database etc.
      • What kind of data can be collected? What kind of data processing/transformation pipelines do we need to build for it? (A minimal labeling sketch follows this list.)
      • We could even add a "not relevant" button to gather explicit negative user feedback.
  • Consider the possibility of online learning (updating the model incrementally as new data arrives).
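
One way to turn such raw streams into labeled training data is to join clicks against impressions. A minimal sketch, assuming hypothetical records keyed by illustrative fields user_id and post_id; a real pipeline would also join within an attribution time window and deduplicate events:

```python
# An impression with a matching click becomes a positive example;
# an impression without one becomes a negative example.

def build_training_examples(impressions, clicks):
    clicked = {(c["user_id"], c["post_id"]) for c in clicks}
    return [
        {**imp, "label": 1 if (imp["user_id"], imp["post_id"]) in clicked else 0}
        for imp in impressions
    ]
```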

Feature Engineering

Identify the most important features for the specific task, and come up with relevant features (or feature engineering techniques) to train the model.

  • Selecting what features to use, and how to represent history of activity
    • Simple examples: time of day, day of the week, last viewed post(s), days since a post's upload, user/post language, number of previous impressions (without a click), user's past history with posts from the same channel, etc.
    • Complex examples: the average of the embeddings of all posts a user liked/clicked (sketched after this list), or a bag-of-words representation of the topics/tags a user interacted with
  • Also read section 4.1 of the paper, Deep Neural Networks for YouTube Recommendations (see References).
  • Handling missing feature values (data sparsity and feature coverage)
  • Be careful about feature leakage
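
A minimal sketch of the averaged-embedding feature above, assuming a hypothetical post_embeddings lookup (post_id to a fixed-size vector):

```python
import numpy as np

def user_embedding(engaged_post_ids, post_embeddings, dim=64):
    """Represent a user as the mean embedding of posts they engaged with."""
    vectors = [post_embeddings[p] for p in engaged_post_ids if p in post_embeddings]
    if not vectors:
        return np.zeros(dim)  # fallback for users with no history (cold start)
    return np.mean(vectors, axis=0)
```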

Modeling

Evaluate model choices and justify the final decision. Explain the training process. Anticipate risks and mitigate them.

  • Clarifying questions:
    • What is the existing system? (This could be a baseline.)
  • Simple baseline model (or performance)
    • A simple recommender system: random recommendations
  • Popular techniques:
    • Collaborative filtering uses other users' preferences to build recommendations for a user.
      • User-based filtering (possible example: Instagram's news feed) and item-based filtering (possible example: Amazon's "bought together")
      • Types:
        • Memory-based: similarity search based on user-item interaction matrix
        • Model-based: similarity search based on dimensionality-reduced interaction history
        • Hybrid and more...
      • Similarity between two users could be measured using Pearson correlation (covariance-based), cosine similarity (angular), or the Jaccard coefficient (binary IoU) on vector representations of their interaction history (see the sketch after this list)
      • Cons: cannot handle fresh items, since there is no interaction history associated with them (the cold-start problem).
    • Content-based filtering uses the description/contents of items to provide recommendations based on the user's profile and preferences.
      • Embed items and users, then use nearest-neighbor search or vector-similarity metrics to predict interest/relevance.
    • Hybrid approaches use collaborative and content-based filtering to improve recommendations.
  • Model: What to predict? What are the inputs?
    • Probability of the user liking an ad/post (a classification problem: like vs. not like)
    • Relevance of an ad/post to the user (a regression problem, e.g. predicting user ratings from 1 to 5)
  • Beyond accuracy (post-processing or re-ranking):
    • Intra-list diversity: recommend items from different sources or topics
    • Serendipity: recommend items that are outside the user's engagement history (a pleasant surprise)
    • Novelty: recommend items that are unpopular (an underrated gem)
    • Robustness: the recommender system must resist manipulation (e.g. competitors submitting negative ratings)
    • Context-awareness: time (morning vs. night, workday vs. holidays), location, type of device, etc.
    • Persistence vs. Temporal diversity: both are important in different scenarios
    • Bias: imbalances in the training data can cause biased recommendations
    • Filter bubble: intellectual isolation from opposing views makes society more polarized
  • Model selection, influenced by features, data volume, interpretability requirements, and more.
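
A minimal sketch of the three user-user similarity measures mentioned under collaborative filtering, computed on interaction-history vectors (e.g. rows of the user-item matrix); the function names are illustrative:

```python
import numpy as np

def pearson_similarity(u, v):
    """Covariance-based: correlation of mean-centered interaction vectors."""
    u_c, v_c = u - u.mean(), v - v.mean()
    denom = np.linalg.norm(u_c) * np.linalg.norm(v_c)
    return float(u_c @ v_c / denom) if denom else 0.0

def cosine_similarity(u, v):
    """Angular: cosine of the angle between the two vectors."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def jaccard_similarity(u, v):
    """Binary IoU: overlap of the item sets each user interacted with."""
    a, b = set(np.nonzero(u)[0]), set(np.nonzero(v)[0])
    return len(a & b) / len(a | b) if (a | b) else 0.0
```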

Evaluation & Deployment

Use consistent evaluation and deployment techniques. Justify and articulate the choice of metrics to track.

  • Offline evaluation
    • Examples: Recall@k, Mean Reciprocal Rank (MRR), Mean Average Precision (mAP), Normalized Discounted Cumulative Gain (nDCG)
    • Progressive evaluation: train on days 1..n, evaluate on days n+1..n+k (sketched after this list, together with Recall@k and MRR)
    • With diversity re-ranking, offline evaluation becomes difficult
  • Online evaluation
    • A/B testing: click-through rate (CTR), session-length over time (weeks), etc.
    • Detecting data drift and retraining base models
  • Debugging a failed model?
  • Continuous evaluation and deployment (daily updates?)
  • Training data recency and the problem of Online-Offline Consistency
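
A minimal sketch of the offline-evaluation pieces above: a progressive (time-based) split plus Recall@k and MRR. The record format (a "day" field, per-query ranked lists of recommended items) is an assumption for illustration:

```python
def progressive_split(interactions, n, k):
    """Train on days 1..n; evaluate on days n+1..n+k."""
    train = [x for x in interactions if x["day"] <= n]
    test = [x for x in interactions if n < x["day"] <= n + k]
    return train, test

def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of relevant items that appear in the top-k recommendations."""
    if not relevant_items:
        return 0.0
    return len(set(ranked_items[:k]) & set(relevant_items)) / len(relevant_items)

def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average over queries of 1/rank of the first relevant item (0 if none)."""
    scores = [
        next((1.0 / (i + 1) for i, item in enumerate(ranked) if item in relevant), 0.0)
        for ranked, relevant in zip(ranked_lists, relevant_sets)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```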

References

  • Covington, P., Adams, J., & Sargin, E. (2016). Deep Neural Networks for YouTube Recommendations. RecSys 2016.