What was team BellKor's model that won the 2007 Netflix Progress Prize?
#### Obtain baseline estimates by solving the least squares problem
$$\min_{b} \sum_{ui} (r_{ui} - (\mu + b_u + b_i))^2 + \lambda (\sum_u b_u^2 + \sum_i b_i^2) $$
We can also add other baseline predictors, such as interaction terms, the number of users who rated Item $i$ (a proxy for popularity), the time it took User $u$ to rate Item $i$ (a proxy for personality), etc.
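As a concrete illustration (not BellKor's exact procedure), here is a minimal Python/NumPy sketch that fits $\mu$, $b_u$, and $b_i$ for the regularized objective above by alternating closed-form ridge updates, rather than assembling the full indicator design matrix. The `ratings` triples, the contiguous 0-based ids, and the value of `lam` are assumptions made for the example.

```python
import numpy as np

def fit_baselines(ratings, lam=10.0, n_iters=20):
    """Estimate mu, b_u, b_i by alternating regularized least-squares updates.

    ratings: list of (user_idx, item_idx, rating) triples, with 0-based
             contiguous user/item indices (an assumption for this sketch).
    lam:     L2 regularization strength (the lambda in the objective above).
    """
    users = np.array([u for u, _, _ in ratings])
    items = np.array([i for _, i, _ in ratings])
    r = np.array([x for _, _, x in ratings], dtype=float)

    n_users, n_items = users.max() + 1, items.max() + 1
    mu = r.mean()
    b_u = np.zeros(n_users)
    b_i = np.zeros(n_items)

    for _ in range(n_iters):
        # Update item biases with user biases held fixed (closed-form per item):
        # b_i = sum of residuals over item i's ratings / (lambda + count).
        resid = r - mu - b_u[users]
        for i in range(n_items):
            mask = items == i
            b_i[i] = resid[mask].sum() / (lam + mask.sum())
        # Update user biases with item biases held fixed.
        resid = r - mu - b_i[items]
        for u in range(n_users):
            mask = users == u
            b_u[u] = resid[mask].sum() / (lam + mask.sum())
    return mu, b_u, b_i
```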
#### Nearest neighbor regression using item-to-item similarities
We can compute the similarity matrix between pairs of items based on the User-Item rating matrix.
For example, we can compute the Pearson correlation between any two columns $i$ and $j$ based on the common user support (i.e. using only the rows corresponding to users who rated both items). Denote this similarity $s_{ij}$.
A $k$-nearest neighbor model could then predict an unobserved rating as
$$\widehat{r}_{ui} = b_{ui} + \frac{\sum_{j} s_{ij}(r_{uj} - b_{uj})}{\sum_j s_{ij}}$$
where $b_{ui} = \mu + b_u + b_i$ is the baseline rating of User $u$ for Item $i$, and $j$ ranges over the $k$ items most similar to Item $i$ that User $u$ has rated.
A problem arises when none of the $k$ nearest neighbors of Item $i$ have been rated by User $u$; in that case we'd like to fall back to the baseline $b_{ui}$.
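A minimal sketch of this neighborhood model, assuming a small dense rating matrix `R` with `np.nan` for missing entries and a precomputed `baseline` matrix of $b_{ui}$ values. The function names, the `min_support` cutoff, and the choice to drop negative correlations are illustrative assumptions, not BellKor's actual procedure.

```python
import numpy as np

def item_similarities(R, min_support=2):
    """Pearson correlation between item columns of R, computed on common
    user support; R uses np.nan to mark missing ratings."""
    n_items = R.shape[1]
    S = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(i + 1, n_items):
            both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
            if both.sum() >= min_support:
                c = np.corrcoef(R[both, i], R[both, j])[0, 1]
                if not np.isnan(c):
                    S[i, j] = S[j, i] = c
    return S

def predict(u, i, R, S, baseline, k=20):
    """r_hat_ui = b_ui + sum_j s_ij (r_uj - b_uj) / sum_j s_ij,
    summing over the k items most similar to i that user u has rated."""
    rated = np.flatnonzero(~np.isnan(R[u]))
    rated = rated[rated != i]
    if rated.size == 0:
        return baseline[u, i]                      # nothing rated: baseline
    # keep the k items most similar to i among those u has rated
    neighbors = rated[np.argsort(-S[i, rated])[:k]]
    weights = np.clip(S[i, neighbors], 0, None)    # ignore negative correlations
    if weights.sum() == 0:
        return baseline[u, i]                      # no informative neighbors
    dev = R[u, neighbors] - baseline[u, neighbors]
    return baseline[u, i] + weights @ dev / weights.sum()
```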
We could also have used user-to-user similarities, but Sarwar et al. found item similarities to be more efficient: there are generally fewer items than users, which means fewer precomputed similarities and faster retrieval. Item similarities are also preferable from a practical perspective: telling users that recommendations are based on similar items helps them tune their own recommendations, since they have a better sense of item-to-item similarity than of their similarity to other users, and this knowledge can guide how they rate items.
In short, item-based collaborative filtering is preferred to user-based collaborative filtering for both efficiency and interpretability.
#### Sources
See https://pdfs.semanticscholar.org/2285/c262bc459b5ad96b2e16ccb755a44dc5f918.pdf
In this paper, they estimated user and item effects as a preprocessing step and then fit the collaborative filtering methods on the residuals.
$$e_{ui} = r_{ui} - (\mu + b_u + b_i)$$
I suspect they used indicator variables and least-squares estimation for this step. If we had features representing users or items, this would be the place to use them; that would probably be better than fixed indicator effects, since indicator estimates aren't very reliable for users or items that appear infrequently.
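Continuing the hypothetical baseline sketch above, that preprocessing step would look roughly like:

```python
# Assuming fit_baselines and the (user, item, rating) triples from the earlier sketch.
mu, b_u, b_i = fit_baselines(ratings, lam=10.0)
residuals = [(u, i, r - (mu + b_u[u] + b_i[i])) for u, i, r in ratings]
# A collaborative filtering method (e.g. the kNN model above) is then fit on these residuals.
```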
[Bell, Koren, and Volinsky (2007) paper describing their first attempt at the Netflix recommender problem using baseline effects and kNN regression](https://pdfs.semanticscholar.org/15e7/b5c1b43f7a8f097c752c071389ddf5173458.pdf).
[Bell and Koren (2007) paper describing how to derive optimal weights for kNN regression](https://pdfs.semanticscholar.org/65f7/8de6184ebf2dac934909e1112e002b79aa56.pdf), as opposed to using the inverse of some arbitrary distance metric.
[Koren (2008) paper describing a different form of the SVD](http://www.cs.rochester.edu/twiki/pub/Main/HarpSeminar/Factorization_Meets_the_Neighborhood-_a_Multifaceted_Collaborative_Filtering_Model.pdf).
Time effects: http://sydney.edu.au/engineering/it/~josiah/lemma/kdd-fp074-koren.pdf