Benchmark Datasets for RecSys

Here are some widely used benchmark datasets for evaluating recommendation methods.

General data

| Data | #Users | #Items | #Events | #User links | Link type | W/ Time | W/ KG |
|---|---|---|---|---|---|---|---|
| DoubanMusic | 39,742 | 164,223 | 1,792,501 | 1,908,081 | 1,908,081 | Yes | |
| DoubanMovie | 94,890 | 81,906 | 11,742,260 | 1,908,081 | 1,908,081 | Yes | |
| DoubanBook | 46,548 | 212,995 | 1,908,081 | 1,908,081 | 1,908,081 | Yes | |
| Yelp* | 1,183,362 | 156,639 | 4,736,897 | 39,846,890 | Friendship | Yes | |
| Delicious | 1,867 | 40,897 | 437,594 | 15,328 | Friendship | Yes | |
| Last.FM1 | 1,892 | 12,523 | 186,480 | 25,435 | Friendship | Yes | |
| MovieLens-1M | 71,567 | 10,681 | 10,000,054 | - | - | Yes | |
| FilmTrust | 1,508 | 2,071 | 35,497 | 1,853 | Trust | | |
| Jester | 73,421 | 100 | 4,100,000 | - | - | | |
| BookCrossing | 278,858 | 271,379 | 1,149,780 | - | - | Yes | |
| Gowalla | 107,092 | 1,280,969 | 6,442,890 | 950,327 | Friendship | Yes | |

  • Amazon-review. It contains a large corpus of product reviews collected from Amazon.

*: These datasets are no longer available online. If you want to use them, please feel free to contact Weiping (songweiping@pku.edu.cn).
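The user, item, and event counts in the table above can be combined into an interaction density, which is often used to judge how sparse a dataset is. The sketch below is illustrative only: the `density` and `summarize` helpers are hypothetical names, the figures are copied from three table rows, and it assumes every event is a distinct user-item pair, so the result should be read as an upper bound.

```python
# Figures copied from the table above: (users, items, events).
STATS = {
    "DoubanMusic": (39_742, 164_223, 1_792_501),
    "FilmTrust": (1_508, 2_071, 35_497),
    "Jester": (73_421, 100, 4_100_000),
}


def density(users, items, events):
    """Fraction of the user-item matrix covered by observed events.

    Assumes each event is a distinct (user, item) pair; repeated
    interactions would make the true density lower.
    """
    return events / (users * items)


def summarize(stats):
    """Map each dataset name to its interaction density."""
    return {name: density(*row) for name, row in stats.items()}


if __name__ == "__main__":
    # Print datasets from sparsest to densest.
    for name, d in sorted(summarize(STATS).items(), key=lambda kv: kv[1]):
        print(f"{name:12s} density = {d:.4%}")
```

Under these assumptions Jester, with only 100 items, comes out far denser than FilmTrust or DoubanMusic, which is worth keeping in mind when comparing results across datasets.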

Session-based recommendation

CTR Prediction

KG for Recommendation

  • KB4Rec. It provides linkages between movie data and Freebase.
  • MovieLens-1M. It merges MovieLens-1M with Microsoft Satori.
  • Book-Crossing. It merges Book-Crossing with Microsoft Satori.

Acknowledgement & References: