Implementations of Mainstream Ranking Algorithms for Recommender Systems

Project Overview

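This project implements the ranking algorithms most commonly used in recommender systems and evaluates them on a public dataset. Every algorithm runs end to end and has completed a full training run, producing a saved_model and a checkpoint for deployment with tf-serving.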

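The dataset comes from the WeChat Channels (微信视频号) recommendation algorithm competition; see ./dataset/README.md for details.
To stay close to industrial practice, the code uses the TensorFlow Estimator framework and stores data in TFRecord format.
The algorithms live under ./algorithm, one folder per algorithm, named with the commonly accepted uppercase algorithm name. The training entry point is the .py file with the corresponding lowercase name inside that folder; for example, din.py in the DIN folder is the entry point for training the DIN model. See the Example section at the end for details.
Each algorithm implements its own model_fn. The Keras high-level API is not used; static graphs are built only with TensorFlow's mid- and low-level APIs.
Hyperparameters can be passed to the training entry script as --parameter_name=parameter_value; they are defined in the tf.app.flags section of each entry script.
Single-task models are evaluated on the read_comment label of the dataset; multi-task models are evaluated on three tasks: read_comment, like, and click_avatar.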
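To illustrate the --parameter_name=parameter_value convention described above, here is a minimal sketch that parses such flags with the standard-library argparse module. It is a stand-in for illustration only: the project's entry scripts define their flags with tf.app.flags, and apart from use_softmax (taken from the Example section) every flag name and default below is a hypothetical assumption.

```python
import argparse


def str_to_bool(text):
    """Parse 'True'/'False' style strings.

    argparse's type=bool would treat any non-empty string as True,
    so an explicit parser is needed for --flag=False to work.
    """
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % text)


def build_parser():
    """Build a parser for a hypothetical training entry script."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    # use_softmax appears in the README example; its default here is an assumption.
    parser.add_argument("--use_softmax", type=str_to_bool, default=False)
    # learning_rate and batch_size are hypothetical flags, not taken from the project.
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--batch_size", type=int, default=512)
    return parser


def parse_hparams(argv):
    """Return the parsed hyperparameters as a plain dict."""
    return vars(build_parser().parse_args(argv))


if __name__ == "__main__":
    print(parse_hparams(["--use_softmax=True", "--learning_rate=0.01"]))
```

Flags that are not passed keep their defaults, so a run only needs to name the hyperparameters it wants to override.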

Single-Task Models

Model	Paper	*Best_read_comment_Auc
FFM	[2016] Field-aware Factorization Machines for CTR Prediction	0.8911285
DeepCrossing	[2016] Deep Crossing - Web-Scale Modeling without Manually Crafted Combinatorial Features	0.9185908
PNN	[2016] Product-based neural networks for user response prediction	0.9065931
Wide & Deep	[2016] Wide & Deep Learning for Recommender Systems	0.9133482
DeepFM	[2017] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction	0.8529998
DCN	[2017] Deep & Cross Network for Ad Click Predictions	0.9183242
AFM	[2017] Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks	0.9117872
xDeepFM	[2018] xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems	0.9152467
FwFM	[2018] Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising	0.9118794
DIN	[2018] Deep Interest Network for Click-Through Rate Prediction	0.9116896
DIEN	[2018] Deep Interest Evolution Network for Click-Through Rate Prediction	-
FiBiNet	[2019] FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction	0.9149044
BST	[2019] Behavior sequence transformer for e-commerce recommendation in Alibaba	0.9165866

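*Best_read_comment_Auc is the highest test-set AUC each model reached after its own hyperparameter tuning; per-model evaluation results are in result.md under each model's directory.
*DIEN is not suited to the WeChat Channels dataset, so only its static graph is implemented and it was not evaluated.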
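For readers who want to sanity-check an AUC value like those in the table, the metric can be sketched in pure Python using the rank-based (Mann-Whitney) formulation. This is only an illustration of the metric; it is an assumption that it agrees with the project's own evaluation code, which is not shown here, and the function name auc_score is hypothetical.

```python
def auc_score(labels, scores):
    """Rank-based AUC: the probability that a random positive example
    is scored above a random negative one (ties count as one half)."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, averaging the rank within groups of tied scores.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    # Mann-Whitney U statistic for the positive class, normalized to [0, 1].
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy labels and scores, made up for illustration.
    print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```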

Multi-Task Models

Model	Paper	*Best_read_comment_AUC	*Best_like_AUC	*Best_click_avatar_AUC
ESMM	[2018] Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate	-	-	-
MMOE	[2018] Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts	0.91860557	0.8126400	0.8139362
PLE	[2020] Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations	0.91965175	0.8136461	0.8154559

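*Best_xx_AUC is the highest value across all hyperparameter combinations; the three AUCs in a row may not come from the same combination.
*Because of ESMM's particular structure, it is not suited to the WeChat Channels dataset, so only its static graph is implemented and it was not evaluated.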
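The note that the three best AUCs may come from different hyperparameter combinations can be made concrete with a small sketch. It takes the per-task maximum independently across runs. The run records, the num_experts hyperparameter, and all AUC numbers below are made-up placeholders for illustration, not results from this project.

```python
def best_auc_per_task(runs):
    """Return {task: (best_auc, hparams)} where each task's best value
    is picked independently across all runs."""
    best = {}
    for run in runs:
        for task, auc in run["auc"].items():
            if task not in best or auc > best[task][0]:
                best[task] = (auc, run["hparams"])
    return best


if __name__ == "__main__":
    # Placeholder runs: hyperparameter names and AUC values are hypothetical.
    runs = [
        {"hparams": {"num_experts": 4},
         "auc": {"read_comment": 0.90, "like": 0.81, "click_avatar": 0.80}},
        {"hparams": {"num_experts": 8},
         "auc": {"read_comment": 0.91, "like": 0.80, "click_avatar": 0.79}},
    ]
    for task, (auc, hparams) in best_auc_per_task(runs).items():
        print(task, auc, hparams)
```

In this toy input the best read_comment value comes from one run while the best like and click_avatar values come from the other, which is the situation the footnote describes.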

Example

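# First run the following commands to make sure the tfrecord files have been generated
# cd ./dataset/wechat_algo_data1
# python DataGenerator.py && cd ..
cd ./DIN
# Parameters can be customized at training time
python din.py --use_softmax=True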

To Do List

Add multi-task learning tricks: Uncertainty, GradNorm, PCGrad, etc.
Add AutoInt, FLEN, etc.
Refactor the feature engineering part, including config-driven inputs; see https://github.com/Shicoder/Deep_Rec for reference

Issues are welcome, or feel free to reach out directly

tangxyw/RecAlgorithm

Folders and files

Latest commit

History

Repository files navigation

主流推荐系统Rank算法的实现

项目简介

单任务Models列表

多任务Models列表

示例

To Do List

欢迎提issue，或直接勾搭

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages