MMVED: Multimodal Variational Encoder Decoder Framework for Micro Video Popularity Prediction

Note: Please refer to [here] for the latest update of MMVED from me and my colleague. The new work creates a hierarchical and multimodal version of the deep variational information bottleneck, and it has been accepted by IEEE TMM.

This is our implementation of MMVED for micro-video popularity prediction associated with:

A multimodal variational encoder decoder framework for micro video popularity prediction,
Xie, Jiayi and Zhu, Yaochen, and others
Accepted as a conference paper in WWW 2020.

Predicting the Popularity of Micro-videos with Multimodal Variational Encoder-Decoder Framework,
Yaochen Zhu, Jiayi Xie, Zhenzhong Chen
arXiv:2003.12724

It includes two parts:

Micro-video popularity regression on NUS dataset.
Micro-video temporal popularity prediction on Xigua dataset.

Each part contains everything required to train or test the corresponding MMVED model.

For the Xigua dataset, which we collected ourselves, we release the data as well.

Architecture

Environment

python == 3.6.5
numpy == 1.16.1
tensorflow == 1.13.1
tensorflow-probability == 0.6.0
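
The pinned versions above can be checked before training. The following is a minimal sketch, not part of the released code: the helper names `module_version`, `find_mismatches` and `python_version` are hypothetical, and the check assumes each package exposes a `__version__` attribute.

```python
import importlib
import sys

# Versions pinned in the list above. Module names use underscores because
# they are import names (tensorflow-probability -> tensorflow_probability).
PINNED_PYTHON = "3.6.5"
PINNED_MODULES = {
    "numpy": "1.16.1",
    "tensorflow": "1.13.1",
    "tensorflow_probability": "0.6.0",
}


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=module_version):
    """Map each module whose version differs from its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def python_version():
    """Return the running interpreter version as 'major.minor.micro'."""
    return "{}.{}.{}".format(*sys.version_info[:3])
```

Calling `find_mismatches(PINNED_MODULES)` in the target environment returns an empty dict when all three packages match, and comparing `python_version()` with `PINNED_PYTHON` covers the interpreter itself.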

Datasets

The Xigua dataset

The Xigua micro-video temporal popularity prediction dataset we collected is available at [google drive] and [baidu] (pin: zpwb). To use it, download and unzip the data folder, then put it in the xigua directory. The files are described as follows:

resnet50.npy: (N×128). Visual features extracted by ResNet50 pre-trained on ImageNet.
audiovgg.npy: (N×128). Aural features extracted by AudioVGG pre-trained on AudioSet.
fudannlp.npy: (N×20). Textual features extracted by the FudanNLP toolkit.
social.npy: (N×3). Social features crawled from the user attributes.
len_9/target.npy: (N×9×2). Popularity groundtruth (0-axis) and absolute time (1-axis) at each timestep.
split/0-4/{train, val, test}.txt: Five splits of train, val and test samples used in our paper.
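
The file layout above can be loaded and sanity-checked with NumPy. This is a minimal sketch under stated assumptions, not the loader used in the released code: `load_xigua` and `data_dir` are hypothetical names, and the sketch assumes that the last axis of `target.npy` holds the popularity ground truth at index 0 and the absolute time at index 1. The split files are not read here because their format is not described above.

```python
import os

import numpy as np

# File names and feature widths as listed in the description above.
FEATURE_FILES = {
    "visual": ("resnet50.npy", 128),
    "aural": ("audiovgg.npy", 128),
    "textual": ("fudannlp.npy", 20),
    "social": ("social.npy", 3),
}


def load_xigua(data_dir):
    """Load the four modality features and the len_9 target from data_dir."""
    features = {}
    for name, (fname, width) in FEATURE_FILES.items():
        arr = np.load(os.path.join(data_dir, fname))
        if arr.ndim != 2 or arr.shape[1] != width:
            raise ValueError("{}: expected (N, {}), got {}".format(fname, width, arr.shape))
        features[name] = arr

    target = np.load(os.path.join(data_dir, "len_9", "target.npy"))
    if target.ndim != 3 or target.shape[1:] != (9, 2):
        raise ValueError("target.npy: expected (N, 9, 2), got {}".format(target.shape))

    # Every feature matrix should describe the same N micro-videos.
    n_samples = target.shape[0]
    for name, arr in features.items():
        if arr.shape[0] != n_samples:
            raise ValueError("{}: {} rows, but target has {}".format(name, arr.shape[0], n_samples))

    # Assumption: last axis is [popularity, absolute time] per timestep.
    popularity = target[..., 0]
    abs_time = target[..., 1]
    return features, popularity, abs_time
```

With the data unzipped, `features, popularity, abs_time = load_xigua(path_to_data_folder)` yields a dict of four `(N, d)` arrays and two `(N, 9)` arrays.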

The NUS dataset

The original NUS dataset, released with the TMALL model in this paper, can be found here. The files in the data folder of the NUS directory are described as follows:

vid.txt: The ids of the micro-videos that we were able to download successfully at the time of our experiment.
split/0-4/{train, val, test}.txt: Five splits of the dataset we used in our paper.

Examples to run the Codes

The basic usage of the code for training and testing the MMVED model on both the Xigua and NUS datasets is as follows:

For training:

python train.py --lambd [LAMBDA] --split [SPLIT]

For testing:

python predict.py --model_path [PATH_TO_MODEL] --test_mods [VATS]

For more advanced arguments, run the code with the --help argument.
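
To run the commands above over all five splits, the argument lists can be generated programmatically. This is a minimal sketch: `build_train_commands` and `build_predict_command` are hypothetical helpers, the lambda value in the usage note is a placeholder rather than a recommended setting, and the sketch assumes that `--split` takes the split index 0 to 4 and that `VATS` selects all four modalities (visual, aural, textual, social).

```python
def build_train_commands(lambd, splits=range(5), script="train.py"):
    """Return one argv list per split, using only the flags shown above."""
    return [
        ["python", script, "--lambd", str(lambd), "--split", str(split)]
        for split in splits
    ]


def build_predict_command(model_path, test_mods="VATS", script="predict.py"):
    """Return the argv list for testing a trained model on the given modalities."""
    return ["python", script, "--model_path", model_path, "--test_mods", test_mods]
```

Each list can be passed to `subprocess.run(cmd, check=True)` from inside the directory of the corresponding part, for example `for cmd in build_train_commands(0.5): subprocess.run(cmd, check=True)`.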

If you find our code and dataset helpful, please kindly cite the following papers. Thanks!

Full-fledged version: Here; WWW 2020 paper: Here

@article{mmved-fullfledged,
  title={Predicting the Popularity of Micro-videos with Multimodal Variational Encoder-Decoder Framework},
  author={Zhu, Yaochen and Xie, Jiayi and Chen, Zhenzhong},
  journal={arXiv preprint arXiv:2003.12724},
  year={2020},
}

@inproceedings{mmved-www2020-preliminary,
  title={A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction},
  author={Xie, Jiayi and Zhu, Yaochen and Zhang, Zhibin and Peng, Jian and Yi, Jing and Hu, Yaosi and Liu, Hongyi and Chen, Zhenzhong},
  booktitle={The World Wide Web Conference},
  year={2020},
  pages={2542--2548},
}
