VLSP2020: Fake News Detection

Fine-tune a variety of pre-trained Transformer-based models to solve the Vietnamese Reliable Intelligence Identification (ReINTEL) problem in the VLSP2020 shared task.

About The Project

In this project, we leverage several pre-trained language models (vELECTRA, vBERT, PhoBERT, multilingual BERT cased, and XLM-RoBERTa) to identify reliable information shared on social network sites.
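The backbones above are typically loaded through the Hugging Face transformers library. A minimal sketch of the checkpoint mapping follows; the PhoBERT, multilingual BERT, and XLM-RoBERTa hub ids are the standard public ones, while the vELECTRA and vBERT ids are placeholders, since the exact checkpoints used here are not stated in this README.

```python
# Candidate checkpoints for the five backbones mentioned above.
# PhoBERT / mBERT / XLM-R ids are the standard public Hub ids;
# the vELECTRA and vBERT entries are PLACEHOLDERS, not confirmed ids.
CHECKPOINTS = {
    "PhoBERT": "vinai/phobert-base",
    "mBERT": "bert-base-multilingual-cased",
    "XLM-R": "xlm-roberta-base",
    "vELECTRA": "<velectra-checkpoint-id>",  # placeholder
    "vBERT": "<vbert-checkpoint-id>",        # placeholder
}

# Typical loading pattern with Hugging Face transformers (not executed here):
# from transformers import AutoTokenizer, AutoModelForSequenceClassification
# tokenizer = AutoTokenizer.from_pretrained(CHECKPOINTS["PhoBERT"])
# model = AutoModelForSequenceClassification.from_pretrained(
#     CHECKPOINTS["PhoBERT"], num_labels=2)  # reliable vs. unreliable
```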

We also evaluate different input lengths: 256 tokens, 512 tokens, and multiple 512-token segments for long documents.
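For the "multiple 512" setting, a long post must be split into windows that fit the model's maximum input length. The helper below is a hypothetical illustration (not the authors' exact code) of chunking a tokenized sequence into overlapping 512-token windows whose per-window predictions can then be aggregated, e.g. by averaging.

```python
def chunk_input_ids(input_ids, max_len=512, stride=256):
    """Split a long token-id sequence into overlapping windows of max_len.

    Hypothetical helper for the long-document setting: each window is
    encoded separately and the per-window predictions are aggregated.
    """
    if len(input_ids) <= max_len:
        return [list(input_ids)]
    chunks, start = [], 0
    while start < len(input_ids):
        chunks.append(list(input_ids[start:start + max_len]))
        if start + max_len >= len(input_ids):
            break  # last window reaches the end of the sequence
        start += stride
    return chunks

# Example: a 1100-token document yields windows starting at 0, 256, 512, 768.
windows = chunk_input_ids(list(range(1100)))
```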

Prerequisites

To reproduce our experiments, install the dependencies listed in requirements.txt:

  • huggingface transformers
  • emoji
  • vncorenlp
  • nltk
  • pytorch
  • python3
pip install -r requirements.txt
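The emoji and vncorenlp packages above suggest a text-preprocessing step (emoji handling and Vietnamese word segmentation) before tokenization. As a rough, stdlib-only sketch of the kind of cleanup social-media posts need (a hypothetical illustration, not the repository's actual pipeline):

```python
import re

def clean_post(text):
    """Minimal, hypothetical cleanup for social-network posts.

    The real pipeline presumably uses the `emoji` package and VnCoreNLP
    for segmentation; this sketch only shows URL masking and whitespace
    normalization.
    """
    text = re.sub(r"https?://\S+", "<url>", text)  # mask links
    text = re.sub(r"\s+", " ", text).strip()       # collapse whitespace
    return text

print(clean_post("Tin nóng!!   Xem tại https://example.com/bai-viet  "))
# -> "Tin nóng!! Xem tại <url>"
```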

Data

The dataset is provided by the VLSP2020 organizers. Please access this site for more information.

Contact

Hieu Tran - heraclex12@gmail.com

Project Link: https://github.com/heraclex12/VLSP2020-Fake-News-Detection

Citation

@misc{tran2020leveraging,
      title={Leveraging Transfer Learning for Reliable Intelligence Identification on Vietnamese SNSs (ReINTEL)}, 
      author={Trung-Hieu Tran and Long Phan and Truong-Son Nguyen},
      year={2020},
      eprint={2012.07557},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgements
