English|简体中文
Reminder: The ERNIE-Gram model has been officially released here, and our reproduction code is available on the repro branch.
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding
Since ERNIE 1.0, Baidu researchers have incorporated knowledge-enhanced representation learning into pre-training, masking consecutive words, phrases, named entities, and other semantic knowledge units so that models learn from coarser-grained signals. Building on this, we propose ERNIE-Gram, an explicitly n-gram masked language model that strengthens the integration of coarse-grained information during pre-training. In ERNIE-Gram, n-grams are masked and predicted directly using explicit n-gram identities rather than contiguous sequences of tokens.
In downstream tasks, ERNIE-Gram uses a BERT-style fine-tuning approach and therefore maintains the same parameter size and computational complexity.
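To make the masking scheme concrete, here is a toy sketch, not the authors' implementation: the n-gram table, mask token, and masking probability below are made up for illustration. It contrasts explicit n-gram masking with token-level masking: a whole n-gram collapses to a single mask position whose prediction target is the n-gram's identity.

```python
import random

# Toy table of n-gram identities (hypothetical; the real model uses an
# n-gram lexicon built during pre-training).
NGRAM_VOCAB = {("new", "york"): 0, ("machine", "learning"): 1}
MASK = "[MASK]"

def ngram_mask(tokens, mask_prob=0.15):
    """Replace whole n-grams with a single [MASK] slot; the target is the
    n-gram identity, not a sequence of per-token targets."""
    out, targets, i = [], [], 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if bigram in NGRAM_VOCAB and random.random() < mask_prob:
            out.append(MASK)                      # one mask position ...
            targets.append(NGRAM_VOCAB[bigram])   # ... one n-gram target
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out, targets

print(ngram_mask("i love new york in spring".split(), mask_prob=1.0))
# (['i', 'love', '[MASK]', 'in', 'spring'], [0])
```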
We pre-train ERNIE-Gram on English and Chinese text corpora and fine-tune it on 19 downstream tasks. Experimental results show that ERNIE-Gram outperforms previous pre-training models such as XLNet and RoBERTa by a large margin, and achieves results comparable to state-of-the-art methods.
The ERNIE-Gram paper has been accepted by NAACL-HLT 2021; for more details, please see here.
```bash
mkdir -p data
cd data
wget https://ernie-github.cdn.bcebos.com/data-xnli.tar.gz
tar xf data-xnli.tar.gz
cd ..

# demo for the NLI task
sh run_cls.sh task_configs/xnli_conf
```
This repo requires PaddlePaddle 2.0.0+; please see here for installation instructions.
```bash
git clone https://github.com/PaddlePaddle/ERNIE.git --depth 1
cd ERNIE
pip install -r requirements.txt
pip install -e .
```
Model | Description | Abbreviation |
---|---|---|
ERNIE-Gram Base for Chinese | Layers: 12, Hidden: 768, Heads: 12 | `ernie-gram` |
ERNIE-Gram Base for English | Layers: 12, Hidden: 768, Heads: 12 | `ernie-gram-en` |
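The abbreviations above are the model names passed to `from_pretrained`. A minimal inference sketch, assuming the repo's dygraph `ErnieModel`/`ErnieTokenizer` API (pretrained weights are downloaded on first use):

```python
import numpy as np
import paddle
from ernie.modeling_ernie import ErnieModel
from ernie.tokenizing_ernie import ErnieTokenizer

# `ernie-gram` is the abbreviation from the table above.
model = ErnieModel.from_pretrained('ernie-gram')
model.eval()
tokenizer = ErnieTokenizer.from_pretrained('ernie-gram')

ids, _ = tokenizer.encode('hello world')
ids = paddle.to_tensor(np.expand_dims(ids, 0))  # add a batch dimension
pooled, encoded = model(ids)                    # eager (dygraph) forward pass
print(pooled.numpy().shape)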
English Datasets

Download the GLUE datasets by running this script. The `--data_dir` option in the following section assumes a directory tree like this:
```
data/xnli
├── dev
│   └── 1
├── test
│   └── 1
└── train
    └── 1
```
See the demo data for the MNLI task.
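A small sketch to sanity-check that `--data_dir` points at a tree like the one above (the shard name `1` follows the demo data; adjust for your own shards):

```python
from pathlib import Path

# Expected layout: <data_dir>/{train,dev,test}/<shard files>
data_dir = Path("data/xnli")
for split in ("train", "dev", "test"):
    shards = sorted(p for p in (data_dir / split).iterdir() if p.is_file())
    assert shards, f"no data shards found under {data_dir / split}"
    print(f"{split}: {[p.name for p in shards]}")
```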
Try eager execution with dygraph models to fine-tune the following tasks (a condensed fine-tuning sketch follows this list):
- Natural Language Inference
- Sentiment Analysis
- Semantic Similarity
- Named Entity Recognition (NER)
- Machine Reading Comprehension
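The classification demos all follow the usual dygraph fine-tuning loop. Below is a condensed sketch for an NLI-style task, assuming the repo exposes `ErnieModelForSequenceClassification` with an `(ids, sids, labels=...)` calling convention that returns the loss when labels are given; real training reads the TSV shards under `data/xnli` and adds batching, evaluation, and LR scheduling.

```python
import numpy as np
import paddle
from ernie.tokenizing_ernie import ErnieTokenizer
from ernie.modeling_ernie import ErnieModelForSequenceClassification

# 3 labels for NLI: entailment / contradiction / neutral
model = ErnieModelForSequenceClassification.from_pretrained('ernie-gram', num_labels=3)
tokenizer = ErnieTokenizer.from_pretrained('ernie-gram')
opt = paddle.optimizer.AdamW(learning_rate=2e-5, parameters=model.parameters())

# Toy batch; real training iterates over the XNLI shards.
pairs = [("a man is eating", "a person eats", 0)]
for premise, hypothesis, label in pairs:
    ids, sids = tokenizer.encode(premise, hypothesis)  # sentence-pair encoding
    ids = paddle.to_tensor(np.expand_dims(ids, 0))
    sids = paddle.to_tensor(np.expand_dims(sids, 0))
    labels = paddle.to_tensor([label])
    loss, logits = model(ids, sids, labels=labels)     # assumed calling convention
    loss.backward()
    opt.step()
    opt.clear_grad()
    print(float(loss))
```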
Recommended hyperparameters:
- See ERNIE-Gram paper Appendix B.1-4
For full reproduction of the paper's results, please check out the repro branch of this repo; the code lives at ernie-gram.
```bibtex
@article{xiao2020ernie,
  title={ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding},
  author={Xiao, Dongling and Li, Yu-Kun and Zhang, Han and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},
  journal={arXiv preprint arXiv:2010.12148},
  year={2020}
}
```
- ERNIE homepage
- Github Issues: bug reports, feature requests, install issues, usage issues, etc.
- QQ discussion group: 760439550 (ERNIE discussion group).
- QQ discussion group: 958422639 (ERNIE discussion group-v2).
- Forums: discuss implementations, research, etc.