Pretrained language models and their related optimization techniques, developed by Huawei Noah's Ark Lab.

leoeaton/Pretrained-Language-Model

 
 

Pretrained Language Model

This repository provides the latest pretrained language models and their related optimization techniques, developed by Huawei Noah's Ark Lab.

Directory structure

  • NEZHA-TensorFlow is a pretrained Chinese language model, implemented in TensorFlow, that achieves state-of-the-art performance on several Chinese NLP tasks.
  • NEZHA-PyTorch is the PyTorch version of NEZHA.
  • NEZHA-Gen-TensorFlow provides two GPT models: Yuefu (乐府), a Chinese classical poetry generation model, and a general-purpose Chinese GPT model.
  • TinyBERT is a compressed BERT model that is 7.5x smaller and 9.4x faster at inference.
  • TinyBERT-MindSpore is a MindSpore version of TinyBERT.
  • DynaBERT is a dynamic BERT model with adaptive width and depth.
  • BBPE provides a byte-level vocabulary building tool and its corresponding tokenizer.
  • PMLM is an improved method for pretrained language models. Trained without the complex two-stream self-attention, PMLM can be treated as a simple approximation of XLNet.
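
To illustrate what "byte-level" means for the BBPE entry, here is a minimal sketch of a single byte-pair merge step over UTF-8 bytes. It is not the code in the BBPE directory: the function names (`to_byte_tokens`, `most_frequent_pair`, `merge_pair`), the toy corpus, and the choice of 256 as the first merged token id are all assumptions made for illustration.

```python
from collections import Counter


def to_byte_tokens(text):
    """Encode text as UTF-8 and return one integer token (0-255) per byte."""
    return list(text.encode("utf-8"))


def most_frequent_pair(sequences):
    """Count adjacent token pairs across all sequences; return the most common."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(seq, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` in `seq` with `new_id`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


# Hypothetical toy corpus; each Chinese character here is three UTF-8 bytes.
corpus = ["乐府", "乐府诗"]
sequences = [to_byte_tokens(t) for t in corpus]

# One merge step: the most frequent byte pair becomes a new token (id 256 is
# an assumed convention: the first id after the 256 raw byte values).
pair = most_frequent_pair(sequences)
merged = [merge_pair(s, pair, 256) for s in sequences]
```

Because the base vocabulary is the 256 possible byte values, any input string can be tokenized without an unknown-token fallback; repeating the merge step grows the vocabulary with longer byte sequences.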

