This repo contains code and pre-trained models for our paper
Segatron: Segment-aware Transformer for Language Modeling and Understanding
He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, Ming Li
AAAI 2021
To use this repo, please install NVIDIA APEX. We recommend using this Docker image or building your own environment with NGC's PyTorch container `nvcr.io/nvidia/pytorch:20.03-py3`.
We have uploaded the following checkpoints to the Hugging Face model hub:
- bert-base-500k
- segabert-base-500k
- segabert-large
- sentence-segabert-large
- segatron-xl-base
- segatron-xl-large
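
These checkpoints can be loaded with the modified `transformers` package shipped in this repo (not the upstream library). Below is a minimal loading sketch; the model ID is a hypothetical placeholder (check the actual IDs on the model hub), and the standard `from_pretrained` API is assumed to work unchanged in our fork.

```python
# Minimal loading sketch. Assumes the modified `transformers` package from
# this repo is installed, and that the checkpoints are published on the
# Hugging Face model hub. The model ID below is a hypothetical placeholder;
# replace it with the actual checkpoint ID.
from transformers import BertModel, BertTokenizer

MODEL_ID = "segabert-large"  # hypothetical placeholder ID

tokenizer = BertTokenizer.from_pretrained(MODEL_ID)
model = BertModel.from_pretrained(MODEL_ID)
print(model.config)
```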
- The source code for pre-training is in the `segabert` folder, which is based on Megatron-LM's repository. Please refer to `segabert/README.md` for details.
- The source code for language modeling is in the `segatron-xl` folder, which is based on Transformer-XL's repository. Please refer to `segatron-xl/README.md` for details.
- The source code for fine-tuning is in the `transformers` folder, which is based on Hugging Face's Transformers repository. Note that Segatron requires a paragraph position index, a sentence position index, and a token position index in its input features (see the sketch after this list). We therefore changed the input feature extraction and model forward functions of Transformers, so our code is not compatible with the upstream Hugging Face Transformers library. Please refer to `transformers/README.md` for details.
- The source code for sentence embedding is in the `sentence-transformers` folder, which is based on the Sentence-Transformers repository. Please refer to `sentence-transformers/README.md` for details.
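
To illustrate what the extra position features look like, here is a minimal sketch of building the three indices from a document that has already been split into paragraphs, sentences, and tokens. It assumes (per our reading of the paper) that the token index resets at each sentence boundary and the sentence index resets at each paragraph boundary; the function name is hypothetical, and the real feature extraction code lives in the `transformers` folder.

```python
# Hypothetical sketch of segment-aware position indices. For each token we emit:
#   paragraph index: which paragraph the token belongs to,
#   sentence index:  position of its sentence, reset at each new paragraph,
#   token index:     position within its sentence, reset at each new sentence.
from typing import List, Tuple


def segment_position_indices(
    paragraphs: List[List[List[str]]],  # paragraphs -> sentences -> tokens
) -> Tuple[List[int], List[int], List[int]]:
    para_ids, sent_ids, tok_ids = [], [], []
    for p, paragraph in enumerate(paragraphs):
        for s, sentence in enumerate(paragraph):
            for t, _token in enumerate(sentence):
                para_ids.append(p)
                sent_ids.append(s)
                tok_ids.append(t)
    return para_ids, sent_ids, tok_ids


# Example: two paragraphs, already split into sentences and tokens.
doc = [
    [["Segatron", "is", "segment-aware", "."], ["It", "helps", "."]],
    [["New", "paragraph", "here", "."]],
]
print(segment_position_indices(doc))
# paragraph ids: [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# sentence  ids: [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
# token     ids: [0, 1, 2, 3, 0, 1, 2, 0, 1, 2, 3]
```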
Please cite the AAAI 2021 paper:
@inproceedings{bai2021segatron,
  title={Segatron: Segment-Aware Transformer for Language Modeling and Understanding},
  author={Bai, He and Shi, Peng and Lin, Jimmy and Xie, Yuqing and Tan, Luchen and Xiong, Kun and Gao, Wen and Li, Ming},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={14},
  pages={12526--12534},
  year={2021}
}