# Vanilla NER

## Check Our New NER Toolkit 🚀🚀🚀

- Inference:
  - LightNER: efficient inference w. models pre-trained or trained w. any of the following tools.
- Training:
  - LD-Net: train NER models w. efficient contextualized representations.
  - VanillaNER: train vanilla NER models w. pre-trained embeddings.
- Distant Training:
  - AutoNER: train NER models w.o. line-by-line annotations and get competitive performance.

This project is derived from LD-Net and provides a vanilla Char-LSTM-CRF model for Named Entity Recognition (LD-Net w.o. contextualized representations).

We are in an early-release beta, so expect some adventures and rough edges. LD-Net is a more mature project; please refer to it for detailed documentation and demo scripts:

https://github.com/LiyuanLucasLiu/LD-Net

## Training

### Dependency

Our package is based on Python 3.6 and the following packages:

```
numpy
tqdm
torch-scope>=0.5.0
torch==0.4.1
```
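
The pinned dependencies above can be installed with pip; a minimal one-liner (assuming a Python 3.6 environment is already active):

```
pip install numpy tqdm "torch-scope>=0.5.0" torch==0.4.1
```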

### Command

Please first generate the word dictionary by:

```
python pre_seq/gene_map.py -h
```

Then encode the data with the generated dictionary by:

```
python pre_seq/encode_data.py -h
```

Then train the model:

```
python train_seq.py -h
```
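
For orientation, here is a hypothetical end-to-end run chaining the three steps. The flag names and file paths below are illustrative placeholders, not the scripts' real interface; run each command with `-h` to see the actual arguments:

```
# Hypothetical pipeline sketch; flag names and paths are placeholders.
# 1. Build the word dictionary from the training corpus.
python pre_seq/gene_map.py --train_file data/train.txt --output_map maps/word_map.pk
# 2. Encode the corpus with the dictionary.
python pre_seq/encode_data.py --input_map maps/word_map.pk --input_file data/train.txt --output_file data/train.pk
# 3. Train the Char-LSTM-CRF model on the encoded data.
python train_seq.py --train_data data/train.pk --checkpoint_dir checkpoints/
```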

## Inference

Models trained with this package can be used for inference with the LightNER package.
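
Decoding with LightNER might look roughly like the sketch below; the command name and flags here are assumptions, so consult the LightNER repository (https://github.com/LiyuanLucasLiu/LightNER) for its actual interface:

```
# Assumed LightNER invocation; verify against the LightNER docs.
lightner decode --model checkpoints/ner_model.th --input input.txt --output output.txt
```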

## Citation

If you find the implementation useful, please cite the following paper: Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling

```
@inproceedings{liu2018efficient,
  title = "{Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling}",
  author = {Liu, Liyuan and Ren, Xiang and Shang, Jingbo and Peng, Jian and Han, Jiawei},
  booktitle = {EMNLP},
  year = 2018,
}
```
