# Models for Sequence Labeling
## Requirements

```txt
allennlp == 0.7.0
python == 3.7.0
pytorch == 0.4.1
```
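A minimal environment setup, assuming pip and the PyPI package names (the repo itself doesn't prescribe an installer):

```sh
$ pip install torch==0.4.1 allennlp==0.7.0
```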
## Usage

```sh
$ git clone https://github.com/zysite/tagger.git
$ cd tagger
# e.g., CHAR+BiLSTM+CRF on the NER task
$ python run.py --model=char_lstm_crf --task=ner
```
All options are listed in the help message:

```sh
$ python run.py -h
usage: run.py [-h] [--model {char_lstm_crf,elmo_lstm_crf}]
              [--task {chunking,ner,pos}] [--drop DROP]
              [--batch_size BATCH_SIZE] [--epochs EPOCHS]
              [--patience PATIENCE] [--lr LR] [--threads THREADS]
              [--seed SEED] [--device DEVICE] [--file FILE]

Create several models for Sequence Labeling.

optional arguments:
  -h, --help            show this help message and exit
  --model {char_lstm_crf,elmo_lstm_crf}
                        choose the model for Sequence Labeling
  --task {chunking,ner,pos}
                        choose the task of Sequence Labeling
  --drop DROP           set the prob of dropout
  --batch_size BATCH_SIZE
                        set the size of batch
  --epochs EPOCHS       set the max num of epochs
  --patience PATIENCE   set the num of epochs to be patient
  --lr LR               set the learning rate of training
  --threads THREADS, -t THREADS
                        set the max num of threads
  --seed SEED, -s SEED  set the seed for generating random numbers
  --device DEVICE, -d DEVICE
                        set which device to use
  --file FILE, -f FILE  set where to store the model
```
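As a further example, a run that also pins the device, seed, and model path, using only the flags listed above (the flag values here are illustrative, not recommended settings):

```sh
$ python run.py --model=elmo_lstm_crf --task=pos --device=0 --seed=1 --file=pos.pt
```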
## Model Structures

### CHAR+BiLSTM+CRF

```
CHAR_LSTM_CRF(
  (embed): Embedding(405440, 100)
  (char_lstm): CharLSTM(
    (embed): Embedding(517, 30)
    (lstm): LSTM(30, 150, batch_first=True, bidirectional=True)
  )
  (word_lstm): LSTM(400, 150, batch_first=True, bidirectional=True)
  (hid): Linear(in_features=300, out_features=150, bias=True)
  (activation): Tanh()
  (out): Linear(in_features=150, out_features=17, bias=True)
  (crf): CRF(n_tags=17)
  (drop): Dropout(p=0.5)
)
```
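The 400-dim input to `word_lstm` is the 100-dim word embedding concatenated with the 300-dim word vector produced by `char_lstm` (150 per direction). Below is a minimal sketch of such a character-level encoder; it targets a recent PyTorch (>= 1.1, for `enforce_sorted=False`) rather than the pinned 0.4.1, and its interface is illustrative rather than the repo's exact API.

```python
import torch
import torch.nn as nn


class CharLSTM(nn.Module):
    """Character-level BiLSTM that encodes each word as one fixed vector."""

    def __init__(self, n_chars=517, n_embed=30, n_out=300):
        super().__init__()
        self.embed = nn.Embedding(n_chars, n_embed)
        # hidden size 150 per direction -> 300-dim word representations
        self.lstm = nn.LSTM(n_embed, n_out // 2,
                            batch_first=True, bidirectional=True)

    def forward(self, chars):
        # chars: [n_words, max_word_len] of char indices, 0 = <pad>
        lens = chars.gt(0).sum(dim=1)
        x = nn.utils.rnn.pack_padded_sequence(
            self.embed(chars), lens.cpu(), batch_first=True, enforce_sorted=False)
        _, (hidden, _) = self.lstm(x)
        # hidden: [2, n_words, 150]; concatenate the final states of both
        # directions into a single [n_words, 300] matrix
        return torch.cat(torch.unbind(hidden), dim=-1)
```

Each word vector is then concatenated with the word's pretrained embedding before entering the word-level BiLSTM, giving the 100 + 300 = 400 input size shown above.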
### ELMo+BiLSTM+CRF

```
ELMO_LSTM_CRF(
  (embed): Embedding(405440, 100)
  (scalar_mix): ScalarMix(n_reprs=3)
  (char_lstm): CharLSTM(
    (embed): Embedding(517, 30)
    (lstm): LSTM(30, 150, batch_first=True, bidirectional=True)
  )
  (word_lstm): LSTM(1424, 150, batch_first=True, bidirectional=True)
  (hid): Linear(in_features=300, out_features=150, bias=True)
  (activation): Tanh()
  (out): Linear(in_features=150, out_features=17, bias=True)
  (crf): CRF(n_tags=17)
  (drop): Dropout(p=0.5)
)
```
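Compared to `CHAR_LSTM_CRF`, the `word_lstm` input grows to 1424 = 100 (GloVe) + 300 (CharLSTM) + 1024 (ELMo), where the 1024-dim ELMo vector comes from `ScalarMix`: a softmax-weighted average of the three ELMo layers scaled by a learned gamma (Peters et al., 2018). A minimal sketch of that mixing follows; the parameter names are illustrative, and the repo may simply reuse allennlp's own `ScalarMix`.

```python
import torch
import torch.nn as nn


class ScalarMix(nn.Module):
    """Learned softmax-weighted sum of n layer representations, times gamma."""

    def __init__(self, n_reprs=3):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(n_reprs))
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, reprs):
        # reprs: sequence of n_reprs tensors, each [batch, seq_len, 1024]
        w = torch.softmax(self.weights, dim=0)
        return self.gamma * sum(w_i * r for w_i, r in zip(w, reprs))
```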
## Results

### NER (CoNLL-2003)

- Pretrained embeddings: glove.6B.100d.txt
- Dataset: CoNLL-2003
  - Train: 14,987 sentences
  - Dev: 3,466 sentences
  - Test: 3,684 sentences

|               |  Dev   |  Test  |     mT(s)      |
| ------------- | :----: | :----: | :------------: |
| CHAR_LSTM_CRF | 94.49% | 90.72% | 0:01:50.889580 |
| ELMO_LSTM_CRF | 95.64% | 92.09% | 0:01:46.960411 |
### Chunking (CoNLL-2000)

- Pretrained embeddings: glove.6B.100d.txt
- Dataset: CoNLL-2000
  - Train: 7,936 sentences
  - Dev: 1,000 sentences
  - Test: 2,012 sentences

|               |  Dev   |  Test  |     mT(s)      |
| ------------- | :----: | :----: | :------------: |
| CHAR_LSTM_CRF | 95.02% | 94.51% | 0:01:21.141716 |
| ELMO_LSTM_CRF | 97.08% | 96.34% | 0:01:14.761098 |
### POS Tagging (WSJ)

- Pretrained embeddings: glove.6B.100d.txt
- Dataset: WSJ
  - Train: 38,219 sentences
  - Dev: 5,527 sentences
  - Test: 5,462 sentences

|               |  Dev   |  Test  |     mT(s)      |
| ------------- | :----: | :----: | :------------: |
| CHAR_LSTM_CRF | 97.68% | 97.64% | 0:05:59.462637 |
| ELMO_LSTM_CRF | 97.86% | 97.81% | 0:05:55.335100 |