CLUE NER

Here is a short summary of our solution for the CLUE NER benchmark.

CLUENER2020

An example of fine-tuning and running inference on the CLUENER2020 dataset with google_zh_model.bin:

python3 run_ner.py --pretrained_model_path models/google_zh_model.bin --vocab_path models/google_zh_vocab.txt \
                   --train_path datasets/cluener2020/train.tsv --dev_path datasets/cluener2020/dev.tsv \
                   --label2id_path datasets/cluener2020/label2id.json --epochs_num 5 --batch_size 16 \
                   --output_model_path models/ner_model.bin \
                   --embedding word_pos_seg --encoder transformer --mask fully_visible

python3 inference/run_ner_infer.py --load_model_path models/ner_model.bin --vocab_path models/google_zh_vocab.txt \
                                   --test_path datasets/cluener2020/test_nolabel.tsv --prediction_path datasets/cluener2020/prediction.tsv \
                                   --label2id_path datasets/cluener2020/label2id.json --embedding word_pos_seg --encoder transformer --mask fully_visible
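
Before launching the fine-tuning command above, it can help to sanity-check the training file against label2id.json. The sketch below is only an illustration under assumptions this page does not state: that the TSV has a header row with `text_a` and `label` columns, that tokens and tags within a row are space-separated, and that label2id.json maps tag strings to integer ids. The helper name `check_ner_rows` and the sample rows are hypothetical and not part of the toolkit.

```python
import csv
import io


def check_ner_rows(tsv_file, label2id):
    """Return a list of (line_number, problem) tuples for malformed rows.

    Assumed layout (not confirmed by this page): a header row containing
    `text_a` and `label`, with space-separated tokens and tags.
    """
    problems = []
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        tokens = (row.get("text_a") or "").split(" ")
        tags = (row.get("label") or "").split(" ")
        # Every token needs exactly one tag.
        if len(tokens) != len(tags):
            problems.append((line_no, f"{len(tokens)} tokens vs {len(tags)} tags"))
        # Every tag must be a key of the label-to-id mapping.
        unknown = sorted(set(tags) - set(label2id))
        if unknown:
            problems.append((line_no, f"unknown tags: {unknown}"))
    return problems


# Tiny in-memory stand-in for train.tsv; the second data row is deliberately broken.
sample_tsv = io.StringIO(
    "text_a\tlabel\n"
    "北 京 大 学\tB-organization I-organization I-organization I-organization\n"
    "张 三\tB-name\n"
)
# Hypothetical mapping; the real one comes from label2id.json via json.load.
sample_label2id = {"O": 0, "B-organization": 1, "I-organization": 2, "B-name": 3, "I-name": 4}

problems = check_ner_rows(sample_tsv, sample_label2id)
print(problems)
```

For real data, the same function could be given a file opened with `open(path, encoding="utf-8", newline="")` and a mapping loaded with `json.load`.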

An example of fine-tuning and running inference on the CLUENER2020 dataset with mixed_corpus_bert_large_model.bin:

python3 run_ner.py --pretrained_model_path models/mixed_corpus_bert_large_model.bin --vocab_path models/google_zh_vocab.txt --config_path models/bert/large_config.json \
                   --train_path datasets/cluener2020/train.tsv --dev_path datasets/cluener2020/dev.tsv \
                   --output_model_path models/ner_model.bin \
                   --label2id_path datasets/cluener2020/label2id.json --epochs_num 5 --batch_size 16 \
                   --embedding word_pos_seg --encoder transformer --mask fully_visible

python3 inference/run_ner_infer.py --load_model_path models/ner_model.bin --vocab_path models/google_zh_vocab.txt --config_path models/bert/large_config.json \
                                   --test_path datasets/cluener2020/test_nolabel.tsv --prediction_path datasets/cluener2020/prediction.tsv \
                                   --label2id_path datasets/cluener2020/label2id.json --embedding word_pos_seg --encoder transformer --mask fully_visible
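
Downstream use of the predictions usually needs entity spans rather than per-token tags. The following is a minimal sketch of that conversion, assuming the prediction file holds one space-separated tag sequence per line and that the tags follow a BIO scheme such as `B-name` / `I-name` / `O`. The actual tag set is whatever label2id.json defines, and the function name `bio_to_spans` is hypothetical.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into (entity_type, start, end) spans.

    `end` is exclusive. An I- tag that does not continue a span of the same
    type is treated as the start of a new span (a lenient choice).
    """
    spans = []
    current_type, start = None, None
    for i, tag in enumerate(tags):
        # "B-name" -> ("B", "-", "name"); "O" -> ("O", "", "")
        prefix, _, entity_type = tag.partition("-")
        if prefix == "I" and entity_type == current_type:
            continue  # still inside the current span
        if current_type is not None:
            spans.append((current_type, start, i))  # close the open span
            current_type, start = None, None
        if prefix in ("B", "I") and entity_type:
            current_type, start = entity_type, i  # open a new span
    if current_type is not None:
        spans.append((current_type, start, len(tags)))
    return spans


# One hypothetical line of predicted tags.
tags = "B-name I-name O B-organization I-organization I-organization".split(" ")
spans = bio_to_spans(tags)
print(spans)
```

Treating a stray `I-` tag as a span start is one possible policy; a stricter variant could discard such tags instead.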
