Code for our NLPCC 2022 paper Kformer: Knowledge Injection in Transformer Feed-Forward Layers.
The project is based on Fairseq.
To install requirements:
cd fairseq
./setup.sh
mkdir models
cd models
wget https://dl.fbaipublicfiles.com/fairseq/models/roberta.base.tar.gz
tar -zxvf roberta.base.tar.gz
You can download the data from ZJU Cloud and put it under ./data/.
The data provided here consists of the questions together with the knowledge retrieved for each of them using BM25.
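For reference, BM25 ranks documents by summing an IDF-weighted, length-normalized term-frequency score per query term. The sketch below is a minimal, self-contained illustration of Okapi BM25, not the retrieval code used to build the released data; the toy corpus and parameter values are assumptions.

```python
import math

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each tokenized document in `corpus` against `query` with Okapi BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    # document frequency, then IDF, for each distinct query term
    df = {t: sum(1 for d in corpus if t in d) for t in set(query)}
    idf = {t: math.log(1 + (N - n + 0.5) / (n + 0.5)) for t, n in df.items()}
    scores = []
    for d in corpus:
        s = 0.0
        for t in query:
            tf = d.count(t)
            s += idf[t] * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# Toy corpus: documents are pre-tokenized into word lists.
corpus = [doc.split() for doc in [
    "knowledge injection in transformer layers",
    "cooking recipes for pasta",
    "feed forward layers store knowledge",
]]
scores = bm25_scores("knowledge layers".split(), corpus)
```

Documents sharing query terms score above unrelated ones, which is how the top-k knowledge passages per question were selected.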
Use the command below to finetune SocialIQA on Kformer. You can change the layers to inject into by editing the --knowledge_layer argument, which takes two values a b denoting the half-open interval [a, b) of RoBERTa layers; edit it to change which layers are used for knowledge infusion.
./fairseq/run_social.sh
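The interval semantics of --knowledge_layer can be sketched as follows (an illustration only; 0-based indexing over RoBERTa-base's 12 layers is assumed):

```python
def layers_with_injection(knowledge_layer, num_layers=12):
    """Return the indices of the transformer layers whose feed-forward block
    receives injected knowledge, given --knowledge_layer a b, i.e. [a, b)."""
    a, b = knowledge_layer
    return [i for i in range(num_layers) if a <= i < b]

# e.g. --knowledge_layer 9 12 injects into the top three layers of RoBERTa-base
layers_with_injection([9, 12])  # [9, 10, 11]
```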
Use the command below to finetune MedQA on Kformer.
./fairseq/run_med.sh
Use the following command to evaluate the finetuned model. Set --knowledge_layer to the same value used during finetuning.
export ModelPath=$ModelPath$
export DataPath=$DataPath$
python fairseq/test_social.py --model_path $ModelPath$ --knowledge_layer 9 12 --data_file $DataPath$
Change fairseq/test_social.py to test_med.py to evaluate MedQA.
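Both SocialIQA and MedQA are multiple-choice QA tasks, so evaluation amounts to picking the highest-scoring candidate per question and comparing it to the gold label. A minimal sketch of that accuracy computation (hypothetical function names, not the actual test script):

```python
def multiple_choice_accuracy(scores, gold):
    """`scores`: one list of candidate scores per question;
    `gold`: the gold answer index per question."""
    correct = sum(
        max(range(len(s)), key=s.__getitem__) == g  # argmax candidate == gold?
        for s, g in zip(scores, gold)
    )
    return correct / len(gold)

# Two questions, both answered correctly by argmax over candidate scores.
multiple_choice_accuracy([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]], [1, 0])  # 1.0
```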
Please give us a ⭐ and cite our paper as:
@article{Yao2022KformerKI,
title={Kformer: Knowledge Injection in Transformer Feed-Forward Layers},
author={Yunzhi Yao and Shaohan Huang and Li Dong and Furu Wei and Huajun Chen and Ningyu Zhang},
journal={ArXiv},
year={2022},
volume={abs/2201.05742}
}