The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters":
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters. Yan Xu, Etsuko Ishii, Samuel Cahyawijaya, Zihan Liu, Genta Indra Winata, Andrea Madotto, Dan Su, Pascale Fung. DialDoc@ACL2022 [PDF]
If you use any source code from this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:
@article{xu2021retrieval,
  title={Retrieval-free knowledge-grounded dialogue response generation with adapters},
  author={Xu, Yan and Ishii, Etsuko and Cahyawijaya, Samuel and Liu, Zihan and Winata, Genta Indra and Madotto, Andrea and Su, Dan and Fung, Pascale},
  journal={arXiv preprint arXiv:2105.06232},
  year={2021}
}
pip install -r requirements.txt
pip install -U contextualized_topic_models==2.0.1
In this paper, we conduct experiments on the Wizard of Wikipedia (WoW) and CMU DoG datasets.
- The WoW dataset is downloaded from the ParlAI source, and the CMU_DoG dataset is downloaded from its GitHub repo. All the data should be put under the data folder.
- We follow the same preprocessing procedure on the CMU_DoG dataset as the ITDD paper (GitHub). As there is a small overlap between the training set and the test set in the original dataset, they remove the duplicates and format the data for our model. The preprocessed data can be downloaded here. The data should be put under the data folder and decompressed.
To make this process simple, just run the following command:
sh scripts/prepare_data.sh
- Based on the two datasets, we collect the involved articles as the resource to pre-train the knowledge experts. We provide the pre-processed articles here. Please download the folder and move its contents into the data folder.
To use our model for evaluation, we provide the checkpoints of the models and the corresponding predictions.
Please download the models and the results from OneDrive and put the downloaded folders under the save folder.
We train a CTM on the knowledge corpus and classify all the articles into a specific number of clusters:
CUDA_VISIBLE_DEVICES=0 sh scripts/train_ctm.sh
To better predict the topic distribution of the response from the dialogue history, we further fine-tune sentenceBERT to minimize the MSE loss between the representation of history + response and that of history-only.
CUDA_VISIBLE_DEVICES=0 python train_sentenceBERT.py -bs 8 --lr 1e-6 --wd 0 -ep 20 -pa 5 --output_dir save/models/topic_models/his_only_sentenceBert --do_train --do_eval --dataset wow
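The objective described above can be illustrated with a minimal sketch. It assumes the two sentence representations are plain lists of floats of equal length; `mse_loss` and the toy vectors are hypothetical and are not taken from train_sentenceBERT.py, which may compute the loss differently (for example, batched on GPU tensors).

```python
def mse_loss(repr_a, repr_b):
    """Mean squared error between two equal-length embedding vectors."""
    if len(repr_a) != len(repr_b):
        raise ValueError("embeddings must have the same dimension")
    squared = [(a - b) ** 2 for a, b in zip(repr_a, repr_b)]
    return sum(squared) / len(squared)


# Toy 4-dimensional embeddings (made-up values, for illustration only).
history_response = [0.5, 0.0, 1.0, -1.0]  # representation of history + response
history_only = [0.5, 1.0, 1.0, 0.0]       # representation of history only

loss = mse_loss(history_only, history_response)
print(loss)
```

Driving this value toward zero pulls the history-only representation toward the history + response representation, which is the stated goal of the fine-tuning step.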
Next, we classify all the dialogue samples into different clusters:
CUDA_VISIBLE_DEVICES=0 sh scripts/inference_ctm_for_dials.sh
Now, we train the four knowledge experts individually:
CUDA_VISIBLE_DEVICES=0 sh scripts/expert_training.sh <index 0,1,2,3>
Here we provide the example with the number of clusters set to 4 (the same setting as in our paper).
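To launch the four expert runs one after another, a small driver along the following lines could be used. This is only a sketch: `expert_commands` and `train_all` are hypothetical helpers that are not part of the repository, and the sketch assumes the expert index is passed as the single positional argument, as in the command above.

```python
import os
import subprocess


def expert_commands(num_experts=4, script="scripts/expert_training.sh"):
    """Build one `sh <script> <index>` command per knowledge expert."""
    return [["sh", script, str(index)] for index in range(num_experts)]


def train_all(gpu="0", dry_run=True):
    """Run the expert training commands sequentially on one GPU.

    With dry_run=True the commands are only printed, not executed.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for cmd in expert_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, env=env, check=True)


train_all(dry_run=True)
```

Calling `train_all(dry_run=False)` from the repository root would execute the four commands in sequence and stop at the first failure.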
The following scripts reproduce the task adaptation process, along with evaluating the perplexity of generating the gold responses. All the loss information will be saved into a log file.
- Task adaptation under weighted-sum setting
CUDA_VISIBLE_DEVICES=0 sh scripts/task_adaptation_w.sh <dataset name:wow/cmu_dog> <cluster prediction path after `topic_models`> <exp name>
- Task adaptation under one-hot setting
CUDA_VISIBLE_DEVICES=0 sh scripts/task_adaptation_o.sh <dataset name> <cluster prediction path after `topic_models`> <exp name>
For instance, to reproduce the KnowExpert_w results on the WoW dataset, run:
CUDA_VISIBLE_DEVICES=0 sh scripts/task_adaptation_w.sh wow ctm_20k_new_hisres_4/wow_20k_new_hisres_split_4.npy wow-moe-cluster4-ckp49-ctm-cmu-20knew-all-1e5-kn768-newp-hisres
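As a rough illustration of how the two settings could differ, the sketch below combines the outputs of four experts using a predicted topic distribution. This is an assumption about what "weighted-sum" and "one-hot" mean, inferred only from the setting names; the function names, vectors, and weights are hypothetical, and the actual combination performed by the task adaptation scripts may differ.

```python
def weighted_sum(expert_outputs, topic_probs):
    """Mix all expert outputs, weighting each by its topic probability."""
    dim = len(expert_outputs[0])
    return [
        sum(p * out[d] for p, out in zip(topic_probs, expert_outputs))
        for d in range(dim)
    ]


def one_hot(expert_outputs, topic_probs):
    """Use only the output of the expert with the highest topic probability."""
    best = max(range(len(topic_probs)), key=lambda i: topic_probs[i])
    return list(expert_outputs[best])


# Toy 2-dimensional outputs for four experts (made-up values).
outputs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
probs = [0.5, 0.25, 0.25, 0.0]  # hypothetical topic distribution

mixed = weighted_sum(outputs, probs)
picked = one_hot(outputs, probs)
print(mixed, picked)
```

Under this reading, the weighted-sum setting lets every expert contribute in proportion to the predicted topic distribution, while the one-hot setting commits to a single expert per sample.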
Now, we conduct inference on WoW and CMU DoG with the obtained models. The results will be saved under save/results.
- Inference under weighted-sum setting
CUDA_VISIBLE_DEVICES=0 sh scripts/inference_w.sh <dataset name> <cluster prediction path after `topic_models`> <exp name>
- Inference under one-hot setting
CUDA_VISIBLE_DEVICES=0 sh scripts/inference_o.sh <dataset name> <cluster prediction path after `topic_models`> <exp name>
For instance, to reproduce the KnowExpert_w results on the WoW dataset, run:
CUDA_VISIBLE_DEVICES=0 sh scripts/inference_w.sh wow ctm_20k_new_SB/wow_20k_new_SB_split_4.npy wow-moe-cluster4-ckp49-ctm-cmu-20knew-all-1e5-kn768-newp-hisres
In this paper, three automatic evaluation metrics are used to evaluate the generated responses: unigram F1, Distinct-1, and Distinct-2.
python evaluation.py --split test --checkpoint best --save_path save/results --exp <exp name> --unigram_f1 --dist
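For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. It assumes lowercased whitespace tokenization and a corpus-level Distinct-n (unique n-grams divided by total n-grams over all responses); `unigram_f1` and `distinct_n` are hypothetical helpers, and evaluation.py may normalize or tokenize text differently.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Harmonic mean of unigram precision and recall against one reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over all responses."""
    total = 0
    unique = set()
    for response in responses:
        tokens = response.lower().split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


f1 = unigram_f1("the cat sat", "the cat ran away")
dist1 = distinct_n(["a b a b", "a c"], 1)
dist2 = distinct_n(["a b a b", "a c"], 2)
print(f1, dist1, dist2)
```

Unigram F1 measures word overlap with the gold response, while Distinct-1 and Distinct-2 measure lexical diversity across the generated responses.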