Release Note:
- This is the initial release version.
- Thanks to @guoday for providing the code for the grammars and the logical form search method.
- Some programs have not been tested after being moved to this repo and anonymized, so if any bug occurs while running, do not hesitate to contact me via tao.shen@student.uts.edu.au.
This code is based on TensorFlow (1.11 or 1.10) and implemented with Python 3.6. Please install the following Python packages:
timeout_decorator, fuzzywuzzy, tqdm, flask, nltk, ftfy
spacy
numpy
tensorflow-gpu==1.11.0
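
As a convenience, the packages above can be installed in one shot with pip; this is just a sketch assuming a Python 3.6 environment (tensorflow-gpu==1.11.0 additionally needs a matching CUDA/cuDNN setup on the machine):

```shell
# Sketch: install the Python dependencies listed above into the current environment.
# tensorflow-gpu==1.11.0 expects a compatible CUDA 9.0 / cuDNN 7 installation.
pip install timeout_decorator fuzzywuzzy tqdm flask nltk ftfy spacy numpy tensorflow-gpu==1.11.0
```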
To reproduce our experiments, you need to download the datasets as follows:

- Download and unzip the dialog files (CSQA_v9.zip) to the data folder:

  ```shell
  unzip data/CSQA_v9.zip -d data/
  mv data/CSQA_v9 data/CSQA
  ```
- Download Wikidata and move all of the Wikidata JSON files to data/kb (see the sketch after this list):

  ```shell
  mkdir data/kb
  ```
- Download the BERT base model from the official GitHub repo to anywhere (see the sketch after this list).
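
As a rough sketch of the last two steps: the source path for the Wikidata JSON files below is a placeholder for wherever you downloaded them, and the checkpoint shown is the uncased English BERT-Base release from the official BERT repo (assuming that is the variant you want to use):

```shell
# Move the downloaded Wikidata JSON files into the expected location
# (replace /path/to/wikidata_jsons with wherever the dump was actually downloaded).
mv /path/to/wikidata_jsons/*.json data/kb/

# Fetch and unpack the BERT-Base (uncased) checkpoint from the official BERT release;
# the unpacked directory is what path_to_bert_base should point to in the commands below.
wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
unzip uncased_L-12_H-768_A-12.zip -d bert_base/
```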
Simply run the following command to preprocess the data and build the KB:

```shell
python main_bfs_preprocess.py
```
Some key parameters are:

- num_parallel: the number of processes used to perform the BFS; each consumes about 55GB of RAM.
- max_train: the number of dialogs to search.

To search 90,000 dialogs, run:
```shell
python main_bfs_run.py -mode offline -num_parallel 10 -beam_size 1000 -start_index 0 -max_train 90000 -data_mode dir -data_path data/BFS/train -shuffle 1 -out_dir_suffix wo_con -mask_mode direct -all_lf 0
```
To search 6,000 dialogs, run:
```shell
python main_bfs_run.py -mode offline -num_parallel 10 -beam_size 1000 -start_index 0 -max_train 6000 -data_mode dir -data_path data/BFS/dev -shuffle 1 -out_dir_suffix subset_wo_con -mask_mode direct -all_lf 0
```
The resulting BFS outputs are located at data/BFS/train_proc_direct_1000_wo_con and data/BFS/dev_proc_direct_1000_subset_wo_con.
Key arguments:

- pretrained_num_layers: -2 for not loading any pretrained weights and -1 for loading only the word embeddings.
- num_parallels: the number of processes for parallel decoding; each consumes 70-80GB of RAM.

To train the model, run:
```shell
python3 main_e2e.py --network_class bert --network_type bert_template --dataset e2e_wo_con \
  --preprocessing_hparams \
  bert_pretrained_dir=path_to_bert_base,max_sequence_len=72 --training_hparams \
  train_batch_size=128,test_batch_size=128,num_epochs=8,eval_period=2000,save_model=True \
  --model_hparams \
  pos_gain=10.,use_qt_loss_gain=True,seq_label_loss_weight=1.,seq2seq_loss_weight=1.5,pretrained_num_layers=-2,level_for_ner=1,level_for_predicate=1,level_for_type=1,level_for_dec=1,decoder_layer=2,warmup_proportion=0.01,learning_rate=1e-4,hidden_size_input=300,num_attention_heads_input=6,intermediate_size_input=1200,hn=300,clf_head_num=6, \
  --model_dir_prefix exp \
  --gpu 0;
```
This will produce a directory located at runtime/run_model/xxx, where xxx is the model id.
Key arguments:

- timeout: the timeout in seconds for each example.
- num_parallels: the number of processes for parallel decoding; each consumes 70-80GB of RAM.
- gpu: each GPU with 16GB of memory can support at most three parallel processes (e.g., the command below runs num_parallels=7 across three GPUs, which stays within this limit).

To run parallel decoding for testing, run:
```shell
python3 main_e2e.py --mode parallel_test --network_class bert --network_type bert_template \
  --dataset e2e_wo_con \
  --preprocessing_hparams bert_pretrained_dir=path_to_bert_base,timeout=5.,num_parallels=7,dump_dir=multi_sp,kb_mode=offline,use_filtered_ent=True \
  --training_hparams \
  load_model=True,load_path=/path_to_the_dir_gen_previously/ckpt \
  --model_dir_prefix parallel_decoding \
  --gpu 0,1,2
```
BibTeX:

```bibtex
@article{shen2019masp,
  author  = {Tao Shen and
             Xiubo Geng and
             Tao Qin and
             Daya Guo and
             Duyu Tang and
             Nan Duan and
             Guodong Long and
             Daxin Jiang},
  title   = {Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base},
  journal = {CoRR},
  volume  = {abs/1910.05069},
  year    = {2019},
  url     = {http://arxiv.org/abs/1910.05069}
}
```