PyTorch 0.4 Implementation of "A Hierarchical Latent Structure for Variational Conversation Modeling" (NAACL 2018 Oral)
Install Python packages
pip install -r requirements.txt
The following scripts will:
- Create the directories ./datasets/cornell/ and ./datasets/ubuntu/, respectively.
- Download and preprocess the conversation data inside each directory.
python cornell_preprocess.py
--max_sentence_length (maximum number of words in sentence; default: 30)
--max_conversation_length (maximum turns of utterances in single conversation; default: 10)
--max_vocab_size (maximum size of word vocabulary; default: 20000)
--max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
--n_workers (number of workers for multiprocessing; default: os.cpu_count())
python ubuntu_preprocess.py
--max_sentence_length (maximum number of words in sentence; default: 30)
--max_conversation_length (maximum turns of utterances in single conversation; default: 10)
--max_vocab_size (maximum size of word vocabulary; default: 20000)
--max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
--n_workers (number of workers for multiprocessing; default: os.cpu_count())
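To make the preprocessing options above concrete, here is a minimal sketch of how truncation and vocabulary building could work. It is not the code in cornell_preprocess.py or ubuntu_preprocess.py: the function names, the special tokens, the `min_frequency` parameter name, and the choice to count special tokens toward the vocabulary size are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the repository's actual tokens may differ.
SPECIAL_TOKENS = ["<pad>", "<unk>", "<sos>", "<eos>"]


def truncate(conversation, max_sentence_length=30, max_conversation_length=10):
    """Keep at most max_conversation_length turns and max_sentence_length words per turn."""
    turns = conversation[:max_conversation_length]
    return [turn[:max_sentence_length] for turn in turns]


def build_vocab(conversations, max_vocab_size=20000, min_frequency=5):
    """Assign integer ids to the most frequent words, after the special tokens.

    Assumption: special tokens count toward max_vocab_size.
    """
    counts = Counter(word for conv in conversations for turn in conv for word in turn)
    word2id = {token: idx for idx, token in enumerate(SPECIAL_TOKENS)}
    # most_common() yields words in descending frequency, so we can stop early.
    for word, freq in counts.most_common():
        if len(word2id) >= max_vocab_size or freq < min_frequency:
            break
        if word not in word2id:
            word2id[word] = len(word2id)
    return word2id
```

Words that do not make it into the vocabulary would typically be mapped to the unknown token at training time.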
Go to the model directory and set save_dir in configs.py; this is where the model checkpoints will be saved.
We provide our implementation of VHCR, as well as our reference implementations for HRED and VHRED.
To run training:
python train.py --data=<data> --model=<model> --batch_size=<batch_size>
For example:
- Train HRED on Cornell Movie:
python train.py --data=cornell --model=HRED
- Train VHRED with a word drop ratio of 0.25 and 250000 KL annealing iterations:
python train.py --data=ubuntu --model=VHRED --batch_size=40 --word_drop=0.25 --kl_annealing_iter=250000
- Train VHCR with an utterance drop ratio of 0.25:
python train.py --data=ubuntu --model=VHCR --batch_size=40 --sentence_drop=0.25 --kl_annealing_iter=250000
By default, training saves a model checkpoint to <save_dir> every epoch, along with a TensorBoard summary. For more arguments and options, see configs.py.
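The --kl_annealing_iter and --word_drop options used in the examples above can be illustrated with a short sketch. This is not taken from train.py: the linear schedule, the function names, and the `<unk>` token are assumptions, and the repository may implement both differently.

```python
import random


def kl_weight(step, kl_annealing_iter=250000):
    """Linear warm-up (assumed): the KL multiplier grows from 0 to 1 over kl_annealing_iter steps."""
    return min(1.0, step / kl_annealing_iter)


def word_drop(tokens, ratio=0.25, unk="<unk>", rng=random):
    """Replace each token with `unk` independently with probability `ratio`."""
    return [unk if rng.random() < ratio else tok for tok in tokens]
```

With such a schedule, the training objective at a given step would be the reconstruction loss plus `kl_weight(step)` times the KL divergence term, so the KL penalty is phased in gradually rather than applied in full from the first iteration.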
To evaluate the word perplexity:
python eval.py --model=<model> --checkpoint=<path_to_your_checkpoint>
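For reference, word perplexity is the exponential of the average per-word negative log-likelihood. The sketch below shows that arithmetic only; the function name and the per-batch input format are hypothetical, and for the variational models the likelihood term is presumably a bound rather than an exact value.

```python
import math


def corpus_perplexity(batches):
    """Compute word perplexity from (summed_nll, word_count) pairs, one per batch.

    summed_nll is the total negative log-likelihood (natural log) of the batch.
    """
    total_nll = 0.0
    total_words = 0
    for nll, n_words in batches:
        total_nll += nll
        total_words += n_words
    return math.exp(total_nll / total_words)
```

Summing over the whole corpus before dividing weights every word equally, whereas averaging per-batch perplexities would not.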
For embedding-based metrics, download the Google News word vectors, unzip them, and put them under the datasets folder. Then run:
python eval_embed.py --model=<model> --checkpoint=<path_to_your_checkpoint>
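As an illustration of what an embedding-based metric measures, the sketch below computes the "embedding average" score: the cosine similarity between the mean word vectors of a reference and a generated response. Whether eval_embed.py computes this exact metric in this way is an assumption; the function names and the toy `vectors` dictionary are hypothetical stand-ins for the pretrained word vectors.

```python
import math


def average_embedding(tokens, vectors):
    """Mean of the vectors of in-vocabulary tokens; None if no token is covered."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_average_score(reference, response, vectors):
    """Cosine similarity between the averaged embeddings of two token lists."""
    ref = average_embedding(reference, vectors)
    hyp = average_embedding(response, vectors)
    if ref is None or hyp is None:
        return 0.0  # Assumed convention when no token has a vector.
    return cosine(ref, hyp)
```

Because the score depends only on averaged vectors, it is insensitive to word order and rewards responses that are topically close to the reference even when they share no exact words.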
If you use this code or dataset as part of any published research, please cite the following paper.
@inproceedings{VHCR:2018:NAACL,
author = {Yookoon Park and Jaemin Cho and Gunhee Kim},
title = "{A Hierarchical Latent Structure for Variational Conversation Modeling}",
booktitle = {NAACL},
year = 2018
}