A partial reimplementation of "Generating Sentences From a Continuous Space", Bowman, Vilnis, Vinyals, Dai, Jozefowicz, Bengio (https://arxiv.org/abs/1511.06349).

Based on code from Sean Robertson (@spro) https://github.com/spro/pytorch-text-vae, adapted to word level as in the original paper.

This code doesn't implement iterative conditional modes for sampling, or several other details of the original paper. The resulting interpolations in this recreation seem less grammatically accurate than those presented in the paper, though the overall result is similar.

To get a saved model and preprocessed data, download the archive (~1.1 GB) from this link:

https://drive.google.com/file/d/0Bzz1g90lrPKNZTVtSHFDM0t0cG8/view?usp=sharing&resourcekey=0-Wkt32ho71aDtAgiFm_uGgg

Next, run this command to extract the pretrained models and preprocessed data:

tar xzf stored_pytorch_text_vae_info.tar.gz

The pretrained model was trained on the Book Corpus dataset (http://yknzhu.wixsite.com/mbweb).

Sampling Usage:

python interpolate.py -1 "it had taken years to believe" -2 "but it was all lies at the end" -t .01 -s saved_vae.pt

Example output:

('(s0)', u'it had taken years to believe')

('(z0)', ' it had taken time to his')

(' .)', ' it had my hands to his')

(' .)', ' but it was mad at his end')

(' .)', ' but it was nt mad at his end')

('(z1)', ' but it was all her at the end')

('(s1)', u'but it was all lies at the end')
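The rows between (z0) and (z1) are decoded from points lying between the latent codes of the two input sentences. As a rough illustration only, here is a minimal sketch of evenly spaced linear interpolation between two latent vectors. It assumes plain linear interpolation and uses ordinary Python lists in place of tensors; `lerp_path` is a hypothetical helper name, and the actual logic in interpolate.py may differ.

```python
def lerp_path(z0, z1, steps):
    """Return steps + 2 points from z0 to z1 (endpoints included).

    Hypothetical helper: z0 and z1 stand in for the latent codes of the
    two sentences; a real model would use tensors instead of lists.
    """
    if len(z0) != len(z1):
        raise ValueError("latent vectors must have the same length")
    path = []
    for i in range(steps + 2):
        t = i / (steps + 1)  # t runs from 0.0 (z0) to 1.0 (z1)
        path.append([(1 - t) * a + t * b for a, b in zip(z0, z1)])
    return path


# Three intermediate points, matching the three unlabeled rows above.
path = lerp_path([0.0, 2.0], [4.0, 0.0], steps=3)
for point in path:
    print(point)
```

Each point on the path would then be passed through the decoder to produce one output sentence.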

New Model Training:

python train.py myfile.txt

Where myfile.txt is a text file with one sentence per line. The model will be saved in vae.pt.
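If your corpus is not already split into sentences, something like the following sketch could produce a one-sentence-per-line file. It is an assumption-laden example, not part of this repository: `split_sentences` and `write_one_per_line` are hypothetical helpers, the regex splitter is deliberately naive (it breaks after `.`, `!`, or `?` followed by whitespace), and any tokenization or lowercasing that train.py expects is not handled here.

```python
import re


def split_sentences(text):
    """Naively split text after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Collapse internal runs of whitespace (including newlines) to one space.
    return [" ".join(p.split()) for p in parts if p.strip()]


def write_one_per_line(text, path):
    """Write one sentence per line to path; return the sentence count."""
    sentences = split_sentences(text)
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            f.write(sentence + "\n")
    return len(sentences)


sample = "It had taken years to believe.  But it was all lies\nat the end!"
sentences = split_sentences(sample)
```

A call such as `write_one_per_line(raw_text, "myfile.txt")` would then write the file passed to train.py.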