The relevant blog post is here: http://hjweide.github.io/char-rnn
This implementation is largely based on https://github.com/Lasagne/Recipes/blob/master/examples/lstm_text_generation.py. See http://karpathy.github.io/2015/05/21/rnn-effectiveness/ for a thorough explanation of how char-rnn works.
This implementation of char-rnn can be used to train on any text file. My goal, however, was to train it on the entire history of my Facebook conversations. If you have your own text file, you can skip to step 5 below.
1. Follow these instructions to get a copy of all your Facebook data. You may want to do this first, because it can take a while for Facebook to send you the download link. When the download is complete, unzip the archive.
2. Clone and install this [Facebook chat parser](https://github.com/ownaginatious/fbchat-archive-parser):

       git clone https://github.com/ownaginatious/fbchat-archive-parser
       cd fbchat-archive-parser
       python setup.py develop

3. Run the parser on the `messages.htm` file from the extracted archive: `fbcap html/messages.htm > messages.txt`
4. Use this snippet of code to strip out all messages not written by you. Set the `name` variable to your name as it appears in your Facebook chats, and run `python parse_messages.py`. You may need to write a more sophisticated parser if you want more control over which messages are extracted, or if your name changed at some point, for example.
5. Set `text_fpath` in `train_char_rnn.py` to the text file containing the training data. If you used the snippet mentioned above, this is already set to `parsed.txt`.
6. Observe the sequences generated during training. Once you are happy that the model has reached reasonable convergence, end the training with `ctrl-c`.
7. Set `text_fpath` in `generate_samples.py`, and run `python generate_samples.py` to continually supply phrases and sample from the model to amuse yourself.
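
If you need to write your own parser for the filtering step (keeping only the messages you wrote), the idea can be sketched roughly as below. This is not the repository's `parse_messages.py`. It is a minimal illustration that assumes each message in `messages.txt` sits on a single line of the form `[timestamp] Sender: text`, which may not match what `fbcap` actually emits, so check your own file first. The names `MESSAGE_RE`, `extract_own_messages`, and `write_own_messages`, as well as the sample senders, are hypothetical.

```python
import re

# ASSUMED line format (verify against your own messages.txt):
#   [2015-05-21 18:30:00] Jane Doe: see you tomorrow
MESSAGE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]*)\]\s+(?P<sender>[^:]+):\s?(?P<text>.*)$"
)


def extract_own_messages(lines, name):
    """Return the text of every message whose sender is exactly `name`."""
    own = []
    for line in lines:
        match = MESSAGE_RE.match(line.rstrip("\n"))
        # Lines that do not look like a message (headers, blanks) are skipped.
        if match and match.group("sender").strip() == name:
            own.append(match.group("text"))
    return own


def write_own_messages(src_fpath, dst_fpath, name):
    """Read parser output from src_fpath, write only your messages to dst_fpath."""
    with open(src_fpath, encoding="utf-8") as f:
        own = extract_own_messages(f, name)
    with open(dst_fpath, "w", encoding="utf-8") as f:
        f.write("\n".join(own) + "\n")


# Small in-memory example with made-up senders.
sample = [
    "[2015-05-21 18:30:00] Jane Doe: see you tomorrow",
    "[2015-05-21 18:31:00] John Smith: sounds good",
    "[2015-05-21 18:32:00] Jane Doe: bring snacks: lots",
]
print(extract_own_messages(sample, "Jane Doe"))
```

Under the same assumptions, calling `write_own_messages("messages.txt", "parsed.txt", "Jane Doe")` would produce a training file with one of your messages per line. Matching on the exact sender string is the simplest choice; if your name changed over time, you could pass a set of names and test membership instead.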