This is a project for learning to implement different seq2seq models in TensorFlow.
This project is for learning only, which means it will contain many bugs. For running real experiments and training seq2seq models, I suggest using the nmt project, which you can find in the references below.
I am experimenting with CopyNet and pointer-generator on the LCSTS dataset; you can find that code in the lcsts branch.
Issues and suggestions are welcome.
The models I have implemented are as follows:
- Basic seq2seq model
  - A model with a bi-directional RNN encoder and an attention mechanism
- Seq2seq model
  - Same as the basic model, but uses the tf.data pipeline to process input data
- GNMT model
  - Residual connections and attention, as in GNMT, to speed up training
  - refer to GNMT for more details
- Pointer-Generator model
  - A model that supports the copy mechanism (see the sketch after this list)
  - refer to Pointer-Generator for more details
- CopyNet model
  - A model that also supports the copy mechanism
  - refer to CopyNet for more details
For implementation details, refer to the README in each model's folder.
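As a quick illustration of the copy mechanism shared by the Pointer-Generator and CopyNet models, below is a minimal sketch (assuming tf-1.4) of how a pointer-generator mixes the generation and copy distributions. All tensor names are illustrative stand-ins rather than variables from this project, and the extended-vocabulary handling for out-of-vocabulary source tokens is omitted.

```python
import tensorflow as tf

batch_size, src_len, vocab_size = 4, 10, 5000

# Illustrative stand-ins for the decoder's outputs at one time step:
vocab_dist = tf.nn.softmax(tf.random_normal([batch_size, vocab_size]))  # P_vocab
attn_dist = tf.nn.softmax(tf.random_normal([batch_size, src_len]))      # attention weights
p_gen = tf.sigmoid(tf.random_normal([batch_size, 1]))                   # generation probability
src_ids = tf.random_uniform([batch_size, src_len], 0, vocab_size, dtype=tf.int32)

# Scatter the copy probabilities onto the vocabulary positions of the
# source tokens; duplicate source tokens are summed by scatter_nd.
batch_nums = tf.tile(tf.expand_dims(tf.range(batch_size), 1), [1, src_len])
indices = tf.stack([batch_nums, src_ids], axis=2)  # [batch, src_len, 2]
copy_dist = tf.scatter_nd(indices, (1.0 - p_gen) * attn_dist,
                          [batch_size, vocab_size])

# Final output distribution: p_gen * P_vocab + (1 - p_gen) * copy distribution.
final_dist = p_gen * vocab_dist + copy_dist
```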
A typical sequence-to-sequence (seq2seq) model contains an encoder, a decoder, and an attention mechanism. TensorFlow provides many useful APIs for implementing a seq2seq model; you will usually need the following:
- tf.contrib.rnn
  - different RNN cells
- tf.contrib.seq2seq
  - provides different attention mechanisms and a good implementation of beam search
- tf.data
  - data preprocessing pipeline APIs (see the input-pipeline sketch after this list)
- other APIs you need to build and train a model
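Below is a minimal sketch (assuming tf-1.4) of a tf.data input pipeline for paired source/target text files. The file names, vocabulary path, and batch size are illustrative assumptions, not this project's actual configuration.

```python
import tensorflow as tf

# Hypothetical data files: one whitespace-tokenizable sentence per line.
src_dataset = tf.data.TextLineDataset("train.src")
tgt_dataset = tf.data.TextLineDataset("train.tgt")
dataset = tf.data.Dataset.zip((src_dataset, tgt_dataset))

# Split each line into tokens.
dataset = dataset.map(
    lambda src, tgt: (tf.string_split([src]).values,
                      tf.string_split([tgt]).values))

# Map tokens to ids with a vocabulary file (run tf.tables_initializer()
# before pulling batches from the iterator).
vocab = tf.contrib.lookup.index_table_from_file("vocab.txt", default_value=0)
dataset = dataset.map(
    lambda src, tgt: (tf.cast(vocab.lookup(src), tf.int32),
                      tf.cast(vocab.lookup(tgt), tf.int32)))

# Pad each batch to the length of its longest sequence.
dataset = dataset.padded_batch(
    batch_size=32,
    padded_shapes=(tf.TensorShape([None]), tf.TensorShape([None])))

iterator = dataset.make_initializable_iterator()
src_ids, tgt_ids = iterator.get_next()
```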
Use one of the following encoder/decoder structures:
- Multi-layer RNN
  - use the last state of the last RNN layer as the initial decoder state
- Bi-directional RNN
  - use a Dense layer to convert the forward and backward states into the initial decoder state (see the encoder sketch after this list)
- GNMT encoder
  - a bi-directional RNN followed by several RNN layers with residual connections
  - use a multi-layer RNN, and set the initial state of each layer to the initial decoder state
- GNMT decoder
  - applies attention only to the bottom decoder layer, so multiple GPUs can be utilized during training
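Below is a minimal sketch (assuming tf-1.4) of the bi-directional encoder option, including the Dense-layer bridge from the forward/backward final states to the initial decoder state. Sizes and placeholder names are illustrative assumptions.

```python
import tensorflow as tf

num_units = 128
# Illustrative inputs: embedded source sequences and their lengths.
encoder_inputs = tf.placeholder(tf.float32, [None, None, num_units])
source_lengths = tf.placeholder(tf.int32, [None])

# Bi-directional encoder: one forward and one backward LSTM.
fw_cell = tf.contrib.rnn.BasicLSTMCell(num_units)
bw_cell = tf.contrib.rnn.BasicLSTMCell(num_units)
outputs, (fw_state, bw_state) = tf.nn.bidirectional_dynamic_rnn(
    fw_cell, bw_cell, encoder_inputs,
    sequence_length=source_lengths, dtype=tf.float32)
encoder_outputs = tf.concat(outputs, -1)  # [batch, time, 2 * num_units]

# Bridge: project the concatenated fw/bw states down to the decoder's
# state size with a Dense layer, as described in the list above.
initial_state = tf.contrib.rnn.LSTMStateTuple(
    c=tf.layers.dense(tf.concat([fw_state.c, bw_state.c], -1), num_units),
    h=tf.layers.dense(tf.concat([fw_state.h, bw_state.h], -1), num_units))

# For a GNMT-style encoder, stack several more layers on top and wrap
# them with tf.contrib.rnn.ResidualWrapper for residual connections.
```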
The following attention mechanisms are supported (see the wiring sketch after this list):
- Bahdanau
- Luong
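Below is a minimal sketch (assuming tf-1.4) of wiring either mechanism into a decoder cell with tf.contrib.seq2seq.AttentionWrapper; the stand-in tensors mirror the encoder sketch above and are illustrative assumptions.

```python
import tensorflow as tf

num_units = 128
# Stand-ins for the encoder results (see the encoder sketch above).
encoder_outputs = tf.placeholder(tf.float32, [None, None, 2 * num_units])
source_lengths = tf.placeholder(tf.int32, [None])
decoder_cell = tf.contrib.rnn.BasicLSTMCell(num_units)

# Pick one mechanism: Bahdanau attention is additive, Luong attention
# is multiplicative; both read the encoder outputs as memory.
attention = tf.contrib.seq2seq.BahdanauAttention(
    num_units, memory=encoder_outputs,
    memory_sequence_length=source_lengths)
# attention = tf.contrib.seq2seq.LuongAttention(
#     num_units, memory=encoder_outputs,
#     memory_sequence_length=source_lengths)

# The wrapped cell behaves like a normal RNN cell and can be fed to a
# BasicDecoder for training or a BeamSearchDecoder for inference.
attn_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention, attention_layer_size=num_units)
```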
Right now I only have cross-entropy loss (a masked-loss sketch follows the list below). I will add the following metrics:
- bleu
  - for translation problems
- rouge
  - for summarization problems
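For reference, here is a minimal sketch (assuming tf-1.4) of the masked cross-entropy loss mentioned above, using tf.contrib.seq2seq.sequence_loss; shapes and lengths are illustrative stand-ins.

```python
import tensorflow as tf

batch_size, max_time, vocab_size = 4, 7, 5000

# Illustrative stand-ins: decoder logits and gold target ids.
logits = tf.random_normal([batch_size, max_time, vocab_size])
targets = tf.random_uniform([batch_size, max_time], 0, vocab_size, dtype=tf.int32)
target_lengths = tf.constant([7, 5, 6, 3])

# Mask padding positions so they do not contribute to the loss.
weights = tf.sequence_mask(target_lengths, max_time, dtype=tf.float32)

# Cross-entropy averaged over the non-padded target tokens.
loss = tf.contrib.seq2seq.sequence_loss(logits, targets, weights)
```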
- TensorFlow 1.4
- Python 3
Run the model on a toy dataset, i.e. reversing the input sequence.
train:

```bash
python -m bin.toy_train
```

inference:

```bash
python -m bin.toy_inference
```
You can also run on the en-vi dataset; refer to en_vietnam_train.py in bin for more details.
You can find more training scripts in the bin directory.
Thanks to the following resources:
- https://github.com/tensorflow/nmt
  - Google's NMT tutorial, a very good resource for learning seq2seq
- https://github.com/OpenNMT/OpenNMT-tf
  - code from the Harvard NLP group, also a good resource for learning seq2seq, with good code style and structure; you can also find tensor2tensor implementation details here, and tensor2tensor is becoming more and more popular nowadays
- https://github.com/JayParks/tf-seq2seq
  - a good implementation of seq2seq with beam search, based on tf-1.2.rc1
- https://github.com/j-min/tf_tutorial_plus
  - I used the demo data from here
- https://github.com/vahidk/EffectiveTensorflow
  - how to use TensorFlow effectively
- https://github.com/abisee/pointer-generator
  - the original pointer-generator network, which uses the old seq2seq APIs
- https://github.com/stanfordmlgroup/nlc
  - shows how to implement an attention-wrapped RNN cell
- https://github.com/lspvic/CopyNet
  - uses nmt to implement CopyNet