Tensorflow implementation of Layer Normalization and Hyper Networks
====================================================================
This implementation contains:

- Layer Normalization for GRU
- Layer Normalization for LSTM
  - Note: normalizing the cell state c currently produces NaNs in the model, so it is commented out for now.
- Hyper Networks for LSTM
- Layer Normalization and Hyper Networks (combined) for LSTM
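Layer normalization itself is a small operation: normalize each hidden vector across its feature dimension, then apply a learned gain and bias. A minimal NumPy sketch (independent of this repo's TensorFlow cells; the function and variable names are illustrative):

```python
import numpy as np

def layer_norm(x, gain, bias, eps=1e-5):
    # Normalize across the last (feature) axis, then rescale and shift
    # with learned per-feature gain and bias (Ba et al., 2016).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + bias

# One hidden state of size 4; identity gain/bias for illustration.
h = np.array([[1.0, 2.0, 3.0, 4.0]])
out = layer_norm(h, gain=np.ones(4), bias=np.zeros(4))
```

In the recurrent cells here, this normalization is applied to the pre-activations at each time step, which is what stabilizes training compared to batch normalization.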
Prerequisites:

- Python 2.7 or Python 3.3+
- NLTK
- TensorFlow >= 0.9
To evaluate the new cells, we train them on MNIST using the Layer Normalized GRU and the other cell types.
To train a mnist model with different cell_types:
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type LNGRU
To train a mnist model with HyperNetworks:
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type HyperLnLSTMCell --layer_norm 0
To train a mnist model with HyperNetworks and Layer Normalization:
$ python mnist.py --hidden 128 --summaries_dir log/ --cell_type HyperLnLSTMCell --layer_norm 1
Available cell_type values: LNGRU, LNLSTM, LSTM, GRU, BasicRNN, HyperLnLSTMCell
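For readers unfamiliar with hyper networks: the core idea (Ha et al., 2016) is that a small auxiliary network produces, per time step, a vector that rescales the rows of the main LSTM's weight matrices. A minimal NumPy sketch of just that weight-scaling step, with illustrative names and sizes that are not this repo's API:

```python
import numpy as np

def hyper_scale(W, z, W_hz):
    # A hyper-network embedding z is projected to one scale per row of
    # the main weight matrix W; each row of W is then rescaled by it.
    d = W_hz @ z            # (rows,) scaling vector from hyper embedding
    return d[:, None] * W   # row-wise rescaled effective weights

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # main LSTM weight block (illustrative)
z = rng.normal(size=(2,))        # hyper-network output embedding
W_hz = rng.normal(size=(4, 2))   # projection: embedding -> row scales
W_eff = hyper_scale(W, z, W_hz)  # weights actually used at this step
```

Because z changes at every time step, the effective weights W_eff are time-varying even though W itself is fixed, which is what distinguishes a hyper LSTM from a plain one.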
To view graph:
$ tensorboard --logdir log/train/
To do:
- Add attention-based models (in progress).