A PyTorch-based implementation of Tree-LSTM from Kai Sheng Tai's paper, *Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks*.
First run the script `./fetch_and_preprocess.sh`, which downloads:
- SICK dataset (semantic relatedness task)
- GloVe word vectors (Common Crawl 840B) -- Warning: this is a 2GB download!
- Stanford Parser and Stanford POS Tagger
The preprocessing script also generates dependency parses of the SICK dataset using the Stanford Neural Network Dependency Parser.
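These parses are what the Tree-LSTM is structured over at training time. As an illustration (the exact on-disk format here is an assumption, not necessarily this repo's), a dependency tree can be stored as one head index per token and rebuilt into child lists:

```python
def read_tree(parents):
    """Build child lists from a parent-pointer array.

    parents[i] is the 1-based index of token (i+1)'s head, with 0 marking
    the root. (This parent-pointer layout is illustrative, not necessarily
    the format the preprocessing script emits.)
    """
    children = {i: [] for i in range(len(parents) + 1)}  # slot 0 is a virtual root
    for idx, parent in enumerate(parents, start=1):
        children[parent].append(idx)
    root = children[0][0]  # the single token whose head is 0
    return root, children

# e.g. "dogs bark loudly", with "bark" as the root of the parse:
root, children = read_tree([2, 0, 2])
```

A bottom-up Tree-LSTM pass then visits `children[node]` before `node` itself.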
To try the Dependency Tree-LSTM from the paper on predicting similarity for sentence pairs from the SICK dataset, run `python main.py` to train and test the model; see `config.py` for the command-line arguments.
The first run takes a few minutes because the GloVe embeddings for the words in the SICK vocabulary must be read and stored in a cache; later runs read only the cache.
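A minimal sketch of such a vocabulary-filtered cache, using illustrative names and plain `pickle` rather than this repo's actual storage format:

```python
import os
import pickle

def load_glove(glove_path, vocab, cache_path):
    """Read GloVe vectors for words in `vocab`, caching the result.

    Function name, paths, and pickle-based storage are all illustrative;
    the repo's own caching may differ.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)  # fast path on later runs
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:  # one "word v1 v2 ..." entry per line
            word, *values = line.rstrip().split(" ")
            if word in vocab:
                vectors[word] = [float(v) for v in values]
    with open(cache_path, "wb") as f:
        pickle.dump(vectors, f)  # the slow first run pays for the cache
    return vectors
```

Only the words actually present in the SICK vocabulary are kept, which is why the cache is small despite the 2GB source file.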
Running this code with `--lr 0.01 --wd 0.0001 --optim adagrad --batchsize 25` gives a Pearson coefficient of 0.8336 and an MSE of 0.3119, compared to a Pearson coefficient of 0.8676 and an MSE of 0.2532 in the original paper. The gap may be due to differences in how the word embeddings are updated.
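For reference, both evaluation metrics are straightforward to compute from the predicted and gold relatedness scores; a plain-Python sketch:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mse(x, y):
    """Mean squared error between predictions and gold scores."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
```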
PyTorch 0.1.12 added support for sparse tensors in both CPU and GPU modes, so `nn.Embedding` can now use sparse updates, potentially reducing memory usage. Enable this with the `--sparse` argument, but be warned of two things:
- I have not benchmarked sparse training: the code works, but its performance has not been measured.
- Weight decay does not work with sparse gradients/parameters.
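The second caveat follows from what a sparse update is: the gradient only carries rows for the words seen in the current batch, while L2 weight decay would have to shrink every embedding row, defeating the sparsity. A schematic, pure-Python sparse SGD step (illustrative, not PyTorch's implementation):

```python
def sparse_sgd_step(embedding, grads, lr):
    """Apply a sparse SGD step: only rows present in `grads` are touched.

    `embedding` maps row index -> weight vector; `grads` maps the few rows
    seen in the batch to their gradient vectors. Note that adding weight
    decay here would require touching every row of `embedding`, which is
    why it does not mix with sparse gradients.
    """
    for row, g in grads.items():
        embedding[row] = [w - lr * gi for w, gi in zip(embedding[row], g)]
    return embedding
```

Rows for words absent from the batch are left untouched, which is exactly the memory and compute saving sparse updates buy.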
Shout-out to Kai Sheng Tai for the original LuaTorch implementation, and to the PyTorch team for the fun library.
This is my first PyTorch-based implementation and might contain bugs. Please let me know if you find any!
MIT