More precisely, this implementation is a line-by-line translation of the DyNet implementation, which can be found here. The techniques behind the parser are described in the paper Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations.
- Python 2.7 interpreter (for a Python 3 implementation, check out the pytorch_python3 branch; thanks to Zhiqiang Xie)
- PyTorch library
The software requires training.conll and development.conll files formatted according to the CoNLL data format, or training.conllu and development.conllu files formatted according to the CoNLL-U data format.
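For readers unfamiliar with these formats, the following is a minimal sketch of how such a file can be read. It is not the repository's own reader: the `Token` record and the `read_conll` function are hypothetical names introduced here for illustration. It assumes the usual layout of ten tab-separated columns per token (ID, FORM, LEMMA, coarse POS tag, fine POS tag, FEATS, HEAD, DEPREL, and two further columns), with a blank line between sentences.

```python
from collections import namedtuple

# Hypothetical token record; the field names are illustrative only.
Token = namedtuple("Token", ["id", "form", "pos", "head", "deprel"])


def read_conll(lines):
    """Yield one sentence (a list of Token) per blank-line-separated block."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # CoNLL-U comment line
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip CoNLL-U multiword ranges (1-2) and empty nodes (1.1)
        # HEAD may be "_" in unannotated test data.
        head = int(cols[6]) if cols[6].isdigit() else None
        sentence.append(Token(int(cols[0]), cols[1], cols[3], head, cols[7]))
    if sentence:
        yield sentence


# A made-up three-token sentence, used only to exercise the reader.
sample = [
    "1\tThe\t_\tDET\tDT\t_\t2\tdet\t_\t_\n",
    "2\tdog\t_\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n",
    "3\tbarks\t_\tVERB\tVBZ\t_\t0\troot\t_\t_\n",
    "\n",
]
sentences = list(read_conll(sample))
```

To read a real file, pass an open file object instead of the `sample` list, since the function only iterates over lines.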
python src/parser.py --outdir [results directory] --train data/en-universal-train.conll --dev data/en-universal-dev.conll --epochs 30 --lstmdims 125 --bibi-lstm
The command for parsing a test.conll file formatted according to the CoNLL data format with a previously trained model is:
python src/parser.py --predict --outdir [results directory] --test data/en-universal-test.conll --model [trained model file] --params [param file generated during training]
The parser will store the resulting conll file in the out directory (--outdir).
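Once the output file exists, one rough way to check it against a gold-standard file is to compare the HEAD column token by token. The sketch below is an assumption-laden illustration, not part of this repository: `heads` and `unlabeled_attachment_score` are hypothetical helper names, both files are assumed to contain the same tokens in the same order, and punctuation is not excluded, so the number may differ from what a standard evaluation script reports.

```python
def heads(lines):
    """Return the HEAD column (7th field) of every token line, in order."""
    result = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Keep only regular token lines (integer ID, at least 7 columns).
        if len(cols) >= 7 and cols[0].isdigit():
            result.append(cols[6])
    return result


def unlabeled_attachment_score(gold_lines, pred_lines):
    """Fraction of tokens whose predicted head matches the gold head."""
    gold, pred = heads(gold_lines), heads(pred_lines)
    if len(gold) != len(pred):
        raise ValueError("gold and predicted files have different token counts")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return float(correct) / len(gold)


# Made-up data: the prediction gets the head of token 1 wrong.
gold = [
    "1\tThe\t_\tDET\tDT\t_\t2\tdet\t_\t_\n",
    "2\tdog\t_\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n",
    "3\tbarks\t_\tVERB\tVBZ\t_\t0\troot\t_\t_\n",
]
pred = [
    "1\tThe\t_\tDET\tDT\t_\t3\tdet\t_\t_\n",
    "2\tdog\t_\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n",
    "3\tbarks\t_\tVERB\tVBZ\t_\t0\troot\t_\t_\n",
]
score = unlabeled_attachment_score(gold, pred)
```

With real data, pass open file objects for the gold test file and for the file the parser wrote to the out directory.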