This is a word aligner for English: given two English sentences, it aligns related words between them. It uses the semantic and contextual similarity of the words to make alignment decisions.
- Python NLTK
- The Python wrapper for Stanford CoreNLP
- Install the above tools.
- Change line 100 of corenlp.py, from "rel, left, right = map(lambda x: remove_id(x), split_entry)" to "rel, left, right = split_entry".
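To illustrate what this edit changes, the sketch below contrasts the two assignments on a made-up dependency entry. The `remove_id` helper here is a hypothetical stand-in, assumed to strip a trailing `-N` token index; the wrapper's real helper may behave differently, and the entry itself is invented for the example.

```python
def remove_id(token):
    """Hypothetical stand-in: drop a trailing '-<digits>' suffix, if present."""
    head, sep, tail = token.rpartition("-")
    return head if sep and tail.isdigit() else token


# Made-up entry of the form [relation, left word, right word].
split_entry = ["nsubj", "died-3", "men-2"]

# Original line 100: every field passes through remove_id.
rel, left, right = map(lambda x: remove_id(x), split_entry)
stripped = (rel, left, right)

# Edited line 100: the fields are kept exactly as they appear in the entry.
rel, left, right = split_entry
kept = (rel, left, right)

print(stripped)
print(kept)
```

Under these assumptions, the edited line preserves any index suffix the entry carries, while the original line discards it.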
- Download the NLTK stopword corpus:
python -m nltk.downloader stopwords
- Install jsonrpclib:
sudo pip install jsonrpclib
- Download the aligner:
git clone https://github.com/ma-sultan/monolingual-word-aligner.git
- Run the corenlp.py script to launch the server:
python corenlp.py
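Before running the aligner, it can help to confirm that the server is accepting connections. The following is a minimal sketch using only the standard library; the host `127.0.0.1` and port `8080` are assumptions, so substitute whatever address your server reports on startup. The function name `server_is_listening` is made up for this example.

```python
import socket


def server_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Refused, unreachable, or timed out: nothing is listening there.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Assumed address; adjust to match your server.
    HOST, PORT = "127.0.0.1", 8080
    print(server_is_listening(HOST, PORT))
```

This only checks that something is listening on the port; it does not verify that the listener is the CoreNLP server or that its models have finished loading.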
- To view the aligner in action, run testAlign.py. (Word indexing starts at 1.)
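Because word indexing starts at 1, an index must be shifted by one before it is used with a zero-based Python list. The sketch below assumes alignments are given as pairs of 1-based positions. The index pairs are invented for illustration and are not output from testAlign.py, and the whitespace tokenization is a simplification.

```python
def words_for_pairs(tokens1, tokens2, index_pairs):
    """Map 1-based (i, j) index pairs to the corresponding (word1, word2) pairs."""
    return [(tokens1[i - 1], tokens2[j - 1]) for i, j in index_pairs]


tokens1 = "Four men died in an accident".split()
tokens2 = "Four people are dead from a collision".split()

# Hypothetical alignment: position in sentence 1, position in sentence 2.
index_pairs = [(1, 1), (2, 2), (3, 4), (6, 7)]

word_pairs = words_for_pairs(tokens1, tokens2, index_pairs)
print(word_pairs)
```

Forgetting the `- 1` shift would pair each index with the following word, or raise an `IndexError` at the end of a sentence.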