File description:
-
data.py: preprocesses the Senseval2 and Senseval3 datasets to produce the input for model4.py, including the sense embedding of the target sense and the forward and backward data around the central word.
-
google_data.py: preprocesses the Google Research dataset (Word Sense Disambiguation Corpora) to produce the input for model4.py, including the sense embedding of the target sense and the forward and backward data around the central word.
-
model4.py: builds a bidirectional LSTM for word sense disambiguation, taking the output of data.py or google_data.py as input, and outputs the model.
-
globe.py: loads the pre-trained GloVe word embedding vectors for our own dataset
-
sense_embedding.csv: the 100-dimensional sense vectors for the Google Word Sense Disambiguation Corpora
-
senseval_sense_embedding.csv: the 100-dimensional sense vectors for the Senseval2 dataset
-
Final_report.docx: describes the overall idea of this project
-
data: contains the Senseval2 and Senseval3 datasets
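
To illustrate what "forward and backward data around the central word" might look like, here is a minimal sketch. It is not taken from data.py or google_data.py: the function name `split_context`, the `window` parameter, and the choice to reverse the right-hand context so that both sequences end next to the central word are all assumptions made for illustration.

```python
from typing import List, Tuple


def split_context(tokens: List[str], target_index: int, window: int) -> Tuple[List[str], List[str]]:
    """Split a tokenized sentence into two context sequences around a central word.

    Hypothetical helper, not part of this repository.
    forward:  up to `window` tokens before the central word, in reading order.
    backward: up to `window` tokens after the central word, reversed, so that
              (by assumption) a backward LSTM also finishes next to the central word.
    """
    if not 0 <= target_index < len(tokens):
        raise IndexError("target_index is outside the sentence")
    if window < 0:
        raise ValueError("window must be non-negative")

    start = max(0, target_index - window)
    forward = tokens[start:target_index]

    end = target_index + 1 + window
    backward = tokens[target_index + 1:end][::-1]
    return forward, backward


sentence = "he sat on the bank of the river".split()
forward, backward = split_context(sentence, target_index=4, window=3)
print(forward)   # ['sat', 'on', 'the']
print(backward)  # ['river', 'the', 'of']
```

In a setup like this, each token would then be mapped to its GloVe vector before being fed to the two directions of the LSTM; the actual preprocessing in this repository may differ.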