This repository contains the code for our paper: SUNNYNLP at SemEval-2018 Task 10: A Support-Vector-Machine-Based Method for Detecting Semantic Difference using Taxonomy and Word Embedding Features
Task description: SemEval 2018 Task 10 -- Capturing Discriminative Attributes
Our Support Vector Machine (SVM)-based system combines features extracted from pre-trained embeddings with statistical information from Probase to detect semantic differences between pairs of concepts.
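To give a feel for what an embedding-based feature can look like in this task (deciding whether an attribute applies to the first concept but not the second), here is a minimal sketch. It is an illustration only and not the feature set used in the paper: the function names, the toy two-dimensional vectors, and the choice of cosine-similarity features are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_features(vectors, word1, word2, attribute):
    """Hypothetical features for one (word1, word2, attribute) triple.

    Returns the similarity of each word to the attribute and their difference.
    A large positive difference hints that the attribute fits word1 better.
    """
    sim1 = cosine(vectors[word1], vectors[attribute])
    sim2 = cosine(vectors[word2], vectors[attribute])
    return [sim1, sim2, sim1 - sim2]


# Toy vectors for illustration; real pre-trained embeddings have hundreds of dimensions.
toy_vectors = {
    "banana": [1.0, 0.0],
    "apple": [0.0, 1.0],
    "yellow": [1.0, 0.0],
}
features = embedding_features(toy_vectors, "banana", "apple", "yellow")
```

A feature vector like this could then be passed to a classifier together with other features, such as statistics from Probase.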
We recommend installing the packages in a separate Python 3.6 environment. All required packages are listed in requirements.txt. You can install them using pip:
pip install -r requirements.txt
Our system uses the English model in spaCy, so run
python -m spacy download en
to install the required language model.
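If you want to confirm that the installation worked before running the system, a quick check along the following lines may help. This is a sketch under assumptions: `is_installed` is a hypothetical helper, and the package name `en_core_web_sm` assumes spaCy 2.x, where the `en` shortcut typically resolves to that model package. Check requirements.txt for the spaCy version actually in use.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


# Assumed package names; the model name depends on your spaCy version.
status = {name: is_installed(name) for name in ("spacy", "en_core_web_sm")}
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```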
- Pre-trained word vectors
- Our spaCy-parsed version of Probase or the original Probase
- Edit the paths for Probase, the pre-trained vectors, the output path, etc. according to the instructions in ./config/configuration-sample.yml.
- Run the main program from the root directory to generate predictions based on your configuration:
$ python src/main.py config/configuration-sample.yml
- Run the official script to evaluate the predictions in the directory you have specified and save the scores in ./score/:
$ ./official-evaluation.sh ./prediction/configuration-sample
./prediction/configuration-sample/dev-5folds-FastText-dtc.txt
3 ./score/all-score.txt
./prediction/configuration-sample/dev-5folds-FastText-LinearSVC.txt
5 ./score/all-score.txt
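The prediction and evaluation commands above can also be chained from Python. The following is a minimal sketch and not part of the repository: `build_commands` and `run_pipeline` are hypothetical helper names, and the prediction directory is assumed to match the output path set in your configuration file.

```python
import subprocess
import sys


def build_commands(config_path, prediction_dir):
    """Return the prediction and evaluation commands as argument lists."""
    # Use the current interpreter so the active environment's packages are used.
    predict_cmd = [sys.executable, "src/main.py", str(config_path)]
    evaluate_cmd = ["./official-evaluation.sh", str(prediction_dir)]
    return [predict_cmd, evaluate_cmd]


def run_pipeline(config_path, prediction_dir, repo_root="."):
    """Run prediction, then evaluation, from the repository root."""
    for cmd in build_commands(config_path, prediction_dir):
        # check=True raises CalledProcessError if a step exits with a non-zero status,
        # so evaluation is skipped when prediction fails.
        subprocess.run(cmd, cwd=repo_root, check=True)
```

For the sample configuration, this would be called as `run_pipeline("config/configuration-sample.yml", "./prediction/configuration-sample")`.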
If you find our work useful, please cite our paper:
@inproceedings{lai2018sunnynlp,
title={SUNNYNLP at SemEval-2018 Task 10: A Support-Vector-Machine-Based Method for Detecting Semantic Difference using Taxonomy and Word Embedding Features},
author={Lai, Sunny and Leung, Kwong Sak and Leung, Yee},
booktitle={Proceedings of The 12th International Workshop on Semantic Evaluation},
pages={741--746},
year={2018}
}
If you use the code, please cite according to the hyperwords repository.
If you have used our spaCy-parsed version of Probase, please cite according to the Probase official website.