Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings


FNET

Publication

Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings. Abhishek, Ashish Anand and Amit Awekar. EACL 2017.

Please use the following BibTeX entry to cite this work.

@InProceedings{abhishek-anand-awekar:2017:EACLlong,
  author    = {Abhishek, Abhishek  and  Anand, Ashish  and  Awekar, Amit},
  title     = {Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings},
  booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers},
  month     = {April},
  year      = {2017},
  address   = {Valencia, Spain},
  publisher = {Association for Computational Linguistics},
  pages     = {797--807},
  url       = {http://www.aclweb.org/anthology/E17-1075}
}

Compatibility with TensorFlow 1.12

An updated version of the main code, compatible with TensorFlow 1.12, is available at https://github.com/abhipec/FgEC. The transfer learning experiments are not part of that code.

data

Download the necessary data by following the instructions in data/processed/f1/README.md.

Directory structure:

  • /home/
    • EACL-2017
      • fnet
      • glove.840B.300d
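
Before running anything, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository; it assumes the project root is ~/EACL-2017 (the path used by the commands in this README), and the helper name `missing_dirs` is hypothetical.

```python
from pathlib import Path

# Expected sub-directories, taken from the directory listing above.
EXPECTED = ("fnet", "glove.840B.300d")


def missing_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root).expanduser()
    return [name for name in EXPECTED if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the project root is ~/EACL-2017.
    missing = missing_dirs("~/EACL-2017")
    print("missing:", ", ".join(missing) if missing else "none")
```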

dependencies

The Python 3 version of the TensorFlow framework (0.10.0rc0) is used in these experiments.

pip install numpy docopt pandas plotly matplotlib scipy scikit-learn

Compile the C++ libraries.

cd src/lib
bash compile_gcc_5.bash

run

cd src
bash scripts/BBN.bash
bash scripts/OntoNotes.bash
bash scripts/Wiki.bash

This will create model checkpoints in the ckpt directory.

Review the scripts and modify the necessary variables before running them.

Report results:

python report_results.py ~/EACL-2017/fnet/ckpt/

Feature level transfer learning experiment

Download the necessary data by following the instructions in data/processed/f4/README.md.

bash scripts/tl.bash
python report_results.py ~/EACL-2017/fnet/ckpt/

Preprocessing steps (Optional)

These steps convert the original data (https://github.com/shanzhenren/AFET) to the TFRecord format used by this code.

Download the necessary data by following the instructions in data/AFET/dataset/README.md.

Also download GloVe vectors (http://nlp.stanford.edu/data/glove.840B.300d.zip) and extract them into the glove.840B.300d directory.
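
For orientation, the extracted glove.840B.300d.txt file is plain text with one entry per line: a token followed by 300 space-separated values. The sketch below shows one way to read such a file; it is not the repository's loader, and the names `parse_glove_line` and `load_glove` are hypothetical. It splits from the right because some tokens in this file are reported to contain spaces.

```python
def parse_glove_line(line, dim=300):
    """Split one GloVe text line into (token, list of floats).

    Splitting from the right keeps a token intact even if it contains spaces.
    """
    parts = line.rstrip("\n").rsplit(" ", dim)
    if len(parts) != dim + 1:
        raise ValueError("expected a token followed by %d values" % dim)
    return parts[0], [float(value) for value in parts[1:]]


def load_glove(path, vocab=None, dim=300):
    """Read vectors from `path`; if `vocab` is given, keep only those tokens."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            token, vector = parse_glove_line(line, dim)
            if vocab is None or token in vocab:
                vectors[token] = vector
    return vectors
```

Passing a `vocab` set keeps memory use manageable, since the full file holds a very large number of entries.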

Dataset names used: BBN, Wiki and OntoNotes.

Preprocess the data and generate the train, development and test sets.

cd data_processing/
python sanitizer.py BBN ~/EACL-2017/fnet/data/AFET/ 10 ~/EACL-2017/fnet/data/sanitized/

Convert JSON to TFRecord format

python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f1 ~/EACL-2017/fnet/data/processed/
python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f2 ~/EACL-2017/fnet/data/processed/
python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f3 ~/EACL-2017/fnet/data/processed/

| data_format | alias | remarks |
|---|---|---|
| our | f1 | Used in our, our-NoM, our-AllC |
| Attentive | f2 | Used in Attentive |
| transfer-learning-model | f3 | Used in model level transfer learning |
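
The conversion commands above differ only in the dataset name and the format alias. As a convenience, the sketch below generates the command line for every dataset and alias combination so they can be reviewed or pasted into a shell. It only builds strings and does not run anything. It assumes that Wiki and OntoNotes accept the same positional arguments as the BBN examples, which this README does not state explicitly; `build_commands` is a hypothetical helper.

```python
DATASETS = ("BBN", "Wiki", "OntoNotes")  # dataset names listed in this README
FORMATS = ("f1", "f2", "f3")             # format aliases from the table above


def build_commands(root="~/EACL-2017"):
    """Return one json_to_tfrecord.py argument list per dataset and format."""
    commands = []
    for dataset in DATASETS:
        for fmt in FORMATS:
            commands.append([
                "python", "data_processing/json_to_tfrecord.py",
                dataset,
                root + "/fnet/data/sanitized/",
                root + "/glove.840B.300d/glove.840B.300d.txt",
                fmt,
                root + "/fnet/data/processed/",
            ])
    return commands


if __name__ == "__main__":
    # The leading "~" is left unquoted so the shell can expand it.
    for command in build_commands():
        print(" ".join(command))
```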

Transfer learning experiments

  1. Train our model on the Wiki dataset.
  2. Note down its uid.
  3. Modify the ../ckpt/uid/checkpoint file so that it points to the best-performing checkpoint.
  4. Change the fintune_directory parameter in the following scripts to include the uid noted in step 2.
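
Step 3 can also be done programmatically. TensorFlow's checkpoint state file is a small text file whose `model_checkpoint_path` line names the checkpoint that is restored by default. The sketch below rewrites that line and leaves the `all_model_checkpoint_paths` entries untouched. It is an illustration only: `point_checkpoint_at` is a hypothetical helper, and the checkpoint names in the usage example are made up.

```python
import re


def point_checkpoint_at(state_text, best_name):
    """Return `state_text` with model_checkpoint_path set to `best_name`."""
    new_line = 'model_checkpoint_path: "%s"' % best_name
    # MULTILINE anchors the match to a line start, so the
    # all_model_checkpoint_paths lines are not affected.
    pattern = re.compile(r"^model_checkpoint_path:.*$", flags=re.MULTILINE)
    if pattern.search(state_text):
        return pattern.sub(lambda match: new_line, state_text, count=1)
    return new_line + "\n" + state_text


if __name__ == "__main__":
    # Hypothetical file contents; real checkpoint names will differ.
    example = (
        'model_checkpoint_path: "model-30"\n'
        'all_model_checkpoint_paths: "model-10"\n'
        'all_model_checkpoint_paths: "model-30"\n'
    )
    print(point_checkpoint_at(example, "model-10"))
```

To apply it, read the state file from the uid directory, pass its contents through the function, and write the result back, keeping a backup of the original file.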

Model level transfer learning

bash scripts/transfer_learning_model.bash

Feature level transfer learning

bash scripts/transfer_learning_feature_dumping.bash
bash scripts/tl.bash

Report results

python report_results.py ~/EACL-2017/fnet/ckpt/

type-wise analysis

Change the dataset and the path of the result file to match the run you want to analyse type-wise.

python class_wise_analysis.py --all_labels_file=../data/sanitized/BBN/sanitized_labels.txt  --json_file=../data/sanitized/BBN/sanitized_test.json --result_file=../ckpt/Wiki_1.2/result_7.txt --dataset=Wiki
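
To illustrate what a type-wise breakdown measures, the sketch below computes precision, recall and F1 separately for each type from gold and predicted label sets. This is a generic illustration and not the implementation in class_wise_analysis.py; the input representation (one set of labels per mention) and the function name `per_type_prf` are assumptions.

```python
from collections import Counter


def per_type_prf(gold, predicted):
    """Compute (precision, recall, f1) per type.

    `gold` and `predicted` are parallel lists holding one set of labels
    per entity mention.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold_labels, pred_labels in zip(gold, predicted):
        for label in pred_labels & gold_labels:
            tp[label] += 1  # predicted and correct
        for label in pred_labels - gold_labels:
            fp[label] += 1  # predicted but not in gold
        for label in gold_labels - pred_labels:
            fn[label] += 1  # in gold but missed
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores
```

A breakdown of this kind shows which types account for most errors, which aggregate scores can hide.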
