Semantic Role Labeling with Multi-mask Tensorized Self-Attention (MTSA)

Requirements

  1. Python 2
  2. TensorFlow >= 1.3

Getting Started

  1. Applying MTSA to SRL tasks builds on the SRL Tagger released by @XMUNLP. Please clone that code first and follow the README.md in that repo to learn how the code works.

  2. Copy deepatt.py from this repo into the Tagger/models folder, replacing the original file.

  3. Add a new variable initializer, Glorot, after line 306 of Tagger/main.py using the following lines. We use Glorot because we found empirically that the orthogonal initializer leads to a NaN loss with tensorflow>=1.5.

     elif params.initializer == "glorot":
         return tf.glorot_normal_initializer()
    
  4. Do not forget to select 'glorot' as the initializer by passing

     --training_params=...,initializer=glorot,...
    

in place of the original initializer=orthogonal in the training command.
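If the training command is built by a script, the initializer swap can be automated. The following is a minimal sketch that assumes the `--training_params` value is a comma-separated list of `key=value` pairs, as the fragment above suggests. The helper name `override_training_param` and the parameter names other than `initializer` are hypothetical and only serve as placeholders.

```python
def override_training_param(params, key, value):
    """Return the comma-separated `params` string with `key` set to `value`.

    Hypothetical helper: it assumes every entry has the form key=value.
    If `key` is missing, the new pair is appended at the end.
    """
    items = []
    found = False
    for item in params.split(","):
        name, sep, _ = item.partition("=")
        if sep and name == key:
            # Replace the existing value, keeping the entry's position.
            items.append(key + "=" + value)
            found = True
        elif item:
            items.append(item)
    if not found:
        items.append(key + "=" + value)
    return ",".join(items)


# Placeholder parameter string; only `initializer` comes from this README.
original = "hidden_size=200,initializer=orthogonal,dropout=0.1"
print("--training_params=" + override_training_param(original, "initializer", "glorot"))
# prints --training_params=hidden_size=200,initializer=glorot,dropout=0.1
```

Editing the command by hand works just as well; the helper only guards against leaving the old `initializer=orthogonal` entry in place.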

Tips to Run the SRL-Tagger

  1. The validation script (i.e., run.sh) is not provided in the repo but is required to run the code. You can save the code given in this Section to a new run.sh file. Also, do not forget to check the arguments passed to run.sh in Tagger/utils/validation.py.
  2. Read this after the code runs successfully. There is a bug in Tagger, and here is a workaround. The validation program reads the latest checkpoint multiple times, because tf.contrib.learn.Estimator.predict reloads the checkpoint each time it is invoked, which happens once per validation batch. However, the validation bash script runs in a sub-process, so the main training process may write a newer checkpoint in the meantime. The latest checkpoint can then change partway through a validation run, producing wrong validation results. The workaround is to copy the latest ckpt to a temporary path before each periodic validation starts (do not forget to create a new TensorFlow checkpoint file in that path), and then pass that path to run.sh.
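The checkpoint copy described in tip 2 can be sketched as below. This is not code from Tagger. It assumes the usual TensorFlow 1.x layout, in which the model directory holds a text file named `checkpoint` with a `model_checkpoint_path: "..."` entry, plus files named `<prefix>.index`, `<prefix>.meta` and `<prefix>.data-*`. The function names `latest_checkpoint_prefix` and `snapshot_latest_checkpoint` are hypothetical.

```python
import os
import re
import shutil
import tempfile


def latest_checkpoint_prefix(model_dir):
    """Return the file-name prefix of the latest checkpoint in `model_dir`.

    Assumes the TF 1.x `checkpoint` state file format (a text proto).
    """
    state_file = os.path.join(model_dir, "checkpoint")
    with open(state_file) as f:
        match = re.search(r'^model_checkpoint_path:\s*"([^"]+)"', f.read(), re.M)
    if match is None:
        raise ValueError("no model_checkpoint_path entry in %s" % state_file)
    # The entry may be a bare name or a full path; keep only the file name.
    return os.path.basename(match.group(1))


def snapshot_latest_checkpoint(model_dir, snapshot_dir=None):
    """Copy the latest checkpoint to `snapshot_dir` and return that directory."""
    prefix = latest_checkpoint_prefix(model_dir)
    if snapshot_dir is None:
        snapshot_dir = tempfile.mkdtemp(prefix="validation_ckpt_")
    elif not os.path.isdir(snapshot_dir):
        os.makedirs(snapshot_dir)
    # Copy every file that belongs to this checkpoint (.index, .meta, .data-*).
    # The trailing dot keeps e.g. model.ckpt-1000 from matching model.ckpt-10000.
    for name in os.listdir(model_dir):
        if name.startswith(prefix + "."):
            shutil.copy2(os.path.join(model_dir, name), snapshot_dir)
    # Write a fresh state file so the snapshot always points at this checkpoint.
    with open(os.path.join(snapshot_dir, "checkpoint"), "w") as f:
        f.write('model_checkpoint_path: "%s"\n' % prefix)
        f.write('all_model_checkpoint_paths: "%s"\n' % prefix)
    return snapshot_dir
```

Calling `snapshot_latest_checkpoint(model_dir)` once before validation and passing the returned directory to run.sh keeps every validation batch on the same weights. The copy itself can still race with the training process if old checkpoints are deleted while it runs, so it is safest to take the snapshot right after a checkpoint has been saved.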

Contact Info

Please feel free to open an issue if you encounter any bug or confusion when running the code.

Acknowledgements

Thanks to Zhixing Tan for the SRL Tagger Framework and his neat code style!