In the name of Allah

POSTagger version 1.0

  10 January 2013

This is the README for the "POSTagger" package which can helps for predicting fine-grained and/or coarse-grained POS tags of dependency treebank. This package has been developed by [Mojtaba Khallash] (mailto: mkhallash@gmail.com) from Iran University of Science and Technology (IUST).

The home page for the project is: http://nlp.iust.ac.ir

If you want to use this software for research, please refer to this web address in your papers.

The package can be used freely for non-commercial research and educational purposes. It comes with no warranty, but we welcome all comments, bug reports, and suggestions for improvements.

3a. Convert Dependency To Tagger

Goal of this section is converting dependency treebank to tagger format for train (with gold tag) or test.

-i <input conll file>	input conll file which you want to convert to tagger format
-o <output tag path>	output file that can be used in tagger
-g <adding gold tag (0\|1) [default: 1]>	this parameter determines converted file used for train (1) or test(0)
-t <type of tag (fpod\|cpos) [default: fpos]>	determine type of tag that want extract from dependency treebank for train

For example:

java -jar POSTagger.jar -v 0 -mode D2T -i input.conll -o output.lbl -g 1 -t cpos

3b. Convert Tag To Dependency

After predicting tags, can used this section put resulting tags to correct place in dependency treebank.

-i <input tag file>	input tag file that contains predicted POS tags of each word
-c <input conll file>	input conll file which you wantr to add predicted POS tags in correct place
-o <output dependency path>	output dependency file after adding tags
-t <type of tag (fpod\|cpos) [default: fpos]>	type of predicted tag that need for determine place of adding in conll file

For example:

java -jar POSTagger.jar -v 0 -mode T2D -i input.lbl -c input.conll -o output.conll -t cpos

3c. Train Tagger

Using implementation of the part-of-speech tagger described in [1].

-i <input tag file for train>	input tag file which contains word_tag of each sentence per line
-m <output model path>	name of folder for saving training model
-iter <maximim number of iteration [default: 100]>	maximum number of iteration usef during training

For example:

java -jar POSTagger.jar -v 0 -mode TR -i input.lbl -m model -iter 100

Requirements:

"[mxpost.jar] (http://www.inf.ed.ac.uk/resources/nlp/local_doc/MXPOST.html)" for tagging.

3d. Tagging

These parameters exsit for Tagging:

-i <input tag file for tagging>	input tag file contain only words of each sentence per line
-m <trained model path>	name of folder than saved pre-trained model
-o <output tag path>	output tag file for adding predicted pos
-g <gold tag file [optional]>	if you want evaluate predicted POS tag, can set tag file equal to input tag but with gold POS tag

For example:

java -jar POSTagger.jar -v 0 -mode TG -i input.lbl -m model -o output.lbl -g gold.lbl

Requirements:

"[mxpost.jar] (http://www.inf.ed.ac.uk/resources/nlp/local_doc/MXPOST.html)" for tagging.

References

[1] A. Ratnaparkhi, "A maximum entropy model for part-of-speech tagging", in Proceedings of the Empirical Methods in Natural Language Processing Conference, University of Pennsylvania, pp. 133-142, 1996.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

POSTagger version 1.0

Table of contents

3a. Convert Dependency To Tagger

3b. Convert Tag To Dependency

3c. Train Tagger

3d. Tagging

References

Files

README.md

Latest commit

History

README.md

File metadata and controls

POSTagger version 1.0

Table of contents

3a. Convert Dependency To Tagger

3b. Convert Tag To Dependency

3c. Train Tagger

3d. Tagging

References