UETnlp

UETnlp is an open-source toolkit written in Java and developed with Esclipse IDE. This toolkit supports 4 steps for Vietnamese language processing in a pepline: tokenization, sentence segmentation, word segmentation, part-of-speech tagging. The word segmenter is integrated from UETsegmenter.

How to use

The following command is used to run this toolkit, your PC needs JDK 1.8 or newer:

java -jar uetnlp.jar [options...] [arguments..]
	-res FILE : the resources folder containing model for WS and configure file for POS tagging (require)
	-in FILE  : the input file/directory (require)
	-ext VAL  : specify file types to process (in the case -in is a directory) (default: .txt)
	-tok      : specify if doing tokenization is set or not (default: false)
	-sent     : specify if doing sentence segmentation is set or not (default: false)
	-ws       : specify if doing word segmentation is set or not (default: false)
	-pos      : specify if doing pos tagging or not is set or not (default: false)

Citation

If you use the word segmenter for academic work, please cite the following paper:

@INPROCEEDINGS{UETSegmenter, 
	author={Nguyen, Tuan-Phong and Le, Anh-Cuong}, 
	booktitle={2016 IEEE RIVF International Conference on Computing Communication Technologies, Research, Innovation, and Vision for the Future (RIVF)}, 
	title={A hybrid approach to Vietnamese word segmentation}, 
	year={2016}, 
	pages={114-119},
	doi={10.1109/RIVF.2016.7800279}, 
	month={Nov},
}

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
dictionary		dictionary
libs		libs
resources		resources
samples		samples
src/vn/edu/vnu/uet		src/vn/edu/vnu/uet
.gitignore		.gitignore
README.md		README.md
uetnlp.jar		uetnlp.jar

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

UETnlp

How to use

Citation

About

Releases

Packages

Languages

phongnt570/UETnlp

Folders and files

Latest commit

History

Repository files navigation

UETnlp

How to use

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages