altkatz/jieba_rb
JiebaRb

Ruby extension for Cppjieba, a C++ library for Chinese word segmentation.

Installation

Add this line to your application's Gemfile:

gem 'jieba_rb'

And then execute:

$ bundle

Or install it yourself as:

$ gem install jieba_rb

Word segmentation Usage

Mix Segment mode (HMM combined with max probability; this is the default):

require 'jieba_rb'
seg = JiebaRb::Segment.new  # equivalent to "JiebaRb::Segment.new mode: :mix"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云 计算 行业 的 专家

Mix Segment mode with user-defined dictionary:

seg = JiebaRb::Segment.new mode: :mix, user_dict: "ext/cppjieba/dict/user.dict.utf8"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云计算 行业 的 专家
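To see what a user-defined dictionary changed, the two results above can be compared directly. The following is a minimal sketch that does not call the gem; it assumes `seg.cut` returns an Array of Strings and uses the token lists from the comments above as literal data. The helper name `dictionary_only_tokens` is hypothetical and not part of jieba_rb.

```ruby
# Token lists copied from the comments above.
# Assumption: seg.cut returns an Array of Strings like these.
default_words   = %w[令狐冲 是 云 计算 行业 的 专家]
user_dict_words = %w[令狐冲 是 云计算 行业 的 专家]

# Hypothetical helper: tokens that appear only when the user dictionary is loaded.
def dictionary_only_tokens(baseline, with_dict)
  with_dict - baseline
end

merged = dictionary_only_tokens(default_words, user_dict_words)
puts merged.inspect
```

Here the only new token is 云计算, which the default dictionary split into 云 and 计算.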

HMM or Max probability (mp) Segment mode:

seg = JiebaRb::Segment.new mode: :hmm # or mode: :mp
words = seg.cut "令狐冲是云计算行业的专家"

Word tagging Usage

Default tagging:

require 'jieba_rb'
tagging = JiebaRb::Tagging.new
pairs = tagging.tag "我是蓝翔技工拖拉机学院手扶拖拉机专业的。"
# [{"我"=>"r"}, {"是"=>"v"}, {"蓝翔"=>"x"}, {"技工"=>"n"}, {"拖拉机"=>"n"}, {"学院"=>"n"}, {"手扶拖拉机"=>"n"}, {"专业"=>"n"}, {"的"=>"uj"}, {"。"=>"x"}]

Tagging with user-defined dictionary:

require 'jieba_rb'
tagging = JiebaRb::Tagging.new user_dict: :default
pairs = tagging.tag "我是蓝翔技工拖拉机学院手扶拖拉机专业的。"
# [{"我"=>"r"}, {"是"=>"v"}, {"蓝翔"=>"nz"}, {"技工"=>"n"}, {"拖拉机"=>"n"}, {"学院"=>"n"}, {"手扶拖拉机"=>"n"}, {"专业"=>"n"}, {"的"=>"uj"}, {"。"=>"x"}]
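The tagging result shown above is an array of single-pair hashes, which is awkward to filter directly. The sketch below flattens it into `[word, tag]` pairs and selects words by tag. It works on the sample output copied from the default tagging example rather than calling the gem, and the helper name `words_with_tag` is hypothetical.

```ruby
# Sample output copied from the default tagging example above.
pairs = [
  { "我" => "r" }, { "是" => "v" }, { "蓝翔" => "x" }, { "技工" => "n" },
  { "拖拉机" => "n" }, { "学院" => "n" }, { "手扶拖拉机" => "n" },
  { "专业" => "n" }, { "的" => "uj" }, { "。" => "x" }
]

# Hypothetical helper: flatten the single-pair hashes into [word, tag]
# pairs, then keep the words whose tag matches.
def words_with_tag(pairs, tag)
  pairs.flat_map(&:to_a).select { |_word, t| t == tag }.map(&:first)
end

nouns = words_with_tag(pairs, "n")
puts nouns.join(" ")
```

With the user dictionary loaded, 蓝翔 is tagged `nz` instead of `x`, so the same helper called with `"nz"` would pick it up from that result.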

Keyword Extractor Usage

  • Only TF-IDF is supported currently:
    keyword = JiebaRb::Keyword.new
    keywords_weights = keyword.extract "我是拖拉机学院手扶拖拉机专业的。不用多久,我就会升职加薪,当上CEO,走上人生巅峰。", 5

    # [
    #   ["CEO", 11.739204307083542],
    #   ["升职", 10.8561552143],
    #   ["加薪", 10.642581114],
    #   ["手扶拖拉机", 10.0088573539],
    #   ["巅峰", 9.49395840471]
    # ]
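The extractor result above is an array of `[keyword, weight]` pairs, so it converts easily into a hash or a printable report. This is a minimal sketch that uses the sample output as literal data instead of calling the gem; the variable names `weights`, `top_keyword`, and `report` are illustrative only.

```ruby
# Sample output copied from the keyword extraction example above.
keywords_weights = [
  ["CEO", 11.739204307083542],
  ["升职", 10.8561552143],
  ["加薪", 10.642581114],
  ["手扶拖拉机", 10.0088573539],
  ["巅峰", 9.49395840471]
]

# Look up a weight by keyword.
weights = keywords_weights.to_h

# Keyword with the highest TF-IDF weight.
top_keyword = keywords_weights.max_by { |_word, weight| weight }.first

# One line per keyword, weight rounded to two decimals.
report = keywords_weights.map { |word, weight| format("%s %.2f", word, weight) }
puts report
```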

Contributing

  1. Fork it ( http://github.com//jieba_rb/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request
