Skip to content

tuem/resembla

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Resembla: Word-based Japanese similar sentence search library

Features

  • Candidate elimination using N-gram index and bit-parallel edit distance computation
  • Word, kana and romaji-based edit distance variables and their ensemble
  • Support vector regression with linguistic features

Requirements

Other included libraries

Running example

  • install MeCab, LIBSVM, ICU and a C++11 compiler

    • if you use CentOS 7, see Wiki
  • clone, build and install Resembla

cd /var/tmp
git clone https://github.com/tuem/resembla.git
cd resembla/src
make
sudo make install
cd executable
make
sudo make install
#optional
cd /var/tmp/resembla/misc/mecab_dic/unidic/
./install-unidic.sh
cd /var/tmp/resembla/misc/mecab_dic/mecab-unidic-neologd/
./install-mecab-unidic-neologd.sh
  • run with example files
# on src/executable
./resembla_index -c ../../example/conf/name.json
./resembla_cli -c ../../example/conf/name.json
# input some names like 'タケダ'
./resembla_index -c ../../example/conf/address.json
./resembla_cli -c ../../example/conf/address.json
# input some addresses like '京都北区'
# you may need to run install-unidic.sh or edit configuration file
./resembla_index -c ../../example/conf/apple.json
./resembla_cli -c ../../example/conf/apple.json
# input some sentences like 'りんごおいしくねえ'

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published