Skip to content

tlefsad/elasticsearch-analysis-thaichub2

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Thai Analysis Plugin for Elasticsearch

The Thaichub2 (thai-chub-chub) Analysis Plugin integrates the Thai word segmentation modules into Elasticsearch.

Installation on Elasticsearch

  • Download a release zip from the release page matching your ES version (Now supports only version 7.6.2).

  • Install with this command

./bin/elasticsearch-plugin install --batch file:///<path to zip>
  • Restart Elasticsearch

Sample Usage

Sample request

POST _analyze
{
    "analyzer": "thaichub2_analyzer",
    "text": "นมตรามะลิ"
}

Result

{
  "tokens" : [
    {
      "token" : "นม",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "ตรา",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "มะลิ",
      "start_offset" : 5,
      "end_offset" : 9,
      "type" : "word",
      "position" : 2
    }
  ]
}

Thanks