
Problem using own analyzer configuration #30

Open
petersiman opened this issue Jul 14, 2015 · 5 comments

@petersiman

Hi, I am trying to set up Liferay with Elasticsearch and use hunspell as the analyzer for the Czech language. I have set up the index with the following analyzer definition:

PUT /liferay_0
{
   "settings": {
      "analysis": {
         "analyzer": {
            "cestina_hunspell": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "stopwords_CZ",
                  "cs_CZ",
                  "icu_folding",
                  "stopwords_CZ",
                  "remove_duplicities"
               ]
            }
         },
         "filter": {
            "stopwords_CZ": {
               "type": "stop",
               "stopwords": [
                  "právě",
                  "že",
                  "_czech_"
               ],
               "ignore_case": true
            },
            "cs_CZ": {
               "type": "hunspell",
               "locale": "cs_CZ",
               "dedup": true,
               "recursion_level": 0
            },
            "remove_duplicities": {
               "type": "unique",
               "only_on_same_position": true
             }
           }
      }
   }
}

It seems to work on Czech text when I call the analyzer through the REST API:

curl 'localhost:9200/i/_analyze?analyzer=cestina_hunspell&pretty=true' -d 'Právě se mi zdálo, že se kolem okna něco mihlo.'

I get tokens:

  • zdát
  • kolem, kolo
  • okno
  • něco
  • mihnout

which are the wanted tokens.

But when I try to search web content containing such text (indexed after the new settings), I don't get the right results (I have to provide the exact word to get a result).

Any ideas what could cause this behaviour?

Thanks.

@ajay-kottapally
Contributor

Hi,

Elasticray creates its indices in Liferay with the name liferay_{companyId}, so you must apply the settings not just to liferay_0 but to every liferay_{companyId}. Alternatively, use Elasticsearch index templates and apply a template to liferay_*.
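For reference, such a template could be registered like this (a sketch against the legacy Elasticsearch 1.x `_template` API that was current at the time; the template name `liferay_custom_analysis` and the `order` value are assumptions, and the analysis settings simply mirror the ones from the issue above):

```shell
# Register an index template that applies the custom analysis settings
# to every index whose name matches liferay_*. The template name
# "liferay_custom_analysis" is an example, not an Elasticray convention.
curl -XPUT 'localhost:9200/_template/liferay_custom_analysis' -d '{
  "template": "liferay_*",
  "order": 1,
  "settings": {
    "analysis": {
      "analyzer": {
        "cestina_hunspell": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["stopwords_CZ", "cs_CZ", "icu_folding",
                     "stopwords_CZ", "remove_duplicities"]
        }
      },
      "filter": {
        "stopwords_CZ": {
          "type": "stop",
          "stopwords": ["právě", "že", "_czech_"],
          "ignore_case": true
        },
        "cs_CZ": {
          "type": "hunspell",
          "locale": "cs_CZ",
          "dedup": true,
          "recursion_level": 0
        },
        "remove_duplicities": {
          "type": "unique",
          "only_on_same_position": true
        }
      }
    }
  }
}'
```

Note that a template only affects indices created after it is registered; existing documents are not re-analyzed, so a reindex is still needed.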

@ajay-kottapally
Contributor

Sorry, closed by mistake.

@petersiman
Author

Hi,
thanks for the quick reply. I have applied the settings to the indices of every company in Liferay (I chose the liferay_0 index only for illustration). However, I think the problem might be in the dynamic mapping by language defined in this file: https://github.com/R-Knowsys/elasticray/blob/master/webs/elasticray-web/docroot/WEB-INF/classes/com/rknowsys/portal/search/elastic/template.json. Is there any other way (some configuration) to bypass the

{
    "cs": {
        "match": "*_cs*",
        "match_mapping_type": "string",
        "mapping": {
            "type": "string",
            "analyzer": "czech"
        }
    }
}

mapping? Or do I have to modify this file and re-deploy the package?

@ajay-kottapally
Contributor

I am afraid we don't have a configuration for this. You can change the file, redeploy the package, and reindex, or you can:

  1. Delete the templates named liferay_template*.
  2. Then apply your own template and reindex.
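The two steps above might look like this (a sketch; the dynamic-template body mirrors the `cs` entry from template.json with the analyzer swapped to `cestina_hunspell`, and the exact names matched by `liferay_template*` as well as the replacement name `liferay_template_custom` are assumptions):

```shell
# 1. Delete the templates installed by Elasticray (assumed here to
#    match liferay_template*; the 1.x API accepts wildcard deletes).
curl -XDELETE 'localhost:9200/_template/liferay_template*'

# 2. Register a replacement template whose dynamic mapping routes
#    *_cs* string fields to the custom hunspell analyzer, then reindex.
curl -XPUT 'localhost:9200/_template/liferay_template_custom' -d '{
  "template": "liferay_*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "cs": {
            "match": "*_cs*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "analyzer": "cestina_hunspell"
            }
          }
        }
      ]
    }
  }
}'
```

The analyzer named in the mapping must also exist in the index settings (for example via the analysis block shown earlier in this thread), or index creation will fail.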

@petersiman
Author

Thank you for the reply. After defining my own analyzer in the dynamic template, the search worked well.
