SearchAnalyzer is not set during field mapping. #8499

@alkampfergit

Description

Elastic.Clients.Elasticsearch version: 8.17.4

Elasticsearch version: tried on both: 8.13.0 and 8.18.0

.NET runtime version: .NET 8

Operating system version: Windows 11

Description of the problem including expected versus actual behavior:

I'm moving from NEST for Elasticsearch 7 to the new driver. I'm mapping a field with this code:

        mapping.Properties["securityTokens"] = new TextProperty()
        {
            Analyzer = "not_analyzed_lowercase",
            SearchAnalyzer = "not_analyzed_lowercase",
        };

But the SearchAnalyzer setting seems to be missing: I have a unit test that reads the mapping back from the index to verify that everything is correct, and in that mapping SearchAnalyzer is null.

Expected behavior
SearchAnalyzer should be set correctly on index mapping. I've verified using the _mapping endpoint that the mapping is incorrect.

Activity

flobernd (Member) commented on Apr 17, 2025

Hi @alkampfergit,

this is a weird one. Could you please post the JSON request that is made by the client?

You can inspect the response in the debugger and check the ApiCallDetails for that purpose.

alkampfergit (Author) commented on Apr 18, 2025

The mapping is created with a call to this function. (This is a unit test whose aim is to check our compatibility with the driver. We currently use NEST for Elasticsearch 2 and NEST for Elasticsearch 7, and we are adding version 8. Yes, we have customers on all three versions, and we must still support everything down to Elasticsearch 2 :) )

 await _elasticClient.Indices.CreateAsync

This is the full dump of the call. As you can see securityTokens has both analyzer and search analyzer.

Valid Elasticsearch response built from a successful (200) low level call on PUT: /test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16?pretty=true

# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://localhost:9800/ Took: 00:00:00.3328589
# Request:
{
  "mappings": {
    "dynamic_templates": [
      {
        "StringProperties": {
          "match": "s_*",
          "mapping": {
            "analyzer": "omnisearch_string_props",
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              },
              "raw": {
                "type": "keyword"
              },
              "nan": {
                "normalizer": "lowercase",
                "type": "keyword"
              }
            },
            "type": "text"
          }
        }
      },
      {
        "NumericProperties": {
          "match": "n_*",
          "mapping": {
            "type": "double"
          }
        }
      },
      {
        "DateProperties": {
          "match": "d_*",
          "mapping": {
            "type": "date"
          }
        }
      },
      {
        "dense_vector_1536": {
          "match": "v1536_*",
          "mapping": {
            "dims": 1536,
            "element_type": "float",
            "index": true,
            "similarity": "dot_product",
            "type": "dense_vector"
          }
        }
      },
      {
        "dense_vector_3072": {
          "match": "v3072_*",
          "mapping": {
            "dims": 3072,
            "element_type": "float",
            "index": true,
            "similarity": "dot_product",
            "type": "dense_vector"
          }
        }
      }
    ],
    "properties": {
      "title": {
        "normalizer": "lowercase",
        "type": "keyword"
      },
      "type": {
        "type": "keyword"
      },
      "checkpointToken": {
        "type": "long"
      },
      "secondaryUpdateToken": {
        "type": "long"
      },
      "lastUpdated": {
        "type": "date"
      },
      "deleted": {
        "store": false,
        "type": "boolean"
      },
      "unsercured": {
        "type": "boolean"
      },
      "offline": {
        "type": "boolean"
      },
      "index": {
        "type": "keyword"
      },
      "ngrammed": {
        "analyzer": "trigram_standard",
        "type": "text"
      },
      "payload": {
        "index": false,
        "type": "keyword"
      },
      "securityTokens": {
        "analyzer": "not_analyzed_lowercase",
        "search_analyzer": "not_analyzed_lowercase",
        "type": "text"
      },
      "mainSearch": {
        "analyzer": "omnisearch_mainsearch",
        "fields": {
          "edge_n_gram": {
            "analyzer": "edge_ngram_standard_analyzer",
            "norms": false,
            "search_analyzer": "omnisearch_mainsearch",
            "type": "text"
          },
          "raw": {
            "normalizer": "lowercase",
            "type": "keyword"
          },
          "na": {
            "analyzer": "not_analyzed_lowercase",
            "type": "text"
          },
          "std": {
            "analyzer": "standard",
            "type": "text"
          }
        },
        "type": "text"
      },
      "mainsearch_it": {
        "analyzer": "italian",
        "type": "text"
      },
      "mainsearch_en": {
        "analyzer": "english",
        "type": "text"
      },
      "mainsearch_de": {
        "analyzer": "german",
        "type": "text"
      },
      "mainsearch_ru": {
        "analyzer": "russian",
        "type": "text"
      },
      "fulltext_it": {
        "analyzer": "italian",
        "type": "text"
      },
      "fulltext_en": {
        "analyzer": "english",
        "type": "text"
      },
      "fulltext_de": {
        "analyzer": "german",
        "type": "text"
      },
      "fulltext_ru": {
        "analyzer": "russian",
        "type": "text"
      },
      "nested": {
        "properties": {
          "name": {
            "analyzer": "not_analyzed_lowercase",
            "type": "text"
          },
          "depth": {
            "type": "integer"
          },
          "path": {
            "analyzer": "omnisearch_path_analyzer",
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              }
            },
            "type": "text"
          },
          "svalue": {
            "fields": {
              "na": {
                "analyzer": "not_analyzed_lowercase",
                "type": "text"
              }
            },
            "type": "keyword"
          },
          "nvalue": {
            "type": "double"
          },
          "dvalue": {
            "type": "date"
          }
        },
        "type": "nested"
      },
      "internalData": {
        "index": false,
        "store": true,
        "type": "text"
      },
      "relatedIds": {
        "type": "keyword"
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_analyzer": {
          "type": "standard"
        },
        "omnisearch_path_analyzer": {
          "filter": "lowercase_filter",
          "tokenizer": "jarvis_path_tokenizer",
          "type": "custom"
        },
        "not_analyzed_lowercase": {
          "filter": [
            "lowercase_filter",
            "asciifolding"
          ],
          "tokenizer": "keyword_tokenizer",
          "type": "custom"
        },
        "omnisearch_mainsearch": {
          "filter": [
            "lowercase",
            "asciifolding"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "omnisearch_string_props": {
          "filter": [
            "lowercase",
            "asciifolding"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "edge_ngram_standard_analyzer": {
          "filter": [
            "lowercase_filter",
            "asciifolding",
            "edge_ngram_filter_standard"
          ],
          "tokenizer": "standard",
          "type": "custom"
        },
        "trigram_standard": {
          "filter": "lowercase_filter",
          "tokenizer": "trigram_tokenizer",
          "type": "custom"
        },
        "omni_property_indexTime": {
          "filter": "lowercase",
          "tokenizer": "standard",
          "type": "custom"
        }
      },
      "filter": {
        "lowercase_filter": {
          "type": "lowercase"
        },
        "edge_ngram_filter_standard": {
          "max_gram": 15,
          "min_gram": 2,
          "type": "edge_ngram"
        },
        "trim_zero_chars": {
          "max": 100,
          "min": 1,
          "type": "length"
        }
      },
      "tokenizer": {
        "jarvis_path_tokenizer": {
          "delimiter": "/",
          "type": "path_hierarchy"
        },
        "keyword_tokenizer": {
          "type": "keyword"
        },
        "edge_ngram_tokenizer": {
          "max_gram": 10,
          "min_gram": 3,
          "type": "edge_ngram"
        },
        "trigram_tokenizer": {
          "max_gram": 3,
          "min_gram": 3,
          "type": "ngram"
        },
        "non_ascii_and_space_split_lowercase_tokenizer": {
          "flags": "CASE_INSENSITIVE|MULTILINE",
          "group": -1,
          "pattern": "(?\u003C=[^\\p{ASCII}]|\\s)",
          "type": "pattern"
        }
      }
    },
    "number_of_replicas": 1,
    "number_of_shards": 1
  }
}
# Response:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16"
}

Then, making the usual mapping request:

http://localhost:9800/test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16/_mapping

I got this response:

{
  "test0b54a5cb9c232e2b95b5bf48784efe4121d5e64d-catalog-indexer_16": {
    "mappings": {
      "dynamic_templates": [
        {
          "StringProperties": {
            "match": "s_*",
            "mapping": {
              "analyzer": "omnisearch_string_props",
              "fields": {
                "na": {
                  "analyzer": "not_analyzed_lowercase",
                  "type": "text"
                },
                "raw": {
                  "type": "keyword"
                },
                "nan": {
                  "normalizer": "lowercase",
                  "type": "keyword"
                }
              },
              "type": "text"
            }
          }
        },
        {
          "NumericProperties": {
            "match": "n_*",
            "mapping": {
              "type": "double"
            }
          }
        },
        {
          "DateProperties": {
            "match": "d_*",
            "mapping": {
              "type": "date"
            }
          }
        },
        {
          "dense_vector_1536": {
            "match": "v1536_*",
            "mapping": {
              "dims": 1536,
              "element_type": "float",
              "index": true,
              "similarity": "dot_product",
              "type": "dense_vector"
            }
          }
        },
        {
          "dense_vector_3072": {
            "match": "v3072_*",
            "mapping": {
              "dims": 3072,
              "element_type": "float",
              "index": true,
              "similarity": "dot_product",
              "type": "dense_vector"
            }
          }
        }
      ],
      "properties": {
        "checkpointToken": {
          "type": "long"
        },
        "deleted": {
          "type": "boolean"
        },
        "fulltext_de": {
          "type": "text",
          "analyzer": "german"
        },
        "fulltext_en": {
          "type": "text",
          "analyzer": "english"
        },
        "fulltext_it": {
          "type": "text",
          "analyzer": "italian"
        },
        "fulltext_ru": {
          "type": "text",
          "analyzer": "russian"
        },
        "index": {
          "type": "keyword"
        },
        "internalData": {
          "type": "text",
          "index": false,
          "store": true
        },
        "lastUpdated": {
          "type": "date"
        },
        "mainSearch": {
          "type": "text",
          "fields": {
            "edge_n_gram": {
              "type": "text",
              "norms": false,
              "analyzer": "edge_ngram_standard_analyzer",
              "search_analyzer": "omnisearch_mainsearch"
            },
            "na": {
              "type": "text",
              "analyzer": "not_analyzed_lowercase"
            },
            "raw": {
              "type": "keyword",
              "normalizer": "lowercase"
            },
            "std": {
              "type": "text",
              "analyzer": "standard"
            }
          },
          "analyzer": "omnisearch_mainsearch"
        },
        "mainsearch_de": {
          "type": "text",
          "analyzer": "german"
        },
        "mainsearch_en": {
          "type": "text",
          "analyzer": "english"
        },
        "mainsearch_it": {
          "type": "text",
          "analyzer": "italian"
        },
        "mainsearch_ru": {
          "type": "text",
          "analyzer": "russian"
        },
        "nested": {
          "type": "nested",
          "properties": {
            "depth": {
              "type": "integer"
            },
            "dvalue": {
              "type": "date"
            },
            "name": {
              "type": "text",
              "analyzer": "not_analyzed_lowercase"
            },
            "nvalue": {
              "type": "double"
            },
            "path": {
              "type": "text",
              "fields": {
                "na": {
                  "type": "text",
                  "analyzer": "not_analyzed_lowercase"
                }
              },
              "analyzer": "omnisearch_path_analyzer"
            },
            "svalue": {
              "type": "keyword",
              "fields": {
                "na": {
                  "type": "text",
                  "analyzer": "not_analyzed_lowercase"
                }
              }
            }
          }
        },
        "ngrammed": {
          "type": "text",
          "analyzer": "trigram_standard"
        },
        "offline": {
          "type": "boolean"
        },
        "payload": {
          "type": "keyword",
          "index": false
        },
        "relatedIds": {
          "type": "keyword"
        },
        "secondaryUpdateToken": {
          "type": "long"
        },
        "securityTokens": {
          "type": "text",
          "analyzer": "not_analyzed_lowercase"
        },
        "title": {
          "type": "keyword",
          "normalizer": "lowercase"
        },
        "type": {
          "type": "keyword"
        },
        "unsercured": {
          "type": "boolean"
        }
      }
    }
  }
}

If I have time, I'll try to reproduce this in a simple single-file project.

flobernd (Member) commented on Apr 19, 2025

Hi @alkampfergit, thanks for providing the JSON request/response payloads.

The request produced by Indices.CreateAsync correctly serializes the search_analyzer field, which means that this is not a client error.

Just to triple check, could you please execute the exact same request using curl or in the Kibana Dev Console? I strongly expect this to produce the same result.

Unfortunately, I don't know why the server does not seem to save the search_analyzer setting. To clarify this, you might want to contact support or ask in our discuss forums.
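For anyone who wants to run that cross-check from .NET rather than curl, here is a minimal sketch that sends a reduced version of the request with plain HttpClient, bypassing the Elasticsearch client entirely. The class name, the index name passed in, and the reduced body are illustrative assumptions, not taken from the original test. It also assumes a node that accepts unauthenticated HTTP, like the http://localhost:9800 node in the dump above.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: none of these names come from the issue.
public static class RawMappingRepro
{
    // Reduced body: one custom analyzer and one text field whose
    // search_analyzer is identical to its analyzer, as in the dump above.
    public const string Body = @"{
  ""settings"": { ""analysis"": { ""analyzer"": {
    ""not_analyzed_lowercase"": {
      ""type"": ""custom"",
      ""tokenizer"": ""keyword"",
      ""filter"": [""lowercase"", ""asciifolding""]
    }
  } } },
  ""mappings"": { ""properties"": {
    ""securityTokens"": {
      ""type"": ""text"",
      ""analyzer"": ""not_analyzed_lowercase"",
      ""search_analyzer"": ""not_analyzed_lowercase""
    }
  } }
}";

    // Builds the PUT /{indexName} request without sending it.
    public static HttpRequestMessage BuildCreateIndexRequest(Uri baseUri, string indexName) =>
        new HttpRequestMessage(HttpMethod.Put, new Uri(baseUri, indexName))
        {
            Content = new StringContent(Body, Encoding.UTF8, "application/json")
        };

    // Creates the index, then returns the raw JSON of GET /{indexName}/_mapping.
    public static async Task<string> CreateAndReadMappingAsync(
        HttpClient http, Uri baseUri, string indexName)
    {
        using var create = await http.SendAsync(BuildCreateIndexRequest(baseUri, indexName));
        create.EnsureSuccessStatusCode();
        return await http.GetStringAsync(new Uri(baseUri, indexName + "/_mapping"));
    }
}
```

Calling `CreateAndReadMappingAsync(new HttpClient(), new Uri("http://localhost:9800/"), "some-test-index")` and inspecting the returned JSON for `securityTokens` shows what the server stored, independently of any client serialization.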

alkampfergit (Author) commented on Apr 24, 2025

Hi @flobernd, sorry for the late response, but I was ill. I tried with Postman and got the very same result. I'll move to the forum.

I also solved the issue: it seems that Elasticsearch changed behaviour between the versions before 8 and version 8. If you examine the mapping, the test sets a search analyzer that is THE SAME as the analyzer. Since this is the default behaviour, it seems that Elasticsearch 8 does not store the value explicitly when the two are the same. Setting a different analyzer only for SearchAnalyzer works correctly.

I've changed the test to cover this situation, because the old test made little sense. It belongs to a suite of more than 1000 unit tests that exercise every query we send to Elasticsearch, as well as every kind of mapping we can create dynamically in code. That specific test checks the ability to set the search analyzer explicitly, but it used the same value as the analyzer.
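A test that reads the mapping back can be made robust to this by comparing the effective search analyzer instead of the raw search_analyzer key. The following is only a sketch of that idea using System.Text.Json. The class and method names are hypothetical, and it relies on the behaviour described above (an absent search_analyzer means the field's analyzer is used at search time).

```csharp
#nullable enable
using System;
using System.Text.Json.Nodes;

// Hypothetical helper: not part of the original test suite.
public static class MappingAssert
{
    // Returns the analyzer effectively used at search time for one field
    // mapping: the explicit "search_analyzer" if present, otherwise the
    // index-time "analyzer" (null if neither key is set).
    public static string? EffectiveSearchAnalyzer(JsonNode fieldMapping)
    {
        var explicitSearch = fieldMapping["search_analyzer"]?.GetValue<string>();
        return explicitSearch ?? fieldMapping["analyzer"]?.GetValue<string>();
    }

    // True when the requested and the stored field mapping resolve to the
    // same search-time analyzer, even if the server dropped the explicit key.
    public static bool SameSearchAnalyzer(JsonNode requested, JsonNode stored) =>
        EffectiveSearchAnalyzer(requested) == EffectiveSearchAnalyzer(stored);
}
```

With the `securityTokens` entries from the request and from the `_mapping` response above parsed via `JsonNode.Parse`, `SameSearchAnalyzer` returns true, whereas a direct comparison of the `search_analyzer` keys fails.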

I'm closing this bug because it is not a bug after all.

flobernd (Member) commented on Apr 24, 2025

Hi @alkampfergit, thanks for the update 🙂

          SearchAnalyzer is not set during field mapping. · Issue #8499 · elastic/elasticsearch-net