Add mapping parameters documentation #7115

Merged
merged 70 commits into from
Oct 14, 2024
Changes from 14 commits
43c661a
Add mapping parameters documentation
vagimeli May 8, 2024
f6d9428
Add mapping parameters documentation
vagimeli May 8, 2024
5d49c33
Add mapping parameters documentation
vagimeli May 10, 2024
abdbd51
Add mapping parameters documentation
vagimeli May 13, 2024
3ca39b5
Add mapping parameters documentation
vagimeli May 13, 2024
c562982
Add files
vagimeli May 17, 2024
506bba5
Add files
vagimeli May 17, 2024
295f134
Add files
vagimeli May 17, 2024
59ff4e1
Write first-pass draft
vagimeli May 17, 2024
67cfe85
Update _field-types/mapping-parameters/analyzer.md
vagimeli Sep 30, 2024
69ed961
Merge branch 'main' into mapping-parameters
vagimeli Sep 30, 2024
6a71f49
Update _field-types/mapping-parameters/copy-to.md
vagimeli Sep 30, 2024
20265ba
Update _field-types/mapping-parameters/coerce.md
vagimeli Sep 30, 2024
aa316d0
Address final tech review comments
vagimeli Oct 8, 2024
8ee94de
Update _field-types/mapping-parameters/index.md
vagimeli Oct 10, 2024
771bfb0
Update _field-types/mapping-parameters/index.md
vagimeli Oct 10, 2024
5e5abcd
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 10, 2024
3043bf1
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
f3c1128
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
205de69
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
db765a2
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 10, 2024
9c4d834
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 10, 2024
b15e813
Update analyzer.md
vagimeli Oct 10, 2024
3f5e16b
Update boost.md
vagimeli Oct 10, 2024
4489740
Update coerce.md
vagimeli Oct 10, 2024
83fa99b
Update copy-to.md
vagimeli Oct 10, 2024
636ea75
Merge branch 'main' into mapping-parameters
vagimeli Oct 11, 2024
b00b542
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
2144849
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f1e3330
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
8f27836
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f232f81
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
2b34a12
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f87aceb
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
f6ef5aa
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
d0b82b9
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
6a1abf9
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
021d7d5
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
65d04b6
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
206408d
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
f978231
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
e9d2d32
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
6a1dc3a
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
5a677ec
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
6ea6e8c
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
5b369a5
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
a26ed71
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
4d3e32d
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
a2e4af6
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0e4a2d1
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
ccf3081
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
adcee30
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0db800b
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
d226acb
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
c411314
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
df8513f
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
9198ea5
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0959cae
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
f3f1d19
Merge branch 'main' into mapping-parameters
vagimeli Oct 14, 2024
dfbd331
Update front matter
vagimeli Oct 14, 2024
d89fda1
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
b81be6b
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
25184f0
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
6e74fb2
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
8e129e4
Delete empty files
vagimeli Oct 14, 2024
6d624bc
Merge branch 'mapping-parameters' of https://github.com/opensearch-pr…
vagimeli Oct 14, 2024
17b8012
Delete empty files
vagimeli Oct 14, 2024
43e0712
Delete empty files
vagimeli Oct 14, 2024
a7437d1
Delete empty files
vagimeli Oct 14, 2024
45b1a88
Delete empty files
vagimeli Oct 14, 2024
92 changes: 92 additions & 0 deletions _field-types/mapping-parameters/analyzer.md
@@ -0,0 +1,92 @@
---
layout: default
title: analyzer
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 5
has_children: false
has_toc: false
---

# `analyzer`

The `analyzer` mapping parameter defines the text analysis process applied to a text field during both index and search operations.

The key functions of the `analyzer` mapping parameter are:

1. **Tokenization:** The analyzer determines how the text is broken down into individual tokens (words, numbers) that can be indexed and searched. Each generated token must not exceed 32,766 bytes to avoid indexing failures.

2. **Normalization:** The analyzer can apply various normalization techniques, such as converting text to lowercase, removing stopwords, and stemming/lemmatizing words.

3. **Consistency:** By defining the same analyzer for both indexing and searching, you ensure that the text analysis process is consistent, which helps improve the relevance of search results.

Collaborator:

"indexing and searching" => "index and search operations"?


4. **Customization:** OpenSearch allows you to define custom analyzers by specifying the tokenizer, character filters, and token filters to use. This gives you fine-grained control over the text analysis process.

For information about specific analyzer parameters, such as `analyzer`, `search_analyzer`, and `search_quote_analyzer`, see [Search analyzers]({{site.url}}{{site.baseurl}}/analyzers/search-analyzers/).
{: .note}

------------

## Example

The following example configuration defines a custom analyzer called `my_custom_analyzer` and maps a text field to it:

```json
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stop_filter",
            "my_stemmer"
          ]
        }
      },
      "filter": {
        "my_stop_filter": {
          "type": "stop",
          "stopwords": ["the", "a", "and", "or"]
        },
        "my_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text_field": {
        "type": "text",
        "analyzer": "my_custom_analyzer",
        "search_analyzer": "standard",
        "search_quote_analyzer": "my_custom_analyzer"
      }
    }
  }
}
```
{% include copy-curl.html %}

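In this example, the `my_custom_analyzer` uses the standard tokenizer, converts all tokens to lowercase, applies a custom stopword filter, and then applies an English stemmer. The `my_text_field` mapping applies this analyzer at index time (`analyzer`) and to quoted phrases in queries (`search_quote_analyzer`), while other query text is analyzed with the `standard` analyzer (`search_analyzer`).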
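
One way to check what the analyzer described above produces is the `_analyze` API. The following request is a minimal sketch that assumes the `my_index` index from the preceding example has been created; the sample text is an arbitrary illustration:

```json
GET my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "The quick foxes and the lazy dogs"
}
```
{% include copy-curl.html %}

The response lists the tokens emitted after lowercasing, stopword removal, and stemming, so the stopwords `the` and `and` should not appear in the output.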

You can then map a text field to use this custom analyzer for both indexing and searching:

Collaborator:

"indexing and searching" => "index and search operations"? "to use" => "so that it uses"?


```json
"mappings": {
  "properties": {
    "my_text_field": {
      "type": "text",
      "analyzer": "my_custom_analyzer"
    }
  }
}
```
{% include copy-curl.html %}

By configuring the `analyzer` mapping parameter, you can ensure that your text fields are analyzed consistently and in a way that optimizes the relevance of your search results.
52 changes: 52 additions & 0 deletions _field-types/mapping-parameters/boost.md
@@ -0,0 +1,52 @@
---
layout: default
title: boost
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 10
has_children: false
has_toc: false
---

# `boost`

The `boost` mapping parameter is used to increase or decrease the relevance score of a field during search queries. It allows you to give more or less weight to specific fields when calculating the overall relevance score for a document.

The `boost` parameter is applied as a multiplier to the score of a field. For example, if a field has a `boost` value of `2`, then the score contribution of that field is doubled. Conversely, a `boost` value of `0.5` would halve the score contribution of that field.

-----------

## Example

The following is an example of how you can use the `boost` parameter in an OpenSearch mapping:

```json
PUT my-index1
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "boost": 2
      },
      "description": {
        "type": "text",
        "boost": 1
      },
      "tags": {
        "type": "keyword",
        "boost": 1.5
      }
    }
  }
}
```
{% include copy-curl.html %}

In this example, the `title` field has a `boost` of `2`, which means it contributes twice as much to the overall relevance score as the `description` field (which has a `boost` of `1`). The `tags` field has a `boost` of `1.5`, so it contributes 1.5 times as much as the `description` field.

Collaborator:

Above: Should the numbers be in code font?

The `boost` parameter is particularly useful when you want to give more weight to certain fields that are more important for your use case. For example, you might want to boost the `title` field more than the `description` field, as the title is often a better indicator of the document's relevance.

Note that `boost` is a multiplicative factor, not an additive one: a field's score contribution is scaled by its boost value rather than increased by a fixed amount, so a field with a high boost value can dominate the overall relevance score.

When using the `boost` parameter, it is recommended to start with small values (1.5 or 2) and test the impact on your search results. Overly high boost values can skew the relevance scores and lead to unexpected or undesirable search results.
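
To test the impact of boost values as suggested above, one option is to run a query across the boosted fields with `explain` enabled and inspect how each field contributes to the score. The following is a minimal sketch, assuming the `my-index1` index from the preceding example contains documents; the query text is hypothetical:

```json
GET my-index1/_search
{
  "explain": true,
  "query": {
    "multi_match": {
      "query": "wireless headphones",
      "fields": ["title", "description"]
    }
  }
}
```
{% include copy-curl.html %}

Each hit in the response includes an `_explanation` object that breaks down the score calculation, which can be compared before and after changing a boost value.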
98 changes: 98 additions & 0 deletions _field-types/mapping-parameters/coerce.md
@@ -0,0 +1,98 @@
---
layout: default
title: coerce
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 15
has_children: false
has_toc: false
---

# `coerce`

The `coerce` mapping parameter controls how values are converted to the expected data type of a field during indexing. By using this parameter, you can ensure your data is properly formatted and indexed according to the expected field types, helping to maintain data integrity and improve the accuracy of your search results.

## Examples

The following examples show how to use the `coerce` mapping parameter.

#### Indexing a document with `coerce` enabled

```json
PUT products
{
  "mappings": {
    "properties": {
      "price": {
        "type": "integer",
        "coerce": true
      }
    }
  }
}

PUT products/_doc/1
{
  "name": "Product A",
  "price": "19.99"
}
```
{% include copy-curl.html %}

In this example, the `price` field is defined as an `integer` type with `coerce` set to `true`. When indexing the document, the string value `"19.99"` is coerced to the integer `19` by truncating the fractional part.
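
Coercion changes the value that is indexed, not the original `_source`. As a hedged illustration, assuming the `products` index and document from the preceding example, a `term` query for the integer `19` would be expected to match the document:

```json
GET products/_search
{
  "query": {
    "term": {
      "price": 19
    }
  }
}
```
{% include copy-curl.html %}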

#### Indexing a document with `coerce` disabled

```json
PUT orders
{
  "mappings": {
    "properties": {
      "quantity": {
        "type": "integer",
        "coerce": false
      }
    }
  }
}

PUT orders/_doc/1
{
  "item": "Widget",
  "quantity": "10"
}
```
{% include copy-curl.html %}

In this example, the `quantity` field is defined as an `integer` type with `coerce` set to `false`. When indexing the document, the string value `"10"` is not coerced, and the document is rejected because of the type mismatch.
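
For comparison, the following request sends `quantity` as a JSON number rather than a string, so no coercion is needed and the document is accepted. This is a sketch that assumes the `orders` index from the preceding example; the document ID `2` is hypothetical:

```json
PUT orders/_doc/2
{
  "item": "Widget",
  "quantity": 10
}
```
{% include copy-curl.html %}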

#### Setting the index-level coercion setting

```json
PUT inventory
{
  "settings": {
    "index.mapping.coerce": false
  },
  "mappings": {
    "properties": {
      "stock_count": {
        "type": "integer",
        "coerce": true
      },
      "sku": {
        "type": "keyword"
      }
    }
  }
}

PUT inventory/_doc/1
{
  "sku": "ABC123",
  "stock_count": "50"
}
```
{% include copy-curl.html %}

In this example, the index-level `index.mapping.coerce` setting is set to `false`, which disables coercion for the index. However, the `stock_count` field overrides this setting and enables coercion for that specific field.
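
Fields that do not set `coerce` themselves inherit the index-level setting. The following sketch assumes the `inventory` index from the preceding example; `reorder_level` is a hypothetical field added only for illustration. Because that field inherits `index.mapping.coerce: false`, the second request, which sends the value as a string, would be expected to fail:

```json
PUT inventory/_mapping
{
  "properties": {
    "reorder_level": {
      "type": "integer"
    }
  }
}

PUT inventory/_doc/2
{
  "sku": "XYZ789",
  "reorder_level": "5"
}
```
{% include copy-curl.html %}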
104 changes: 104 additions & 0 deletions _field-types/mapping-parameters/copy-to.md
@@ -0,0 +1,104 @@
---
layout: default
title: copy_to
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 20
has_children: false
has_toc: false
---

# `copy_to`

The `copy_to` parameter allows you to copy the values of multiple fields into a single field. This can be useful if you often search across multiple fields, as it allows you to search the group field instead.

The field value is copied, not the terms resulting from the analysis process. The original `_source` field remains unmodified, and the same value can be copied to multiple fields using the `copy_to` parameter. However, recursive copying through intermediary fields is not supported; instead, use `copy_to` directly from the originating field to multiple target fields.

For example, if you want to search for products by their name and description, you can use the `copy_to` parameter to copy those values into a single field, as follows:

```json
PUT my-products-index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "copy_to": "product_info"
      },
      "description": {
        "type": "text",
        "copy_to": "product_info"
      },
      "product_info": {
        "type": "text"
      },
      "price": {
        "type": "float"
      }
    }
  }
}

PUT my-products-index/_doc/1
{
  "name": "Wireless Headphones",
  "description": "High-quality wireless headphones with noise cancellation",
  "price": 99.99
}

PUT my-products-index/_doc/2
{
  "name": "Bluetooth Speaker",
  "description": "Portable Bluetooth speaker with long battery life",
  "price": 49.99
}
```
{% include copy-curl.html %}

In this example, the values from the `name` and `description` fields are copied into the `product_info` field. You can now search for products by querying the `product_info` field, as follows:

```json
GET my-products-index/_search
{
  "query": {
    "match": {
      "product_info": "wireless headphones"
    }
  }
}
```
{% include copy-curl.html %}

#### Response

```json
{
  "took": 20,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.9061546,
    "hits": [
      {
        "_index": "my-products-index",
        "_id": "1",
        "_score": 1.9061546,
        "_source": {
          "name": "Wireless Headphones",
          "description": "High-quality wireless headphones with noise cancellation",
          "price": 99.99
        }
      }
    ]
  }
}
```
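
As noted earlier, the same value can be copied to more than one field by listing several targets directly on the originating field. The following is a minimal sketch of that pattern; the index name `my-products-index-2` and the `all_text` field are hypothetical and used only for illustration:

```json
PUT my-products-index-2
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "copy_to": ["product_info", "all_text"]
      },
      "product_info": {
        "type": "text"
      },
      "all_text": {
        "type": "text"
      }
    }
  }
}
```
{% include copy-curl.html %}

Listing both targets on `name` avoids chaining `copy_to` through an intermediary field, which is not supported.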
9 changes: 9 additions & 0 deletions _field-types/mapping-parameters/doc-values.md
@@ -0,0 +1,9 @@
---
layout: default
title: doc_values
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 25
has_children: false
has_toc: false
---
9 changes: 9 additions & 0 deletions _field-types/mapping-parameters/dynamic.md
@@ -0,0 +1,9 @@
---
layout: default
title: dynamic
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 30
has_children: false
has_toc: false
---
9 changes: 9 additions & 0 deletions _field-types/mapping-parameters/eager-global-ordinals.md
@@ -0,0 +1,9 @@
---
layout: default
title: eager_global_ordinals
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 35
has_children: false
has_toc: false
---
9 changes: 9 additions & 0 deletions _field-types/mapping-parameters/enabled.md
@@ -0,0 +1,9 @@
---
layout: default
title: enabled
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 40
has_children: false
has_toc: false
---