Add mapping parameters documentation #7115

Merged on Oct 14, 2024
70 commits
43c661a
Add mapping parameters documentation
vagimeli May 8, 2024
f6d9428
Add mapping parameters documentation
vagimeli May 8, 2024
5d49c33
Add mapping parameters documentation
vagimeli May 10, 2024
abdbd51
Add mapping parameters documentation
vagimeli May 13, 2024
3ca39b5
Add mapping parameters documentation
vagimeli May 13, 2024
c562982
Add files
vagimeli May 17, 2024
506bba5
Add files
vagimeli May 17, 2024
295f134
Add files
vagimeli May 17, 2024
59ff4e1
Write first-pass draft
vagimeli May 17, 2024
67cfe85
Update _field-types/mapping-parameters/analyzer.md
vagimeli Sep 30, 2024
69ed961
Merge branch 'main' into mapping-parameters
vagimeli Sep 30, 2024
6a71f49
Update _field-types/mapping-parameters/copy-to.md
vagimeli Sep 30, 2024
20265ba
Update _field-types/mapping-parameters/coerce.md
vagimeli Sep 30, 2024
aa316d0
Address final tech review comments
vagimeli Oct 8, 2024
8ee94de
Update _field-types/mapping-parameters/index.md
vagimeli Oct 10, 2024
771bfb0
Update _field-types/mapping-parameters/index.md
vagimeli Oct 10, 2024
5e5abcd
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 10, 2024
3043bf1
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
f3c1128
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
205de69
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 10, 2024
db765a2
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 10, 2024
9c4d834
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 10, 2024
b15e813
Update analyzer.md
vagimeli Oct 10, 2024
3f5e16b
Update boost.md
vagimeli Oct 10, 2024
4489740
Update coerce.md
vagimeli Oct 10, 2024
83fa99b
Update copy-to.md
vagimeli Oct 10, 2024
636ea75
Merge branch 'main' into mapping-parameters
vagimeli Oct 11, 2024
b00b542
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
2144849
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f1e3330
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
8f27836
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f232f81
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
2b34a12
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
f87aceb
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
f6ef5aa
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
d0b82b9
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
6a1abf9
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
021d7d5
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
65d04b6
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
206408d
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
f978231
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
e9d2d32
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
6a1dc3a
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
5a677ec
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
6ea6e8c
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
5b369a5
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
a26ed71
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
4d3e32d
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
a2e4af6
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0e4a2d1
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
ccf3081
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
adcee30
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0db800b
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
d226acb
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
c411314
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
df8513f
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
9198ea5
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
0959cae
Update _field-types/mapping-parameters/index.md
vagimeli Oct 14, 2024
f3f1d19
Merge branch 'main' into mapping-parameters
vagimeli Oct 14, 2024
dfbd331
Update front matter
vagimeli Oct 14, 2024
d89fda1
Update _field-types/mapping-parameters/analyzer.md
vagimeli Oct 14, 2024
b81be6b
Update _field-types/mapping-parameters/boost.md
vagimeli Oct 14, 2024
25184f0
Update _field-types/mapping-parameters/coerce.md
vagimeli Oct 14, 2024
6e74fb2
Update _field-types/mapping-parameters/copy-to.md
vagimeli Oct 14, 2024
8e129e4
Delete empty files
vagimeli Oct 14, 2024
6d624bc
Merge branch 'mapping-parameters' of https://github.com/opensearch-pr…
vagimeli Oct 14, 2024
17b8012
Delete empty files
vagimeli Oct 14, 2024
43e0712
Delete empty files
vagimeli Oct 14, 2024
a7437d1
Delete empty files
vagimeli Oct 14, 2024
45b1a88
Delete empty files
vagimeli Oct 14, 2024
89 changes: 89 additions & 0 deletions _field-types/mapping-parameters/analyzer.md
@@ -0,0 +1,89 @@
---
layout: default
title: Analyzer
parent: Mapping parameters
grand_parent: Mapping and field types
nav_order: 5
has_children: false
has_toc: false
---

# Analyzer

The `analyzer` mapping parameter defines the text analysis process applied to a text field during both index and search operations.

The key functions of the `analyzer` mapping parameter are:

1. **Tokenization:** The analyzer determines how the text is broken down into individual tokens (words, numbers) that can be indexed and searched.

2. **Normalization:** The analyzer can apply various normalization techniques, such as converting text to lowercase, removing stopwords, and stemming/lemmatizing words.

3. **Consistency:** By defining the same analyzer for both indexing and searching, you ensure that the text analysis process is consistent, which helps improve the relevance of search results.
Collaborator: "indexing and searching" => "index and search operations"?

4. **Customization:** OpenSearch allows you to define custom analyzers by specifying the tokenizer, character filters, and token filters to use. This gives you fine-grained control over the text analysis process.
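
As a minimal sketch of the tokenization and normalization steps described above, the following request runs the built-in `standard` analyzer on a sample string using the `_analyze` API. The sample text is hypothetical. The response should list the lowercased tokens `the`, `quick`, `brown`, and `fox`:

```json
POST _analyze
{
  "analyzer": "standard",
  "text": "The Quick Brown Fox"
}
```
{% include copy-curl.html %}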

------------

## Example

The following sample configuration defines a custom analyzer called `my_custom_analyzer`:

```json
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stop_filter",
            "my_stemmer"
          ]
        }
      },
      "filter": {
        "my_stop_filter": {
          "type": "stop",
          "stopwords": ["the", "a", "and", "or"]
        },
        "my_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text_field": {
        "type": "text",
        "analyzer": "my_custom_analyzer",
        "search_analyzer": "standard",
        "search_quote_analyzer": "my_custom_analyzer"
      }
    }
  }
}
```
{% include copy-curl.html %}

In this example, the `my_custom_analyzer` uses the standard tokenizer, converts all tokens to lowercase, applies a custom stopword filter, and then applies an English stemmer.
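
To check how the custom analyzer behaves, you can pass sample text to the `_analyze` API on the index. The following is a sketch that assumes the `my_index` index from the preceding request has been created; the sample text is hypothetical. The response should contain lowercased, stemmed tokens with the configured stopwords (`the`, `a`, `and`, `or`) removed:

```json
POST my_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "The Quick Foxes and the Lazy Dogs"
}
```
{% include copy-curl.html %}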

You can then map a text field to use this custom analyzer for both indexing and searching:
Collaborator: "indexing and searching" => "index and search operations"? "to use" => "so that it uses"?

```json
"mappings": {
  "properties": {
    "my_text_field": {
      "type": "text",
      "analyzer": "my_custom_analyzer"
    }
  }
}
```
{% include copy-curl.html %}
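
As an illustration of search-time analysis, the following sketch runs a `match` query against the field. It assumes that an index named `my_index` with the `my_text_field` mapping exists and contains documents; the query text is hypothetical. By default, the query string is analyzed with the field's `search_analyzer` if one is mapped and otherwise with its `analyzer`:

```json
GET my_index/_search
{
  "query": {
    "match": {
      "my_text_field": "quick foxes"
    }
  }
}
```
{% include copy-curl.html %}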

By configuring the `analyzer` mapping parameter, you can ensure that your text fields are analyzed consistently and in a way that optimizes the relevance of your search results.
28 changes: 28 additions & 0 deletions _field-types/mapping-parameters/index.md
@@ -0,0 +1,28 @@
---
layout: default
title: Mapping parameters
nav_order: 75
has_children: true
has_toc: false
---

# Mapping parameters

Mapping parameters configure the behavior of fields in an index. For use cases, see each mapping parameter's respective page.

The following table lists OpenSearch mapping parameters.

Parameter | Description
:--- | :---
Collaborator: Line 18: I'm not following "Specifies a field-level query time to boost." Can we rephrase for clarity? Line 21: It looks like we provide two sets of allowed values.

Contributor Author (vagimeli, Oct 14, 2024): Revised as follows: Specifies a field-level boost factor applied at query time. Allows you to increase or decrease the relevance score of a specific field during search queries.

`analyzer` | Specifies the analyzer used to analyze string fields. Default is the `standard` analyzer, a general-purpose analyzer that splits text on word boundaries, removes most punctuation, and converts tokens to lowercase. It does not remove stop words unless configured to do so. Allowed values include built-in analyzers, such as `standard`, `simple`, and `whitespace`, and custom analyzers defined in the index settings.
`boost` | Specifies a field-level boost factor applied at query time. Allows you to increase or decrease the relevance score of a specific field during search queries. Default boost value is `1.0`, which means no boost is applied. Allowed values are any positive floating-point number.
`coerce` | Tries to convert the value to the specified data type. Default value is `true`, which means OpenSearch tries to coerce the value to the expected value type. Allowed values are `true` or `false`.
`copy_to` | Copies the value of this field to another field. This parameter is optional and has no default value. Allowed values are a single field name or a list of field names.
`doc_values` | Specifies whether the field should be stored on disk to make sorting and aggregation faster. Default value is `true`, which means doc values are enabled. Allowed values are `true` or `false`.
`dynamic` | Determines whether new fields should be added dynamically. Default value is `true`, which means new fields can be added dynamically. Allowed values are `true`, `false`, or `strict`.
`enabled` | Specifies whether the field is enabled or disabled. Default value is `true`, which means the field is enabled. Allowed values are `true` or `false`.
`format` | Specifies the date format for date fields. There is no default value for this parameter. Allowed values are any valid date format string, such as `yyyy-MM-dd` or `epoch_millis`.
`ignore_above` | Skips indexing values that are longer than the specified length. Default value is `2147483647`, which means there is no limit on the length of the field value. Allowed values are any positive integer.
`ignore_malformed` | Specifies whether malformed values should be ignored. Default value is `false`, which means malformed values are not ignored. Allowed values are `true` or `false`.
`index` | Specifies whether the field should be indexed. Default value is `true`, which means the field is indexed. Allowed values are `true` or `false`.
`index_options` | Specifies what information should be stored in the index for scoring purposes. Default value is `docs`, which means only the document numbers are stored in the index. Allowed values are `docs`, `freqs`, `positions`, or `offsets`.
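
As a sketch of how several of these parameters might be combined, the following request creates an index whose mapping sets `dynamic`, `analyzer`, `copy_to`, `coerce`, `format`, `ignore_malformed`, `ignore_above`, `doc_values`, and `index`. The index name `sample-index` and all field names are hypothetical and chosen only for illustration:

```json
PUT sample-index
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard",
        "copy_to": "all_text"
      },
      "all_text": {
        "type": "text"
      },
      "quantity": {
        "type": "integer",
        "coerce": false
      },
      "created": {
        "type": "date",
        "format": "yyyy-MM-dd",
        "ignore_malformed": true
      },
      "tag": {
        "type": "keyword",
        "ignore_above": 256,
        "doc_values": true
      },
      "notes": {
        "type": "text",
        "index": false
      }
    }
  }
}
```
{% include copy-curl.html %}

With this mapping, a document containing a field that is not listed under `properties` is rejected because `dynamic` is set to `strict`, and a `tag` value longer than 256 characters is kept in the document source but not indexed.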