Commit

Add note about Data Prepper versus ingest processors (#6886)
* Add note about Data Prepper versus ingest

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

* Add note about Data Prepper versus ingest

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>

---------

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>
vagimeli authored Apr 11, 2024
1 parent 75136a5 commit 4957271
Showing 11 changed files with 35 additions and 4 deletions.
6 changes: 5 additions & 1 deletion _ingest-pipelines/processors/append.md
@@ -6,10 +6,14 @@ nav_order: 10
redirect_from:
- /api-reference/ingest-apis/processors/append/
---


This documentation describes using the `append` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `add_entries` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/add-entries/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Append processor

The `append` processor is used to add values to a field:

- If the field is an array, the `append` processor appends the specified values to that array.
- If the field is a scalar field, the `append` processor converts it to an array and appends the specified values to that array.
- If the field does not exist, the `append` processor creates an array with the specified values.
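For context, the `append` processor note above refers to pipelines like the following minimal sketch (the pipeline name, field, and value are illustrative, not part of this commit):

```json
PUT _ingest/pipeline/append-tags-pipeline
{
  "description": "Appends a static value to the tags field",
  "processors": [
    {
      "append": {
        "field": "tags",
        "value": ["production"]
      }
    }
  ]
}
```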
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/convert.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/convert/
---

This documentation describes using the `convert` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `convert_entry_type` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/convert_entry_type/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Convert processor

The `convert` processor converts a field in a document to a different type, for example, a string to an integer or an integer to a string. For an array field, all values in the array are converted.
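A minimal `convert` processor sketch, assuming a hypothetical string field `response_code` that should be indexed as an integer:

```json
PUT _ingest/pipeline/convert-pipeline
{
  "description": "Converts response_code from a string to an integer",
  "processors": [
    {
      "convert": {
        "field": "response_code",
        "type": "integer"
      }
    }
  ]
}
```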
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/copy.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/copy/
---

This documentation describes using the `copy` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `copy_values` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/copy-values/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Copy processor

The `copy` processor copies an entire object in an existing field to another field.
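A minimal `copy` processor sketch, assuming the `source_field` and `target_field` parameter names; the field names themselves are hypothetical:

```json
PUT _ingest/pipeline/copy-pipeline
{
  "description": "Copies the message object to message_backup",
  "processors": [
    {
      "copy": {
        "source_field": "message",
        "target_field": "message_backup"
      }
    }
  ]
}
```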
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/csv.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/csv/
---

This documentation describes using the `csv` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `csv` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/csv/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# CSV processor

The `csv` processor is used to parse CSVs and store them as individual fields in a document. The processor ignores empty fields.
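A minimal `csv` processor sketch that splits a hypothetical comma-separated `resource_usage` field into three named fields:

```json
PUT _ingest/pipeline/csv-pipeline
{
  "description": "Splits resource_usage into CPU, memory, and disk fields",
  "processors": [
    {
      "csv": {
        "field": "resource_usage",
        "target_fields": ["cpu_usage", "memory_usage", "disk_usage"]
      }
    }
  ]
}
```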
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/date.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/date/
---

This documentation describes using the `date` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `date` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/date/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Date processor

The `date` processor is used to parse dates from document fields and to add the parsed data to a new field. By default, the parsed data is stored in the `@timestamp` field.
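A minimal `date` processor sketch that parses a hypothetical `event_date` field; the parsed value lands in the default `@timestamp` field:

```json
PUT _ingest/pipeline/date-pipeline
{
  "description": "Parses event_date into @timestamp",
  "processors": [
    {
      "date": {
        "field": "event_date",
        "formats": ["yyyy-MM-dd HH:mm:ss"]
      }
    }
  ]
}
```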
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/dissect.md
@@ -5,6 +5,9 @@ parent: Ingest processors
nav_order: 60
---

This documentation describes using the `dissect` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `dissect` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/dissect/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Dissect

The `dissect` processor extracts values from a document text field and maps them to individual fields based on dissect patterns. The processor is well suited for field extractions from log messages with a known structure. Unlike the `grok` processor, `dissect` does not use regular expressions and has a simpler syntax.
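A minimal `dissect` processor sketch with a hypothetical three-part log pattern:

```json
PUT _ingest/pipeline/dissect-pipeline
{
  "description": "Extracts client IP, timestamp, and status from a log line",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{client_ip} %{event_timestamp} %{http_status}"
      }
    }
  ]
}
```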
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/drop.md
@@ -5,6 +5,9 @@ parent: Ingest processors
nav_order: 70
---

This documentation describes using the `drop` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `drop_events` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/drop-events/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Drop processor

The `drop` processor is used to discard documents without indexing them. This can be useful for preventing documents from being indexed based on certain conditions. For example, you might use a `drop` processor to prevent documents that are missing important fields or contain sensitive information from being indexed.
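A minimal `drop` processor sketch that discards documents missing a hypothetical `user_id` field, using a Painless `if` condition:

```json
PUT _ingest/pipeline/drop-pipeline
{
  "description": "Drops documents that do not contain a user_id field",
  "processors": [
    {
      "drop": {
        "if": "ctx.user_id == null"
      }
    }
  ]
}
```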
6 changes: 3 additions & 3 deletions _ingest-pipelines/processors/grok.md
@@ -6,13 +6,13 @@ grand_parent: Ingest pipelines
nav_order: 140
---

+This documentation describes using the `grok` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
+{: .note}

# Grok processor

The `grok` processor is used to parse and structure unstructured data using pattern matching. You can use the `grok` processor to extract fields from log messages, web server access logs, application logs, and other log data that follows a consistent format.

-This documentation describes using the `grok` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
-{: .note}

## Grok basics

The `grok` processor uses a set of predefined patterns to match parts of the input text. Each pattern consists of a name and a regular expression. For example, the pattern `%{IP:ip_address}` matches an IP address and assigns it to the field `ip_address`. You can combine multiple patterns to create more complex expressions. For example, the pattern `%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}` matches a line from a web server access log and extracts the client IP address, the HTTP method, the request URI, the number of bytes sent, and the duration of the request.
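A minimal `grok` processor sketch using an access-log-style pattern like the one discussed above (field names are illustrative):

```json
PUT _ingest/pipeline/grok-pipeline
{
  "description": "Parses a web server access log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes}"]
      }
    }
  ]
}
```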
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/kv.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/lowercase/
---

This documentation describes using the `kv` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `key_value` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/key-value/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# KV processor

The `kv` processor automatically extracts specific event fields or messages that are in a `key=value` format. This structured format organizes your data by grouping it together based on keys and values. It's helpful for analyzing, visualizing, and using data, such as user behavior analytics, performance optimizations, or security investigations.
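A minimal `kv` processor sketch for space-separated `key=value` pairs in a hypothetical `message` field:

```json
PUT _ingest/pipeline/kv-pipeline
{
  "description": "Extracts key=value pairs from the message field",
  "processors": [
    {
      "kv": {
        "field": "message",
        "field_split": " ",
        "value_split": "="
      }
    }
  ]
}
```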
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/lowercase.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/lowercase/
---

This documentation describes using the `lowercase` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `lowercase_string` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/lowercase-string/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Lowercase processor

The `lowercase` processor converts all the text in a specific field to lowercase letters.
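A minimal `lowercase` processor sketch (the `username` field is illustrative):

```json
PUT _ingest/pipeline/lowercase-pipeline
{
  "description": "Lowercases the username field",
  "processors": [
    {
      "lowercase": {
        "field": "username"
      }
    }
  ]
}
```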
3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/uppercase.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/uppercase/
---

This documentation describes using the `uppercase` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `uppercase_string` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/uppercase-string/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Uppercase processor

The `uppercase` processor converts all the text in a specific field to uppercase letters.
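A minimal `uppercase` processor sketch (the `level` field is illustrative):

```json
PUT _ingest/pipeline/uppercase-pipeline
{
  "description": "Uppercases the level field",
  "processors": [
    {
      "uppercase": {
        "field": "level"
      }
    }
  ]
}
```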
