-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ingest pipeline supports modifying the op_type parameter of an indexing request #15031
base: main
Are you sure you want to change the base?
Conversation
…ng request Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
❌ Gradle check result for 4025a02: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 8f9f91a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
❕ Gradle check result for dde3be1: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #15031 +/- ##
============================================
- Coverage 71.84% 71.81% -0.04%
+ Complexity 62911 62897 -14
============================================
Files 5176 5176
Lines 295133 295149 +16
Branches 42676 42680 +4
============================================
- Hits 212029 211951 -78
- Misses 65709 65754 +45
- Partials 17395 17444 +49 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall with minor suggestions
...les/ingest-common/src/yamlRestTest/resources/rest-api-spec/test/ingest/270_set_processor.yml
Show resolved
Hide resolved
@andrross another look? |
@gaobinlong can you resolve the conflicts? |
Signed-off-by: Gao Binlong <gbinlong@amazon.com>
❌ Gradle check result for a8bd353: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❕ Gradle check result for a8bd353: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
@gaobinlong I'm not convinced this is a good idea. |
Thanks @andrross, this change doesn't target for general cases, but for the case that users want to write to a data stream but don't want to do any code change or configuration change in the client, for example, if Logstash is used to write to a data stream in OpenSearch, the setting
, but if the ingestion tool doesn't support the |
Are there users in this situation asking for this feature? The reason I'm hesitant is that while it does solve the above use case, it seems like it could let users really shoot themselves in the foot, either subtly (the different semantics of |
This PR is stalled because it has been open for 30 days with no activity. |
This PR is stalled because it has been open for 30 days with no activity. |
In this case, given the other fields that can be overridden (e.g I'm trying to think of alternative solutions. The obvious ones would be "Update the client to set the action" or "Set up a proxy that sets the action between the client and cluster". Of course, ingest pipelines are like a proxy that happens to run on the cluster. I'm wondering if it would make sense to add a dedicated processor for this task, rather than doing it through the |
@msfroh I'm in favor of "Update the client to set the action" because A third option would be to allow data streams to support the |
I think the real problem here may be that
It sounds like what we really need to do is leave it @andrross, @gaobinlong, what do you think of that option? |
Thanks @msfroh, your solution already exists, for index API, we don't need to specify
, for bulk API,
@msfroh @andrross , I've checked Logstash, data-prepper, fluent-bit, java-client, all of them can support specifying the |
Description
This PR adds a new metadata field
_op_type
to theIngestDocument
, which makes the ingest processor can modify theop_type
parameter of an indexing request, this is useful when users change the indexing requests' target from a ordinary index to a data stream, but because data stream only supports settingop_type
tocreate
, so the following bulk request will fail with exceptiononly write ops with an op_type of create are allowed in data streams
:, users have to change the
op_type
tocreate
in the request body:, and index API also has this issue.
So this PR gives users an option that they can setup an ingest pipeline with modifying the
op_type
parameter tocreate
to avoid changing the client code, the usage is:Related Issues
Resolves #2856.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.