Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Search pipelines] Pass "adhocness" flag to processor factories #8164

Merged
merged 4 commits into from
Jul 6, 2023

Conversation

msfroh
Copy link
Collaborator

@msfroh msfroh commented Jun 20, 2023

Description

A named search pipeline may be created with a PUT request, while an "anonymous" or "ad hoc" search pipeline can be defined in the search request source. In the latter case, we don't want to create any "resource-heavy" processors, since they're potentially increasing the cost of every search request, whereas named pipeline processors get reused.

This change passes a configuration flag to a processor factory if it's being called as part of an ad hoc pipeline. The factory can use that information to avoid allocating expensive resources (maybe by throwing an exception instead).

Related Issues

Resolves #8163

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu

@codecov
Copy link

codecov bot commented Jun 20, 2023

Codecov Report

Merging #8164 (b391748) into main (c1c23b4) will increase coverage by 0.15%.
The diff coverage is 79.06%.

@@             Coverage Diff              @@
##               main    #8164      +/-   ##
============================================
+ Coverage     70.90%   71.05%   +0.15%     
- Complexity    56903    56994      +91     
============================================
  Files          4758     4758              
  Lines        269225   269236      +11     
  Branches      39407    39406       -1     
============================================
+ Hits         190881   191299     +418     
+ Misses        62256    61842     -414     
- Partials      16088    16095       +7     
Impacted Files Coverage Δ
.../pipeline/common/RenameFieldResponseProcessor.java 94.59% <ø> (ø)
...search/pipeline/common/ScriptRequestProcessor.java 30.00% <0.00%> (-5.30%) ⬇️
...eline/common/SearchPipelineCommonModulePlugin.java 0.00% <ø> (ø)
...pensearch/search/pipeline/PipelineWithMetrics.java 90.65% <88.88%> (-0.44%) ⬇️
...h/pipeline/common/FilterQueryRequestProcessor.java 96.00% <100.00%> (+10.70%) ⬆️
...a/org/opensearch/plugins/SearchPipelinePlugin.java 80.00% <100.00%> (+80.00%) ⬆️
...java/org/opensearch/search/pipeline/Processor.java 100.00% <100.00%> (ø)
...nsearch/search/pipeline/SearchPipelineService.java 84.54% <100.00%> (ø)

... and 463 files with indirect coverage changes

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchTaskCancellationWithHighCpu

@github-actions
Copy link
Contributor

github-actions bot commented Jul 2, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testNodeDropWithOngoingReplication

@mingshl mingshl added the v2.9.0 'Issues and PRs related to version v2.9.0' label Jul 3, 2023
@mingshl mingshl requested a review from dblock July 3, 2023 17:43
msfroh added 4 commits July 6, 2023 09:09
A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>
Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: Michael Froh <froh@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Jul 6, 2023

Gradle Check (Jenkins) Run Completed with:

@macohen
Copy link
Contributor

macohen commented Jul 6, 2023

@dblock is this change acceptable? looking to merge for the 2.9 release.

@dblock dblock merged commit 431b246 into opensearch-project:main Jul 6, 2023
@dblock dblock added the backport 2.x Backport to 2.x branch label Jul 6, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-8164-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 431b2464324a76e60fb446567504eae846f7b120
# Push it to GitHub
git push --set-upstream origin backport/backport-8164-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-8164-to-2.x.

@dblock
Copy link
Member

dblock commented Jul 6, 2023

Looks like it will need a manual backport, @msfroh

@mingshl
Copy link
Contributor

mingshl commented Jul 7, 2023

Tried to help backport this, but it looks like it's blocking by this PR. We need #7283 getting to 2.x first then backport this one.

@msfroh
Copy link
Collaborator Author

msfroh commented Jul 7, 2023

Tried to help backport this, but it looks like it's blocking by this #7283. We need #7283 getting to 2.x first then backport this one.

Looks like the backport (#8512) was just merged a few hours ago.

msfroh added a commit to msfroh/OpenSearch that referenced this pull request Jul 7, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 431b246)
@msfroh
Copy link
Collaborator Author

msfroh commented Jul 7, 2023

Backport PR: #8522

msfroh added a commit to msfroh/OpenSearch that referenced this pull request Jul 10, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 431b246)
tlfeng pushed a commit that referenced this pull request Jul 11, 2023
… (#8522)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Backport from commit 431b246

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 431b246)
vikasvb90 pushed a commit to raghuvanshraj/OpenSearch that referenced this pull request Jul 12, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
raghuvanshraj pushed a commit to raghuvanshraj/OpenSearch that referenced this pull request Jul 12, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
buddharajusahil pushed a commit to buddharajusahil/OpenSearch that referenced this pull request Jul 18, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: sahil buddharaju <sahilbud@amazon.com>
baba-devv pushed a commit to baba-devv/OpenSearch that referenced this pull request Jul 29, 2023
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…search-project#8164)

* [Search pipelines] Pass "adhocness" flag to processor factories

A named search pipeline may be created with a PUT request, while
an "anonymous" or "ad hoc" search pipeline can be defined in the
search request source. In the latter case, we don't want to create any
"resource-heavy" processors, since they're potentially increasing the
cost of every search request, whereas names pipeline processors get
reused.

This change passes a configuration flag to a processor factory if it's
being called as part of an ad hoc pipeline. The factory can use that
information to avoid allocating expensive resources (maybe by throwing
an exception instead).

Signed-off-by: Michael Froh <froh@amazon.com>

* Pass pipeline creation source as enum

Thanks to @dblock for the suggestion to pass the pipeline creation
source in a way that accounts for possible future pipeline sources (and
lets us distinguish between actual named pipeline creation and the
validation create() that executes before we write a pipeline definition
to cluster state).

Signed-off-by: Michael Froh <froh@amazon.com>

* Move PipelineSource into PipelineContext and explicitly pass to create

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix formatting on merge conflict resolution

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch Search Search query, autocomplete ...etc v2.9.0 'Issues and PRs related to version v2.9.0'
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Search Pipelines] Let a processor factory know if it's being called from an ad hoc / per-request pipeline
4 participants