Skip to content

Conversation

ioanatia
Copy link
Contributor

@ioanatia ioanatia commented Sep 16, 2025

follows the same approach from #133720 where we add support for querying multiple indices for the simplified linear retriever.
#133720 already does most of it, here we just need to remove the validation check that disallowed querying multiple indices and add tests.
the tests are very similar to #133720, I just needed to adjust them slightly since specifying boosts for fields is disallowed for the simplified RRF retriever.

@ioanatia ioanatia added >enhancement :SearchOrg/Relevance Label for the Search (solution/org) Relevance team Team:Search - Relevance The Search organization Search Relevance team v9.2.0 labels Sep 16, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @ioanatia, I've created a changelog YAML for you.

@ioanatia ioanatia changed the title Query multiple indices with simplified RRF Support querying multiple indices with the simplified linear retriever Sep 16, 2025
@ioanatia ioanatia changed the title Support querying multiple indices with the simplified linear retriever Support querying multiple indices with the simplifiedrrf retriever Sep 16, 2025
@ioanatia ioanatia changed the title Support querying multiple indices with the simplifiedrrf retriever Support querying multiple indices with the simplified RRF retriever Sep 16, 2025
@ioanatia ioanatia marked this pull request as ready for review September 16, 2025 18:53
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

Copy link
Contributor

@mridula-s109 mridula-s109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, nice work! 👏

One question on tests.


- match: { hits.total.value: 6 }
- length: { hits.hits: 6 }
- match: { hits.hits.0._id: "4" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we tested this with different shard counts, to make sure it won't be flakey in serverless where the default shard count is different or with different randomized shard configurations?

Copy link
Contributor Author

@ioanatia ioanatia Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not - but this is just following the same pattern for linear, for which I have not seen any failures.
we are also not matching on the score values, which always has the potential for flakiness.

Copy link
Contributor

@Mikep86 Mikep86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I left one non-blocking comment about a small test change.

Comment on lines +322 to +350
// Non-default rank window size
retriever = new RRFRetrieverBuilder(
null,
List.of("field_1", "field_2", "semantic_field_1", "semantic_field_2"),
"foo2",
DEFAULT_RANK_WINDOW_SIZE * 2,
RRFRetrieverBuilder.DEFAULT_RANK_CONSTANT,
new float[0]
);
assertMultiIndexMultiFieldsParamsRewrite(
retriever,
queryRewriteContext,
Map.of(
Map.of("field_1", 1.0f, "field_2", 1.0f),
List.of(indexName),
Map.of("field_1", 1.0f, "field_2", 1.0f, "semantic_field_1", 1.0f),
List.of(anotherIndexName)
),
Map.of(
new Tuple<>("semantic_field_1", List.of(indexName)),
1.0f,
new Tuple<>("semantic_field_2", List.of(indexName)),
1.0f,
new Tuple<>("semantic_field_2", List.of(anotherIndexName)),
1.0f
),
"foo2",
null
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use this to test non-default rank constant as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

promise to do it in a follow up - I don't want to keep @mridula-s109 waiting too much on this PR to continue the work on #132680 - this is the second PR that will introduce conflicts for her PR.

@ioanatia ioanatia merged commit bf15d15 into elastic:main Sep 18, 2025
34 checks passed
@ioanatia ioanatia deleted the rrf_linear_retriever_multi_index branch September 18, 2025 08:02
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 18, 2025
szybia added a commit to szybia/elasticsearch that referenced this pull request Sep 18, 2025
* upstream/main: (43 commits)
  Unmute testAckedIndexing to see if it still fails on main (elastic#134682)
  Silence time zone ID deprecation warning for JDK 25 due to log4j2 bug. (elastic#134719)
  Adding a getUnmodifiableSourceAndMetadata() method to IngestDocument (elastic#134816)
  Mark the create-index-from-source action as publicly available on Serverless (elastic#134953)
  ESQL: Rename command from INLINESTATS to INLINE STATS (elastic#134827)
  Document multi index query support for simplified retrievers (elastic#134980)
  [ML] Fix YAMl test to use correct query parameter type (elastic#134999)
  [Transform] Wait for PIT to close (elastic#134955)
  Add XPath to XmlUtils (elastic#134923)
  Fixing conditional processor mutability bugs (elastic#134936)
  [Transform] Lower loglevel of 3 transform-related error messages from ERROR to WARN (elastic#134985)
  Unmute pattern text tests. (elastic#134981)
  Integrate weights into simplified RRF retriever syntax (elastic#132680)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:stats.CountDistinctWithConditions} elastic#134993
  Update periodic java-ea build to test java 26 pre-release (elastic#134983)
  Mute org.elasticsearch.xpack.esql.ccq.MultiClusterSpecIT test {csv-spec:stats.CountDistinctWithConditions} elastic#134984
  Fix and unmute testIndexSettingProviderPrivateSetting (elastic#134861)
  Add missing common cat params (elastic#134870)
  Support querying multiple indices with the simplified RRF retriever (elastic#134822)
  Allow including semantic field embeddings in _source (elastic#134717)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :SearchOrg/Relevance Label for the Search (solution/org) Relevance team Team:Search - Relevance The Search organization Search Relevance team v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants