Skip to content

Add batch_search to sync Index #305

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 29, 2025
Merged

Add batch_search to sync Index #305

merged 8 commits into from
Mar 29, 2025

Conversation

abrookins
Copy link
Collaborator

@abrookins abrookins commented Mar 28, 2025

Add a batch_search method to SearchIndex and AsyncSearchIndex. These methods run search commands in a pipeline and return correctly-parsed and post-processed results.

TODO:

  • Async version
  • batch_query

@tylerhutcherson tylerhutcherson added the enhancement New feature or request label Mar 28, 2025
Copy link
Collaborator

@tylerhutcherson tylerhutcherson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. Once you get the sync version of batch_query in, let's merge. One question about batch size defaults and pipeline, but otherwise amazing!

query, query_params=query_params
)
batch_built_queries.append(q)
pipe.execute_command(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh what, for real? I thought we just do pipe.ft(<index-name>).search(...) I swear I have code somewhere that does this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhhhh maybe I did it wrong 😂

Copy link
Collaborator Author

@abrookins abrookins Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well look at that, pipe.ft().search() works just fine. This is a much better outcome.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke too soon. search appears to be broken for pipelines. Oh well!


async_index = <redisvl.index.index.AsyncSearchIndex object at 0x10ba26c50>

    @pytest.mark.asyncio
    async def test_batch_search_with_multiple_batches(async_index):
        await async_index.create(overwrite=True, drop=True)
        data = [{"id": "1", "test": "foo"}, {"id": "2", "test": "bar"}]
        await async_index.load(data, id_field="id")
    
>       results = await async_index.batch_search(
            [FilterQuery("@test:{foo}"), FilterQuery("@test:{bar}")],
            batch_size=2,

redisvl/index/index.py:1311: in batch_search
    await pipe.ft(self.schema.index.name).search(  # type: ignore
env/lib/python3.11/site-packages/redis/commands/search/commands.py:945: in search
    return self._parse_results(
env/lib/python3.11/site-packages/redis/commands/search/commands.py:72: in _parse_results
    return self._RESP2_MODULE_CALLBACKS[cmd](res, **kwargs)
env/lib/python3.11/site-packages/redis/commands/search/commands.py:79: in _parse_search
    return Result(
env/lib/python3.11/site-packages/redis/commands/search/result.py:29: in __init__
    self.total = res[0]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <redis.asyncio.client.Pipeline(<redis.asyncio.connection.ConnectionPool(<redis.asyncio.connection.Connection(host=0.0.0.0,port=52824,db=0)>)>)>
name = 0

    def __getitem__(self, name: KeyT):
>       raise TypeError("Async Redis client does not support class retrieval")
E       TypeError: Async Redis client does not support class retrieval

env/lib/python3.11/site-packages/redis/commands/core.py:2551: TypeError


for i, query_results in enumerate(results):
_built_query = batch_built_queries[i]
parsed_raw = search._parse_search( # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smart to tap into the search parsing capabilities baked into redis py :)

@abrookins abrookins marked this pull request as ready for review March 29, 2025 00:08
@abrookins abrookins merged commit 161a287 into main Mar 29, 2025
31 checks passed
justin-cechmanek pushed a commit that referenced this pull request Apr 2, 2025
Add `batch_search` and `batch_query` methods to `SearchIndex` and
`AsyncSearchIndex`. These methods run search commands in a pipeline
and return correctly-parsed and post-processed results.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants