-
Notifications
You must be signed in to change notification settings - Fork 55
Initial Hybrid Search (FT.HYBRID) bindings #454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
atakavci
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other than a couple of questions and small details, looks pretty good to me.
| @@ -0,0 +1,22 @@ | |||
| Redis 8.4 is currently in preview and may be subject to change. | |||
|
|
|||
| *Hybrid Search* is a new feature in Redis 8.4 that allows you to search across multiple indexes and data types. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiple indexes
needs small change with wording.
| } | ||
| } | ||
|
|
||
| private sealed class LinearCombiner : Combiner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WINDOW is missing with LinearCombiner.
https://redis.io/docs/latest/commands/ft.hybrid/
|
|
||
| public sealed partial class HybridSearchQuery | ||
| { | ||
| public abstract class Combiner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why YIELD_SCORE_AS is left out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the server doesn't implement this yet
| /// <summary> | ||
| /// Configure the score fusion method (optional). If not provided, Reciprocal Rank Fusion (RRF) is used with server-side default parameters. | ||
| /// </summary> | ||
| internal HybridSearchQuery Combine(Combiner combiner, string scoreAlias) // YIELD_SCORE_AS not yet implemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit confused, the comment on YIELD_SCORE_AS is a left over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the server doesn't implement this yet
| /// <summary> | ||
| /// Add the list of fields to return in the results. Well-known fields are available via <see cref="Fields"/>. | ||
| /// </summary> | ||
| public HybridSearchQuery ReturnFields(string field) // naming for consistency with SearchQuery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this just to avoid array allocation for single item?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, basically
|
|
||
| #if !NET9_0_OR_GREATER | ||
| [AttributeUsage(AttributeTargets.Method, AllowMultiple = false)] | ||
| internal sealed class OverloadResolutionPriorityAttribute(int priority) : Attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no use of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove; there was at some point
| } | ||
| break; | ||
| } | ||
| static int CountReducer(Reducer reducer) => 3 + reducer.ArgCount() + (reducer.Alias is null ? 0 : 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way to make these calculations easier to maintain (or self explaning)?
feels like these expressions could get confusing quickly in between command syntax changes in time.
| /// common queries, by passing the search operands as named parameters. | ||
| /// </summary> | ||
| [Experimental(Experiments.Server_8_4, UrlFormat = Experiments.UrlFormat)] | ||
| public sealed partial class HybridSearchQuery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets put it explicitly that class is not thread-safe and not ready to use by multiple threads with as is condition.
tasks:
primary API is new
HybridSearchResult ISearchCommands.HybridSearch((string indexName, HybridSearchQuery query, IReadOnlyDictionary<string, object>? parameters)API (andAsynctwin).Note that because
FT.HYBRIDsupports parameterization in both the queries and the filters, I have deviated from the pattern forFT.SEARCH, in that the parameters are specified separately to the query. This means that a single query instance can be created and stored, then reused repeatedly with different values. To avoid concurrency concerns, "popsicle immutability" is used in the query object, becoming automatically frozen when issued. For convenience, a secondary type-based API is presented to allow parameters from, for example, anonymous types.HybridSearchQueryacts as a builder, so:However, a full API for the supported server features is available, for example:
To complement this, we introduce some additional additional types; the types specific to hybrid-search are mostly inside
NRedisStack.Search.HybridSearchQuery; the types that are also conceptually wider to all of search are inNRedisStack.Search.NRedisStack.Search.ApplyExpression- immutable readonly struct tuple ofstring expressionandstring? alias-NRedisStack.Search.Scorer- immutable class, private subtypes forTFIDF,BM25STD, etc from https://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/scoring/NRedisStack.Search.VectorData- abstracts over ways of representing vector data; readonly structNRedisStack.Search.HybridSearchQuery- the new builder APINRedisStack.Search.HybridSearchQuery- the new result APINRedisStack.Search.HybridSearchQuery.Combiner- immutable class, private subtypes for RRD and linear combinersNRedisStack.Search.HybridSearchQuery.SearchConfig- immutable readonly struct describing a text search with scorer and aliasNRedisStack.Search.HybridSearchQuery.VectorSearchConfig- immutable readonly struct describing a vector search with method, filter, aliasNRedisStack.Search.VectorSearchMethod- immutable class, private subtypes for "nearest neighbour", "range", etcNRedisStack.Search.Parameters- allows objects to be used as parameter sourcesAt the moment
VectorDataonly supportsROM<float>(float[]), in line with SE.Redis VSIM support, but we can extend this as necessary.To make working with vector data more convenient, a new
VectorDataAPI is added, with use-cases:var vec = VectorData.Raw(blob);- raw handling, owner handles lifetimevar vec = VectorData.Parameter(name);- vector deferred to the parameters collectionusing var vec = VectorData.Lease<Half>(20);- lease a vector backed by raw bytes, interpreted asHalf, with.Spana convenience accessorusing var vec = VectorData.LeaseWithValues(...)- likeLease, but inferring type and dimension from the supplied payload, which becomes the initial payloadThere is no quantization etc; the caller is required to understand their vector type and send the values appropriately, but this is hopefully a little easier than making the consumer responsible for the byte shunt.
Note that in all cases, the vector is passed in
PARAMSto prevent$ambiguity, and to pre-empt future server API changes here.The new APIs are marked
[Experimental], pointing people to (when merged) a portal with the contents from https://github.com/redis/NRedisStack/blob/9548d539ff72ac370c7c0706e8e9c53b8492f1b3/docs/exp/NRS001.mdAPI note: the significance of
With*in theVectorSearchConfigandSearchConfigis that these are "withers", i.e. while it is a fluent API, the underlying type is immutable, so this returns a different instance (in this case, of a value-type). This contrasts with theHybridSearchQuerywhich pairs withSearchQueryin exposing a mutable fluent API, where each method endsreturn this.Note that this also includes a fix for #453