-
Notifications
You must be signed in to change notification settings - Fork 4.4k
.NET: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (#10456) #13179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
.NET: Add ITextSearch<TRecord> with LINQ filtering and deprecate legacy ITextSearch (#10456) #13179
Conversation
…obsolete VectorSearchFilter - Replace obsolete VectorSearchFilter conversion with direct LINQ filtering for simple equality filters - Add ConvertTextSearchFilterToLinq() method to handle TextSearchFilter.Equality() cases - Fall back to legacy approach only for complex filters that cannot be converted - Eliminates technical debt and performance overhead identified in Issue microsoft#10456 - Maintains 100% backward compatibility - all existing tests pass (1,574/1,574) - Reduces object allocations and removes obsolete API warnings for common filtering scenarios Addresses Issue microsoft#10456 - PR 2: VectorStoreTextSearch internal modernization
0e78309 to
3c9fc7b
Compare
…pliance - Replace broad catch-all exception handling with specific exception types - Add comprehensive exception handling for reflection operations in CreateEqualityExpression: * ArgumentNullException for null parameters * ArgumentException for invalid property names or expression parameters * InvalidOperationException for invalid property access or operations * TargetParameterCountException for lambda expression parameter mismatches * MemberAccessException for property access permission issues * NotSupportedException for unsupported operations (e.g., byref-like parameters) - Maintain intentional catch-all Exception handler with #pragma warning disable CA1031 - Preserve backward compatibility by returning null for graceful fallback - Add clear documentation explaining exception handling rationale - Addresses CA1031 code analysis warning while maintaining robust error handling - All tests pass (1,574/1,574) and formatting compliance verified
- Add InvalidPropertyFilterThrowsExpectedExceptionAsync: Validates that new LINQ filtering creates expressions correctly and passes them to vector store connectors - Add ComplexFiltersUseLegacyBehaviorAsync: Tests graceful fallback for complex filter scenarios when LINQ conversion returns null - Add SimpleEqualityFilterUsesModernLinqPathAsync: Confirms end-to-end functionality of the new LINQ filtering optimization for simple equality filters Analysis: - All 15 VectorStoreTextSearch tests pass (3 new + 12 existing) - All 85 TextSearch tests pass, confirming no regressions - Tests prove the new ConvertTextSearchFilterToLinq() and CreateEqualityExpression() methods work correctly - Exception from InMemory connector in invalid property test confirms LINQ path is being used instead of fallback behavior - Improves edge case coverage for the filtering modernization introduced in previous commits
- Add NullFilterReturnsAllResultsAsync test to verify behavior when no filter is applied - Remove unnecessary Microsoft.Extensions.VectorData using statement - Enhance test coverage for VectorStoreTextSearch edge cases
…INQ filtering - Extend ConvertTextSearchFilterToLinq to handle AnyTagEqualToFilterClause - Add CreateAnyTagEqualToExpression for collection.Contains() operations - Add CreateMultipleClauseExpression for AND logic with Expression.AndAlso - Add 4 comprehensive tests for new filtering capabilities - Add RequiresDynamicCode attributes for AOT compatibility - Maintain backward compatibility with graceful fallback Fixes microsoft#10456
Fixes IL3051 compilation errors by adding RequiresDynamicCode attributes to: - SearchAsync(string, TextSearchOptions<TRecord>?, CancellationToken) - GetTextSearchResultsAsync(string, TextSearchOptions<TRecord>?, CancellationToken) - GetSearchResultsAsync(string, TextSearchOptions<TRecord>?, CancellationToken) The generic ITextSearch<TRecord> interface accepts LINQ expressions via TextSearchOptions<TRecord>.Filter, which requires dynamic code generation for expression tree processing. This change ensures interface methods match their implementations' RequiresDynamicCode attributes. Resolves: Issue microsoft#10456 IL3051 interface mismatch errors Cherry-pick-safe: Interface-only change, no implementation logic
- Fix CA1859: Use specific return types BinaryExpression? and MethodCallExpression? instead of generic Expression? for better performance - Improve test model: Use IReadOnlyList<string> instead of string[] for Tags property to follow .NET collection best practices These changes address code analyzer warnings and apply reviewer applicable feedback from other PRs in the Issue microsoft#10456 modernization series.
- Remove LINQ dependency from non-generic ITextSearch interface - Revert non-generic methods to direct VectorSearchFilter usage - Eliminates IL3051 warnings by avoiding RequiresDynamicCode on non-generic interface - Preserves backward compatibility with legacy TextSearchFilter path - Maintains modern LINQ expressions for generic ITextSearch<TRecord> interface Architectural separation: - Non-generic: TextSearchOptions → VectorSearchFilter (legacy path) - Generic: TextSearchOptions<TRecord> → Expression<Func<TRecord, bool>> (LINQ path) Resolves remaining IL3051 compilation errors while maintaining Issue microsoft#10456 objectives.
ADR1 (PR2)status: superseded-by-option-3 (ADR2)
|
ADR2 (PR2)status: "accepted", decision makers: @markwallace-microsoft , @westey-m, date: 2025-10-23
|
…arning suppressions around reflection operations in LINQ expression building - IL2075 (GetMethod) and IL2060 (MakeGenericMethod) warnings are acceptable for dynamic property access - Fixes GitHub Actions CI/CD pipeline failures where --warnaserror treats warnings as errors - Targeted suppressions with explanatory comments for maintainability
- Add UnconditionalSuppressMessage attributes for IL2075/IL2060 warnings during AOT analysis - Expand DynamicallyAccessedMembers to include PublicMethods for GetMethod reflection calls - Maintains RequiresDynamicCode attribute to properly indicate AOT incompatibility - Addresses AOT test failures where --warnaserror treats warnings as compilation errors The reflection-based LINQ expression building is inherently incompatible with AOT compilation, but these attributes allow the build system to handle this known limitation gracefully instead of failing with cryptic errors. Regular and AOT compilation phases require different suppression mechanisms - pragma directives for regular builds, UnconditionalSuppressMessage for AOT analysis.
dotnet/src/SemanticKernel.Core/Data/TextSearch/VectorStoreTextSearch.cs
Outdated
Show resolved
Hide resolved
837cf0e to
a802eaa
Compare
|
@roji @markwallace-microsoft @westey-m while waiting for the architecture decision input from you, I am entertaining and validating option 3. |
Add ITextSearch<TRecord> generic interface with LINQ filtering while maintaining existing ITextSearch non-generic interface for backward compatibility. ## Background: Architectural Decision (Issue microsoft#10456) Three options considered: Option 1 (Native LINQ): Replace TextSearchFilter with Expression<Func<T, bool>> - Breaking change: requires user migration - Best long-term architecture Option 2 (Translation Layer): Convert TextSearchFilter to LINQ internally - Breaking change: RequiresDynamicCode propagates through API surface - Reflection overhead, AOT incompatible Option 3 (Dual Interface): Add ITextSearch<TRecord> alongside ITextSearch - No breaking changes - Maintains AOT compatibility - Uses obsolete VectorSearchFilter in legacy path (temporary during transition) ## Implementation ### Generic Interface - ITextSearch<TRecord> with 3 methods accepting TextSearchOptions<TRecord> - TextSearchOptions<TRecord> with Expression<Func<TRecord, bool>>? Filter - Explicit interface implementation in VectorStoreTextSearch<TRecord> ### Dual-Path Architecture Two independent code paths, no translation layer: Legacy path (non-generic): - ITextSearch with TextSearchOptions and TextSearchFilter (clause-based) - Uses VectorSearchOptions.OldFilter (obsolete) with pragma warning suppression - No dynamic code, AOT compatible - 10 existing tests unchanged Modern path (generic): - ITextSearch<TRecord> with TextSearchOptions<TRecord> and Expression filter - Uses VectorSearchOptions.Filter (LINQ native, not obsolete) - No dynamic code, AOT compatible - 7 new tests ## Changes - Added ITextSearch<TRecord> interface and TextSearchOptions<TRecord> class - Implemented dual interface in VectorStoreTextSearch<TRecord> - Deleted ~150 lines of Option 2 translation layer code - Removed all RequiresDynamicCode attributes - Removed DynamicallyAccessedMemberTypes.PublicMethods from 5 locations: - VectorStoreTextSearch.cs - TextSearchServiceCollectionExtensions.cs (3 methods) - TextSearchKernelBuilderExtensions.cs (1 method) - Deleted 7 Option 2 translation tests - Added 7 LINQ filtering tests - Added DataModelWithTags to test base - Reverted Program.cs to original state ## Files Changed 8 files, +144 insertions, -395 deletions - ITextSearch.cs - TextSearchOptions.cs (added generic class) - VectorStoreTextSearch.cs (removed translation layer, added dual interface) - TextSearchServiceCollectionExtensions.cs (removed PublicMethods annotation) - TextSearchKernelBuilderExtensions.cs (removed PublicMethods annotation) - VectorStoreTextSearchTestBase.cs (added DataModelWithTags) - VectorStoreTextSearchTests.cs (removed 7 tests, added 7 tests) - Program.cs in AotTests (removed suppression, restored tests)
We typically follow the pattern of obsoleting the old API when we introduce the new pattern. This avoids breaking changes which are very disruptive for projects that have a transient dependency. |
|
@alzarei, I prefer a clean separation between the old and new abstractions. Being able to obsolete the old ones and point users at the new ones is definitely valuable. Then we can remove the obsoleted code after a period of time. |
- Add [Obsolete] attribute to ITextSearch interface - Add pragma suppressions in production classes for backward compatibility - Add pragma suppressions in test/sample files - Follows Microsoft pattern: introduce new API, deprecate old - ITextSearch<TRecord> is the replacement with LINQ filtering Build: 0 errors, 0 warnings Tests: 1,581 passed
88244e4 to
e333850
Compare
|
@roji @westey-m @markwallace-microsoft Thank you all for the excellent feedback! After your review and architectural guidance, we pivoted from Option 2 (translation layer, commits c2c783b - a802eaa) to Option 3 (dual interface pattern) in commit d1f2733. This removed the ~150 lines of translation layer code that had the issues you identified. Current implementation: Legacy path: Direct OldFilter usage with pragma (lines 284-289) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alzarei can you please push your changes into a single PR? I'm getting a bit confused as to which branch corresponds to which PRs (there are quite a few), and this PR represents incremental work on top of something else. This is all making reviewing quite complex.
Just to raise the obvious: this PR adds the new LINQ-based filter support to VectorStoreTextSearch, but all the other implementations are left implementing the now-obsolete non-generic ITextSearch; it's problematic for these implementations to not have a non-obsolete way of being used. I can see that ultimately a feature-text-search-linq is being targeted - it's indeed a good idea to do this work in a feature branch so that everything can be done across all providers and samples before releasing anything.
dotnet/src/SemanticKernel.Abstractions/Data/TextSearch/ITextSearch.cs
Outdated
Show resolved
Hide resolved
dotnet/src/SemanticKernel.Abstractions/Data/TextSearch/ITextSearch.cs
Outdated
Show resolved
Hide resolved
|
@roji @westey-m @markwallace-microsoft All review feedback has been addressed. PR is ready for final review. Validation ResultsAll validation steps completed successfully: Format Check - PASSED
Build with Warnings as Errors - PASSED
Core Unit Tests - PASSED
AOT Compatibility - PASSED
|
- Add using System.Threading and System.Threading.Tasks directives - Replace fully-qualified type names with short names (Task, CancellationToken) - Remove repetitive documentation line about LINQ filtering Addresses inline code feedback from @roji in PR microsoft#13179
@roji Thanks for the feedback! Let me clarify the multi-PR strategy: This is Part 2 of 6 in Issue #10456PR Chain (all target
Your Concern About Obsolete Implementations
You're absolutely right to raise this! PR3-PR6 address exactly this issue. After all 6 PRs merge, every implementation will have both interfaces:
Pattern (Option 3 - applied consistently in all PRs): #pragma warning disable CS0618 // ITextSearch is obsolete - backward compatibility
public sealed class BingTextSearch : ITextSearch, ITextSearch<BingWebPage>
#pragma warning restore CS0618
{
// Modern LINQ path
Task<KernelSearchResults<string>> ITextSearch<BingWebPage>.SearchAsync(...) { }
// Legacy path (backward compatible)
Task<KernelSearchResults<string>> ITextSearch.SearchAsync(...) { }
}Why Separate PRs?
Review ApproachSuggested: Review PR2 for the deprecation pattern architecture, then PR3-6 will apply it consistently across connectors. Alternative: I can consolidate if you prefer to see everything in one PR. Does this clarify the strategy? Happy to adjust if you have an approach in mind that makes it easier to reviewe! Thanks! |
|
@westey-m @roji @markwallace-microsoft The integration deployment went through successfully 🎉 and all feedback has been addressed. Feel free to take a look, and if everything looks good, let's move forward. I'm wrapping up the next PRs in this batch to stay aligned with the ADRs we made here. Thank you! |
ebd579d
into
microsoft:feature-text-search-linq
…arch.GetSearchResultsAsync (#13318) This PR enhances the type safety of the `ITextSearch<TRecord>` interface by changing the `GetSearchResultsAsync` method to return `KernelSearchResults<TRecord>` instead of `KernelSearchResults<object>`. This improvement eliminates the need for manual casting and provides better IntelliSense support for consumers. ## Motivation and Context The current implementation of `ITextSearch<TRecord>.GetSearchResultsAsync` returns `KernelSearchResults<object>`, which requires consumers to manually cast results to the expected type. This reduces type safety and degrades the developer experience by losing compile-time type checking and IntelliSense support. This change aligns the return type with the generic type parameter `TRecord`, providing the expected strongly-typed results that users of a generic interface would anticipate. ## Changes Made ### Interface (ITextSearch.cs) - Changed `ITextSearch<TRecord>.GetSearchResultsAsync` return type from `KernelSearchResults<object>` to `KernelSearchResults<TRecord>` - Updated XML documentation to reflect strongly-typed return value - Legacy `ITextSearch` interface (non-generic) remains unchanged, continuing to return `KernelSearchResults<object>` for backward compatibility ### Implementation (VectorStoreTextSearch.cs) - Added new `GetResultsAsTRecordAsync` helper method returning `IAsyncEnumerable<TRecord>` - Updated generic interface implementation to use the new strongly-typed helper - Retained `GetResultsAsRecordAsync` method for the legacy non-generic interface ### Tests (VectorStoreTextSearchTests.cs) - Updated 3 unit tests to use strongly-typed `DataModel` or `DataModelWithRawEmbedding` instead of `object` - Improved test assertions to leverage direct property access without casting - All 19 tests pass successfully ## Breaking Changes **Interface Change (Experimental API):** - `ITextSearch<TRecord>.GetSearchResultsAsync` now returns `KernelSearchResults<TRecord>` instead of `KernelSearchResults<object>` - This interface is marked with `[Experimental("SKEXP0001")]`, indicating that breaking changes are expected during the preview period - Legacy `ITextSearch` interface (non-generic) is unaffected and maintains full backward compatibility ## Benefits - **Improved Type Safety**: Eliminates runtime casting errors by providing compile-time type checking - **Enhanced Developer Experience**: Full IntelliSense support for TRecord properties and methods - **Cleaner Code**: Consumers no longer need to cast results from object to the expected type - **Consistent API Design**: Generic interface now behaves as expected, returning strongly-typed results - **Zero Impact on Legacy Code**: Non-generic ITextSearch interface remains unchanged ## Testing - All 19 existing unit tests pass - Updated tests demonstrate improved type safety with direct property access - Verified both generic and legacy interfaces work correctly - Confirmed zero breaking changes to non-generic ITextSearch consumers ## Related Work This PR is part of the Issue #10456 multi-PR chain for modernizing ITextSearch with LINQ-based filtering: - PR #13175: Foundation (ITextSearch<TRecord> interface) - Merged - PR #13179: VectorStoreTextSearch + deprecation pattern - In Review - **This PR (2.1)**: API refinement for improved type safety - PR #13188-13191: Connector migrations (Bing, Google, Tavily, Brave) - Pending - PR #13194: Samples and documentation - Pending All PRs target the `feature-text-search-linq` branch for coordinated release. ## Migration Guide for Consumers ### Before (Previous API) ```csharp ITextSearch<DataModel> search = ...; KernelSearchResults<object> results = await search.GetSearchResultsAsync("query", options); foreach (var obj in results.Results) { var record = (DataModel)obj; // Manual cast required Console.WriteLine(record.Name); } ``` ### After (Improved API) ```csharp ITextSearch<DataModel> search = ...; KernelSearchResults<DataModel> results = await search.GetSearchResultsAsync("query", options); foreach (var record in results.Results) // Strongly typed! { Console.WriteLine(record.Name); // Direct property access with IntelliSense } ``` ## Checklist - [x] Changes build successfully - [x] All unit tests pass (19/19) - [x] XML documentation updated - [x] Breaking change documented (experimental API only) - [x] Legacy interface backward compatibility maintained - [x] Code follows project coding standards Co-authored-by: Alexander Zarei <alzarei@users.noreply.github.com>
.NET: Add LINQ-based ITextSearch interface and deprecate legacy ITextSearch (#10456)
Summary
This PR implements Option 3 from the architectural decision process for Issue #10456: introduces a new generic
ITextSearch<TRecord>interface with type-safe LINQ filtering while maintaining the legacyITextSearchinterface marked as[Obsolete]for backward compatibility.Zero breaking changes - existing code continues working unchanged.
What Changed
New Generic Interface (Recommended Path)
Benefits:
Legacy Interface (Deprecated)
Migration Message: Users see deprecation warning directing them to modern
ITextSearch<TRecord>with LINQ filtering.Implementation Details
Dual-Path Architecture
VectorStoreTextSearch<TRecord>implements both interfaces with independent code paths:Legacy Path (Non-Generic):
Modern Path (Generic):
Key Characteristics:
VectorSearchFilterwith pragma suppressions (temporary during transition)Files Changed
Interfaces & Options
ITextSearch.cs: AddedITextSearch<TRecord>interface, marked legacyITextSearchas[Obsolete]TextSearchOptions.cs: Added genericTextSearchOptions<TRecord>classImplementation
VectorStoreTextSearch.cs: Implemented dual interface pattern (~30 lines for both paths)Backward Compatibility (Pragma Suppressions)
Added
#pragma warning disable CS0618to 27 files that use the obsolete interface:Production (11 files):
Tests/Samples (16 files):
Tests
DataModelWithTagsto test base for collection filteringValidation Results
--warnaserrorBreaking Changes
None. This is a non-breaking addition:
ITextSearchinterface continues working (marked[Obsolete])ITextSearch<TRecord>is opt-in via deprecation warningMulti-PR Context
This is PR 2 of 6 in the structured implementation for Issue #10456:
Architectural Decision
Option 3 Approved by Mark Wallace and Westey-m:
Options Considered:
TextSearchFilterentirely (breaking change)TextSearchFilterto LINQ internally (RequiresDynamicCode cascade, AOT issues)ITextSearch<TRecord>+ deprecate legacy (no breaking changes, clean separation)See ADR comments in conversation for detailed architectural analysis.
Migration Guide
Before (Legacy - Now Obsolete):
After (Modern - Recommended):
Next Steps
PR3-PR6 will migrate connector implementations (Bing, Google, Brave, Azure AI Search) to use
ITextSearch<TRecord>with LINQ filtering, demonstrating the modern pattern while maintaining backward compatibility.