Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apply_filter_policy and FilterPolicy.MERGE don't work properly for the new filter syntax #7995

Closed
vblagoje opened this issue Jul 8, 2024 · 3 comments · Fixed by #8042
Closed
Assignees
Labels
P1 High priority, add to the next sprint
Milestone

Comments

@vblagoje
Copy link
Member

vblagoje commented Jul 8, 2024

Is your feature request related to a problem? Please describe.
The current apply_filter_policy function in our retrieval system only supports legacy dictionary-based filters. This limitation becomes problematic for new filters syntax. For example, we can't merge filters like:
{"field": "meta.name", "operator": "==", "value": "John"} and
{"field": "meta.year", "operator": "==", "value": "2022"} using the current implementation.

Describe the solution you'd like
Update the apply_filter_policy function to support both the legacy filters and a new filter format merge functionality. The function should:

Accept both old and new filters as input for init_filters and runtime_filters.
Properly merge (or replace) filters based on the FilterPolicy for both formats.

Describe alternatives you've considered
Not considered

Additional context
This change is crucial for improving handling both legacy and new filters and applying FilterPolicy correctly.

@vblagoje vblagoje changed the title apply_filter_policy doesn't work properly for new filter syntax apply_filter_policy and FilterPolicy.MERGE don't work properly for the new filter syntax Jul 8, 2024
@vblagoje vblagoje added this to the 2.3.0 milestone Jul 8, 2024
@vblagoje
Copy link
Member Author

vblagoje commented Jul 8, 2024

@julian-risch @mrm1001 this one is needed for the final 2.3.0 release cc @sjrl

@vblagoje
Copy link
Member Author

vblagoje commented Jul 8, 2024

I have the most context to fix this one and @sjrl will review

@sjrl
Copy link
Contributor

sjrl commented Jul 9, 2024

So I think there are three basic scenarios that we might like to cover:

  1. Combining two comparison filters together where a logical operator needs to be provided by the user. For example,
comparison_filter_1 = {"field": "meta.name", "operator": "==", "value": "John"}
comparison_filter_2 = {"field": "meta.year", "operator": "==", "value": "2022"}
user_provided_logical_filter = "AND"
combined_filter = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.name", "operator": "==", "value": "John"},
      {"field": "meta.year", "operator": "==", "value": "2022"}
    ]
}
  • An edge case we should consider for this scenario is what we do if both filters have the same field. For example, if the fields are both meta.name which one should be used over the other.
  1. Combining a comparison filter and a logical filter
  • I think in this scenario a user provided logical filter should still be provided. Perhaps if it matches the existing logical operator we should merge them, but I'm not entirely sure how we should handle when they differ.
comparison_filter = {"field": "meta.type", "operator": "==", "value": "pdf"}
logical_filter = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.name", "operator": "==", "value": "John"},
      {"field": "meta.year", "operator": "==", "value": "2022"}
    ]
}
user_provided_logical_filter = "AND"
combined_filter = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.name", "operator": "==", "value": "John"},
      {"field": "meta.year", "operator": "==", "value": "2022"},
      {"field": "meta.type", "operator": "==", "value": "pdf"}
    ]
}
  1. Combining two logical filters
  • I'm less certain how to fully handle all situations in this case. I believe if both logical filters have the "AND" operator it should be fine to merge both of them. However, if one is a "OR" and the other and "AND" I'm not certain how they should be combined, some advice here would be greatly appreciated.
logical_filter_1 = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.type", "operator": "==", "value": "pdf"},
      {"field": "meta.month", "operator": "==", "value": "April"}
    ]
}
logical_filter_2 = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.name", "operator": "==", "value": "John"},
      {"field": "meta.year", "operator": "==", "value": "2022"}
    ]
}
combined_filter = {
    "operator": "AND",
    "conditions": [
      {"field": "meta.name", "operator": "==", "value": "John"},
      {"field": "meta.year", "operator": "==", "value": "2022"},
      {"field": "meta.type", "operator": "==", "value": "pdf"},
      {"field": "meta.month", "operator": "==", "value": "April"}
    ]
}

@vblagoje vblagoje modified the milestones: 2.3.0, 2.3.1 Jul 9, 2024
@mrm1001 mrm1001 added the P1 High priority, add to the next sprint label Jul 10, 2024
@shadeMe shadeMe modified the milestones: 2.3.2, 2.4.0 Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 High priority, add to the next sprint
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants