enhancement: Keyword Query Language (KQL) search syntax support #7043

Merged: 2 commits, Aug 28, 2023

fschade (Contributor) commented Aug 15, 2023

Description

Introduce support for KQL syntax for search.

The functionality consists of a KQL lexer and a Bleve query compiler.

Supported field queries:

  • Tag search: tag:golden, tag:"silver"
  • Filename search: name:file.txt, name:"file.docx"
  • Content search: content:ahab, content:"captain aha*"
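
As a rough illustration of the field queries listed above, a single term can be split into a key and a value at the first colon. This is only a minimal sketch: `FieldQuery` and `parseFieldQuery` are hypothetical names chosen here, and the code does not reproduce the lexer added in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldQuery is a hypothetical key/value pair such as tag:golden.
type FieldQuery struct {
	Key   string
	Value string
}

// parseFieldQuery splits one term at the first colon and strips
// surrounding double quotes from the value. A term without a colon is
// treated as a free-text keyword with an empty key (an assumption of
// this sketch).
func parseFieldQuery(term string) FieldQuery {
	key, value, found := strings.Cut(term, ":")
	if !found {
		key, value = "", term
	}
	value = strings.TrimPrefix(value, `"`)
	value = strings.TrimSuffix(value, `"`)
	return FieldQuery{Key: key, Value: value}
}

func main() {
	terms := []string{`tag:golden`, `name:"file.docx"`, `content:"captain aha*"`, `ahab`}
	for _, term := range terms {
		fq := parseFieldQuery(term)
		fmt.Printf("key=%q value=%q\n", fq.Key, fq.Value)
	}
}
```

The sketch handles one whitespace-free or fully quoted term at a time; splitting a whole query into terms (and handling operators or parentheses) is left out on purpose.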

Supported conjunctive normal form queries:

  • Boolean operators: AND, OR, NOT
  • Nesting: ( ...SUB_QUERY... )

For example:

Query:

(name:"moby di*" OR tag:bestseller) AND tag:book NOT tag:read

Result:

  • Resources whose name matches "moby di*" OR that are tagged bestseller,
  • AND that are tagged book,
  • but NOT tagged read.
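
Read as a predicate over a single resource, the example result can be sketched as follows. `Resource`, `hasTag` and `matches` are hypothetical names for illustration; treating the trailing `*` as a case-insensitive prefix match and reading `NOT` as "AND NOT" are assumptions of this sketch, not behaviour confirmed by the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Resource is a hypothetical searchable item with a name and tags.
type Resource struct {
	Name string
	Tags []string
}

// hasTag reports whether the resource carries the given tag.
func hasTag(r Resource, tag string) bool {
	for _, t := range r.Tags {
		if t == tag {
			return true
		}
	}
	return false
}

// matches hard-codes the example query:
// (name:"moby di*" OR tag:bestseller) AND tag:book NOT tag:read
func matches(r Resource) bool {
	nameOrBestseller := strings.HasPrefix(strings.ToLower(r.Name), "moby di") ||
		hasTag(r, "bestseller")
	return nameOrBestseller && hasTag(r, "book") && !hasTag(r, "read")
}

func main() {
	resources := []Resource{
		{Name: "Moby Dick.epub", Tags: []string{"book"}},                    // name prefix, book, unread
		{Name: "Dracula.epub", Tags: []string{"bestseller", "book", "read"}}, // excluded by NOT tag:read
		{Name: "Moby Dick notes.txt"},                                       // missing tag:book
	}
	for _, r := range resources {
		fmt.Printf("%-22s %v\n", r.Name, matches(r))
	}
}
```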

AST:

&ast.Ast{
  Nodes: []ast.Node{
    &ast.GroupNode{
      Base: &ast.Base{Loc: &ast.Location{...}},
      Nodes: []ast.Node{
        &ast.StringNode{...},
        &ast.OperatorNode{...},
        &ast.StringNode{...},
      },
    },
    &ast.OperatorNode{Base: &ast.Base{Loc: &ast.Location{...}}, Value: "AND"},
    &ast.StringNode{Base: &ast.Base{Loc: &ast.Location{...}}, Key: "tag", Value: "book"},
    &ast.OperatorNode{Base: &ast.Base{Loc: &ast.Location{...}}, Value: "NOT"},
    &ast.StringNode{Base: &ast.Base{Loc: &ast.Location{...}}, Key: "tag", Value: "read"},
  },
}
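
To show how such a tree can be walked, here is a minimal sketch that renders an AST of this shape back into a query string. The node types below are simplified, hypothetical stand-ins for the `ast` package (no `Base` or `Location`), the key/value pairs inside the group are inferred from the example query, and the output format is a plain rendering chosen for illustration. It is not claimed to match what the PR's Bleve compiler emits.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified, hypothetical stand-ins for the ast node types.
type Node interface{}

type StringNode struct{ Key, Value string }
type OperatorNode struct{ Value string }
type GroupNode struct{ Nodes []Node }

// render walks the nodes depth-first and joins them with spaces.
// Values containing a space are wrapped in double quotes.
func render(nodes []Node) string {
	parts := make([]string, 0, len(nodes))
	for _, n := range nodes {
		switch v := n.(type) {
		case *StringNode:
			value := v.Value
			if strings.Contains(value, " ") {
				value = `"` + value + `"`
			}
			if v.Key != "" {
				value = v.Key + ":" + value
			}
			parts = append(parts, value)
		case *OperatorNode:
			parts = append(parts, v.Value)
		case *GroupNode:
			parts = append(parts, "("+render(v.Nodes)+")")
		}
	}
	return strings.Join(parts, " ")
}

// exampleAST builds the tree for the example query from the description.
func exampleAST() []Node {
	return []Node{
		&GroupNode{Nodes: []Node{
			&StringNode{Key: "name", Value: "moby di*"},
			&OperatorNode{Value: "OR"},
			&StringNode{Key: "tag", Value: "bestseller"},
		}},
		&OperatorNode{Value: "AND"},
		&StringNode{Key: "tag", Value: "book"},
		&OperatorNode{Value: "NOT"},
		&StringNode{Key: "tag", Value: "read"},
	}
}

func main() {
	fmt.Println(render(exampleAST()))
}
```

A real compiler would map keys to index field names and emit backend-specific syntax in each `case` branch; the recursion over `GroupNode` is the part this sketch is meant to show.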

Todos:

  • KQL-STRING-QUERY to AST kql syntax subset [KQL-TO-AST-LEXER]
    • name (ast.StringNode)
    • tag (ast.StringNode)
    • content (ast.StringNode)
    • free text-keywords (ast.StringNode)
    • conjunctive normal form (ast.GroupNode)
    • operators AND, OR, NOT (ast.OperatorNode)
  • AST to BLEVE-STRING-QUERY compiler [AST-TO-BLEVE-COMPILER]
    • name (ast.StringNode)
    • tag (ast.StringNode)
    • content (ast.StringNode)
    • free text-keywords (ast.StringNode)
    • conjunctive normal form (ast.GroupNode)
    • operators AND, OR, NOT (ast.OperatorNode)

Related Issue

How Has This Been Tested?

  • unit tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation ticket raised:

2403905 (Contributor) commented Aug 25, 2023

We should discuss how to handle implicit boolean operators in combination with NOT. This is not explicitly described in the documentation. The rules differ for multiple free-text expressions, multiple instances of the same property restriction, multiple instances of different property restrictions, and grouped property restrictions.

// https://learn.microsoft.com/en-us/sharepoint/dev/general-development/keyword-query-language-kql-syntax-reference#constructing-free-text-queries-using-kql
// If there are multiple free-text expressions without any operators in between them, the query behavior is the same as using the AND operator.
// "John Smith" "Jane Smith"
// This is functionally the same as using the AND Boolean operator, as follows:
// "John Smith" AND "Jane Smith"
//
// https://learn.microsoft.com/en-us/sharepoint/dev/general-development/keyword-query-language-kql-syntax-reference#using-multiple-property-restrictions-within-a-kql-query
// When you use multiple instances of the same property restriction, matches are based on the union of the property restrictions in the KQL query.
// author:"John Smith" author:"Jane Smith"
// This functionally is the same as using the OR Boolean operator, as follows:
// author:"John Smith" OR author:"Jane Smith"
//
// When you use different property restrictions, matches are based on an intersection of the property restrictions in the KQL query, as follows:
// author:"John Smith" filetype:docx
// This is the same as using the AND Boolean operator, as follows:
// author:"John Smith" AND filetype:docx
//
// https://learn.microsoft.com/en-us/sharepoint/dev/general-development/keyword-query-language-kql-syntax-reference#grouping-property-restrictions-within-a-kql-query
// author:("John Smith" "Jane Smith")
// This is the same as using the AND Boolean operator, as follows:
// author:"John Smith" AND author:"Jane Smith"

Should we support these cases, and how do we represent them?
For example:

author:"John Smith" author:"Jane Smith"
This is functionally the same as using the OR Boolean operator, as follows:
author:"John Smith" OR author:"Jane Smith"

Does NOT author:"John Smith" NOT author:"Jane Smith" become NOT author:"John Smith" OR NOT author:"Jane Smith"?

author:"John Smith" filetype:docx
This is the same as using the AND Boolean operator, as follows:
author:"John Smith" AND filetype:docx

Does NOT author:"John Smith" NOT filetype:docx become NOT author:"John Smith" AND NOT filetype:docx?

With the increasing complexity of how we organize our resources, the search must also be able to find them using entity properties.

The query package provides the necessary functionality to do this.

This makes it possible to search for resources via KQL. The Microsoft spec is largely covered and can be used for this.

In the current state, the legacy query language is still used. In a future update it will be deprecated and KQL will become the standard.

fschade (Contributor, Author) commented Aug 28, 2023

author

The MS spec defines when to use an implicit AND and when to use an implicit OR, but negations aren't mentioned.

My 5 cents: since no description is available for those cases, I would say that as soon as someone uses an operator as an edge between two nodes:

┌─────────────┐                            ┌───────────────┐
│             │        ┌─────────┐         │               │
│QUERY-NODE-1 ├────────┤ OP-EDGE ├─────────┤ QUERY-NODE-2  │
│             │        └─────────┘         │               │
└─────────────┘                            └───────────────┘

we do not add any other implicit operator there. cc: @tbsbdr
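
One way to read this proposal together with the implicit-operator rules quoted from the Microsoft spec is sketched below: an implicit operator is only inserted between two adjacent operands, and never next to an explicit operator such as NOT. The types and the functions `normalize`, `implicitOperator` and `format` are hypothetical and simplified (no groups, no locations); this is not the code in `normalize.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified, hypothetical stand-ins for the ast node types.
type Node interface{}

type StringNode struct{ Key, Value string }
type OperatorNode struct{ Value string }

func isOperator(n Node) bool {
	_, ok := n.(*OperatorNode)
	return ok
}

// implicitOperator follows the quoted spec rules: the same property
// twice means OR; different properties or free text mean AND.
func implicitOperator(left, right Node) string {
	l, lok := left.(*StringNode)
	r, rok := right.(*StringNode)
	if lok && rok && l.Key != "" && l.Key == r.Key {
		return "OR"
	}
	return "AND"
}

// normalize inserts an implicit operator only where two operands are
// directly adjacent. If an explicit operator already forms the edge
// between two nodes, nothing is added.
func normalize(nodes []Node) []Node {
	out := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if len(out) > 0 && !isOperator(out[len(out)-1]) && !isOperator(n) {
			out = append(out, &OperatorNode{Value: implicitOperator(out[len(out)-1], n)})
		}
		out = append(out, n)
	}
	return out
}

// format renders the nodes for inspection, always quoting values.
func format(nodes []Node) string {
	parts := make([]string, 0, len(nodes))
	for _, n := range nodes {
		switch v := n.(type) {
		case *StringNode:
			if v.Key != "" {
				parts = append(parts, v.Key+`:"`+v.Value+`"`)
			} else {
				parts = append(parts, `"`+v.Value+`"`)
			}
		case *OperatorNode:
			parts = append(parts, v.Value)
		}
	}
	return strings.Join(parts, " ")
}

func main() {
	same := []Node{&StringNode{Key: "author", Value: "John Smith"}, &StringNode{Key: "author", Value: "Jane Smith"}}
	explicit := []Node{&StringNode{Key: "tag", Value: "book"}, &OperatorNode{Value: "NOT"}, &StringNode{Key: "tag", Value: "read"}}
	fmt.Println(format(normalize(same)))
	fmt.Println(format(normalize(explicit)))
}
```

Under this reading, `tag:book NOT tag:read` stays untouched, which leaves the question of how a leading or repeated NOT should combine with its neighbours open, as in the discussion above.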

kobergj (Collaborator) left a comment

Looks fluffy 👍 Just one question and one typo.

I assume documentation is done as a follow up?

services/search/Makefile (resolved)
services/search/pkg/query/kql/normalize.go (outdated, resolved)

fschade (Contributor, Author) commented Aug 28, 2023

> I assume documentation is done as a follow up?

🤣 SURE. To be more exact, KQL is not enabled at the moment, but in the follow-up we will enable it and provide a readme and, with the help of @mmattel, an even more fluffy docs page 💋

fschade requested a review from kobergj on August 28, 2023, 13:49

sonarcloud bot commented Aug 28, 2023

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 9 (rating A)
  • Coverage: 68.4%
  • Duplication: 0.0%

fschade merged commit ed0dbce into owncloud:master on Aug 28, 2023
3 checks passed
ownclouders pushed a commit that referenced this pull request Aug 28, 2023
* feat(search): introduce search query package

Successfully merging this pull request may close these issues.

[ocis] Use KQL for existing search syntax
3 participants