Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink Query hierarchy #142

Closed
valencik opened this issue May 22, 2024 · 0 comments · Fixed by #173
Closed

Rethink Query hierarchy #142

valencik opened this issue May 22, 2024 · 0 comments · Fixed by #173
Assignees
Milestone

Comments

@valencik
Copy link
Collaborator

It is currently possible to construct queries using the case classes that don't make sense.

I see three current problems

Problem 1

Boolean queries with a single branch:

Query.And(NonEmptyList.of(Query.Term("a")))

Similarly with Query.Or

I think we should change the Query.And and Query.Or constructors to ensure that there are always two Nodes.
Perhaps something like this:

  final case class Or private[lucille] (qs: NonEmptyList[Query]) extends Query {
    def mapLastTerm(f: Query.Term => Query): Or =
      Or(rewriteLastTerm(qs, f))
  }
  object Or {
    def apply(left: Query, right: Query, tail: Query*): Or =
      Or(NonEmptyList(left, right :: tail.toList))
  }

I am actually not sure yet if we need private[lucille] or not. It depends on whatever is convenient from the QueryParser.
We may also want another constructor that takes in a Query and a NonEmptyList[Query].

Problem 2

Group queries without a clear boolean operator:

Query.Group(NonEmptyList.of(Query.Term("a"), Query.Term("b")))

This ambiguity requires protosearch to have a default combine method.

This is actually possible in the parser too.
(term1 term2 term3) will happily parse to a Group query without any And or Or node.
I think this is also a problem.
The parser should use the default configured boolean and put the terms into either an And or Or node based on that default.
(painful note: currently that default is not configurable and defaults to Or. #124)

Query.Group should change to take in only a single Query and is there simply to deal with the order of operations via adding parenthesis.
Basically it's motivation is to allow someone to construct a query that when printed looks like (term1 OR term2) AND term3.
Or similarly one that looks like term1 OR (term2 AND term3)

What if you wanted a query to print (term1 term2) AND term3?
Well, you would only reparse that printed output into the same query if you knew the parsers configured default boolean.
So opting into that kind of printing should be an explicit setting, if we support it.

Problem 3

Similarly to Query.Group it's ambiguous how to interpret a Query.MultiQuery, it has no explicit boolean operation.
I don't think we should have a MultiQuery.

We use it in the parser like this:

  def parseQ(s: String): Either[cats.parse.Parser.Error, MultiQuery] =
    fullQuery.parseAll(s).map(MultiQuery.apply)

Instead of sticking the list of queries into a MultiQuery, use the configured default boolean and then stick them into an And or Or node.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant