Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query builder #842

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

wickedOne
Copy link
Collaborator

helper class to help writing and maintaining (filter) queries;
replaces #733

i had a look at the SearchApiSolrBackend::createFilterQuery method but concluded that the main difference between that class and this helper class is that the createFilterQuery method does a lot of "filling in the blanks" based on field types which is information not available in this class.

i did however add the Comparison::EMPTY constant and method to cover the edge case mentioned there twice for someone willing to query empty values. to keep things clean and simple, i've added a new comparison method rather than trying to incorporate it in others based on values passed.

also the addition two simple integrations in the query class Query::setQueryFromQueryBuilder and Query::addFilterQueriesFromQueryBuilder.

@codecov-commenter
Copy link

codecov-commenter commented Jul 14, 2020

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.79%. Comparing base (636bfdb) to head (e6bd9b7).
Report is 173 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #842      +/-   ##
==========================================
+ Coverage   92.64%   92.79%   +0.15%     
==========================================
  Files         335      342       +7     
  Lines        8263     8435     +172     
==========================================
+ Hits         7655     7827     +172     
  Misses        608      608              
Flag Coverage Δ
unittests 92.79% <100.00%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

expression builder to ease the use of solr's query language
- added query builder implementation in query trait
- added query builder implementation for adding filter queries
- added some documentation for the query builder
Complex filter queries
----------------------
While the ``addFilterQueriesFromQueryBuilder`` method only provides in setting the facet query key and actual query, the ``QueryBuilder`` can be used in the construction of more complex facet queries.
If one, for example, need to add a tag to the filter query the following method could be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If one, for example, need to add a tag to the filter query the following method could be used.
If one, for example, needs to add a tag to the filter query the following method could be used.

*
* @return string
*/
private function typedValueToString($value, string $quote = '', $escape = true): string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could theoretically wonder if $escape is nullable without a typehint.

Suggested change
private function typedValueToString($value, string $quote = '', $escape = true): string
private function typedValueToString($value, string $quote = '', bool $escape = true): string

Comment on lines +141 to +143
return implode(' AND ', $comparisons);
case CompositeComparison::TYPE_OR:
return implode(' OR ', $comparisons);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use the constants \Solarium\QueryType\Select\Query\Query::QUERY_OPERATOR_AND and QUERY_OPERATOR_OR instead? It seems a bit overkillish and I don't think it has any benefits. Just wondering if you had considered it and what made you decide against it.

}

return sprintf('%s:%s', $field, $this->valueToString($value, ',', '', false));
case Comparison::NIN:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a strong opinion about the order of the cases. I think NIN should come right after IN.

Personally, I'd stick to the same order for all the cases as the consts in Comparison are defined in.

Copy link
Member

@thomascorthals thomascorthals left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good as a whole. My remarks are mostly about readability rather than functionality.

Comment on lines +24 to +34
/**
* @param null $x
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function andX($x = null): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_AND, \func_get_args());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ... operator makes this typehintable and thus more readable.

Suggested change
/**
* @param null $x
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function andX($x = null): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_AND, \func_get_args());
}
/**
* @param Comparison ...$comparison
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function andX(Comparison ...$comparison): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_AND, $comparison);
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've been working quite a bit with the FunctionBuilder recently and one of the benefits of not typehinting the arguments is that you can pass a string as argument for the edge cases you run into.

for instance creating this function filter(mult(price,quantity),or(cats:equal(category,_))) would translate into

$expression = FunctionBuilder::create()
            ->where($expr->filter($expr->mult('price', 'quantity'), $expr->or('cats:equal(category,_)')))
        ;

as $expr->equals would render a different expression, it cannot be used to generate the desired result.

as this is a helper class, i'd personally prefer to make it not too strict; what do you think?

Comment on lines +36 to +46
/**
* @param null $x
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function orX($x = null): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_OR, \func_get_args());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/**
* @param null $x
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function orX($x = null): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_OR, \func_get_args());
}
/**
* @param Comparison ...$comparison
*
* @throws \Solarium\Exception\RuntimeException
*
* @return \Solarium\Builder\CompositeComparison
*/
public function orX(Comparison ...$comparison): CompositeComparison
{
return new CompositeComparison(CompositeComparison::TYPE_OR, $comparison);
}

@thomascorthals
Copy link
Member

Sorry for messing that up as multiple comments instead of a single review!

@wickedOne
Copy link
Collaborator Author

@thomascorthals thanx for your review. i do agree with all of your suggestions.

as for the usage of constants: initially this was written to be a completely seperate component outside of the Solarium namespace thus no dependecies and it can be considered a leftover.

nevertheless i do prefer to have my constants defined in the namespace handling them for clarity.
but i'm fine either way...

@thomascorthals
Copy link
Member

@wickedOne On the other hand, I don't think a hardcoded AND and OR will get us into trouble with Solr. If they change that syntax, we're probably in for one hell of a ride.

Copy link
Member

@mkalkbrenner mkalkbrenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to mention that this query builder must only be used in combination with the standard query parser (lucene) or edismax.

The default operator is "OR". The resulting string will be shorter if we use a single space as glue instead of " OR ".
And I prefer to do +a:1 +b:2instead of a:1 AND b:1.
This way you can reduce the length of the query string. This way you could avoid switching to POST requests more often and keep GET. For POST requests some of Solr's caches are disabled.

In general I consider the query builder to be useful. But we need to add documentation. Having a query with much boolean operators is slow. In most cases it makes a lot of sense to split the query in multiple filter queries which could be cached individually. So the examples should consist of code where we use this query builder to create a query in combination with multiple filter queries.

$filter = QueryBuilder::create()
->where(QueryBuilder::expr()->notIn('foo', 'bar'));

$this->assertSame('-foo:"bar"', $this->visitor->dispatch($filter->getExpressions()[0]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most cases you want *:* -foo:"bar"here.

@wickedOne
Copy link
Collaborator Author

We need to mention that this query builder must only be used in combination with the standard query parser (lucene) or edismax.

i think there's quite a few query parsers which also work with the standard query language?

The default operator is "OR". The resulting string will be shorter if we use a single space as glue instead of " OR ".
And I prefer to do +a:1 +b:2instead of a:1 AND b:1.
This way you can reduce the length of the query string. This way you could avoid switching to POST requests more often and keep GET. For POST requests some of Solr's caches are disabled.

fine with me, but are you sure that also works with the Lucene, Payload Score or Surround Query Parser to name a few?
perhaps better to instantiate the builder with the operators you want?

In general I consider the query builder to be useful. But we need to add documentation. Having a query with much boolean operators is slow. In most cases it makes a lot of sense to split the query in multiple filter queries which could be cached individually. So the examples should consist of code where we use this query builder to create a query in combination with multiple filter queries.

you think this is the responsibility of the solarium library? if so, we might want to consider a "best practices" section in the documentation (which i think would be very helpful) as these considerations are not limited to using a query builder

@mkalkbrenner
Copy link
Member

We need to mention that this query builder must only be used in combination with the standard query parser (lucene) or edismax.

i think there's quite a few query parsers which also work with the standard query language?

Yes, but not all.

The default operator is "OR". The resulting string will be shorter if we use a single space as glue instead of " OR ". And I prefer to do +a:1 +b:2instead of a:1 AND b:1. This way you can reduce the length of the query string. This way you could avoid switching to POST requests more often and keep GET. For POST requests some of Solr's caches are disabled.

fine with me, but are you sure that also works with the Lucene, Payload Score or Surround Query Parser to name a few?
perhaps better to instantiate the builder with the operators you want?

It is the opposite. AND and OR are not accepted by all parsers. See https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html#the-standard-query-parser

The standard query parser supports all the Boolean operators listed in the table above. The DisMax query parser supports only + and -.

In general I consider the query builder to be useful. But we need to add documentation. Having a query with much boolean operators is slow. In most cases it makes a lot of sense to split the query in multiple filter queries which could be cached individually. So the examples should consist of code where we use this query builder to create a query in combination with multiple filter queries.

you think this is the responsibility of the solarium library? if so, we might want to consider a "best practices" section in the documentation (which i think would be very helpful) as these considerations are not limited to using a query builder

You're right.

@wickedOne
Copy link
Collaborator Author

would you agree to a solution which by default uses + & - as operators which can be turned into AND and OR by using a flag on instantiation?

@mkalkbrenner
Copy link
Member

Sure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants