@jimczi jimczi commented Mar 1, 2018

This change ensures that we ignore terms removed by the analysis rather than returning a `match_no_docs` query for the part
of the query that contains the stop word. For instance, a query like "the AND fox" should ignore "the" if it is considered a stop word, instead of
adding a `match_no_docs` query.
This change also fixes the analysis of prefix terms that start with a stop word (e.g. `the*`). Previously, if `analyze_wildcard` was true and `the`
was considered a stop word, this part of the query was rewritten into a `match_no_docs` query. Since it is a prefix query, this change forces the prefix query
on `the` even if it is removed by the analysis.
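
To make the intended behavior concrete, here is a rough toy model in Python. It is not the Elasticsearch implementation: the stop list, the `analyze` helper, the `build_clauses` function and the `term(...)` / `prefix(...)` strings are all hypothetical names made up for illustration, and only the `AND` operator is handled.

```python
from typing import List, Optional

# Assumption: a tiny stop list standing in for a real stop-word analyzer.
STOP_WORDS = {"the", "a", "an"}


def analyze(term: str) -> Optional[str]:
    """Toy analyzer: lowercase the term and return None if it is a stop word."""
    token = term.lower()
    return None if token in STOP_WORDS else token


def build_clauses(query: str) -> List[str]:
    """Turn a whitespace-separated 'x AND y' query into a list of required clauses.

    Hypothetical sketch of the behavior described above:
    - a plain term removed by analysis is ignored (no match_no_docs clause);
    - a prefix term (ending in '*') is kept as a prefix clause even when
      analysis would remove it.
    """
    clauses = []
    for part in query.split():
        if part == "AND":
            continue
        if part.endswith("*"):
            raw_prefix = part[:-1]
            analyzed = analyze(raw_prefix)
            # Force the prefix clause: fall back to the lowercased raw prefix
            # when the analyzer dropped it as a stop word.
            kept = analyzed if analyzed is not None else raw_prefix.lower()
            clauses.append(f"prefix({kept})")
        else:
            analyzed = analyze(part)
            if analyzed is None:
                # Removed by analysis: skip it instead of adding match_no_docs.
                continue
            clauses.append(f"term({analyzed})")
    return clauses
```

With this model, `build_clauses("the AND fox")` yields only the clause for `fox`, while `build_clauses("the* AND fox")` keeps a prefix clause on `the` alongside it.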

Fixes #28855
Fixes #28856

@jimczi jimczi added >bug :Search/Search Search-related issues that do not fall into other categories v7.0.0 v6.3.0 labels Mar 1, 2018

@colings86 colings86 left a comment

LGTM

@jimczi jimczi changed the title from "Fix query_string and simple_query_string to ignore removed terms" to "Fix (simple)_query_string to ignore removed terms" Mar 1, 2018
@jimczi jimczi merged commit c26bd60 into elastic:master Mar 4, 2018
@jimczi jimczi deleted the bugs/zero_term_query_stop_words branch March 4, 2018 21:25
jimczi added a commit that referenced this pull request Mar 4, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Mar 7, 2018
* master:
  [TEST] AwaitsFix QueryRescorerIT.testRescoreAfterCollapse
  Decouple XContentType from StreamInput/Output (elastic#28927)
  Remove BytesRef usage from XContentParser and its subclasses (elastic#28792)
  [DOCS] Correct typo in configuration (elastic#28903)
  Fix incorrect datemath example (elastic#28904)
  Add a usage example of the JLH score (elastic#28905)
  Wrap stream passed to createParser in try-with-resources (elastic#28897)
  Rescore collapsed documents (elastic#28521)
  Fix (simple)_query_string to ignore removed terms (elastic#28871)
  [Docs] Fix typo in composite aggregation (elastic#28891)
  Try if tombstone is eligable for pruning before locking on it's key (elastic#28767)
martijnvg added a commit that referenced this pull request Mar 8, 2018
* es/master: (48 commits)
  Update bucket-sort-aggregation.asciidoc (#28937)
  [Docs] REST high-level client: Fix code for most basic search request (#28916)
  Improved percolator's random candidate query duel test and fixed bugs that were exposed by this:
  Revert "Rescore collapsed documents (#28521)"
  Build: Fix test logger NPE when no tests are run (#28929)
  [TEST] AwaitsFix QueryRescorerIT.testRescoreAfterCollapse
  Decouple XContentType from StreamInput/Output (#28927)
  Remove BytesRef usage from XContentParser and its subclasses (#28792)
  [DOCS] Correct typo in configuration (#28903)
  Fix incorrect datemath example (#28904)
  Add a usage example of the JLH score (#28905)
  Wrap stream passed to createParser in try-with-resources (#28897)
  Rescore collapsed documents (#28521)
  Fix (simple)_query_string to ignore removed terms (#28871)
  [Docs] Fix typo in composite aggregation (#28891)
  Try if tombstone is eligable for pruning before locking on it's key (#28767)
  Limit analyzed text for highlighting (improvements) (#28808)
  Missing `timeout` parameter from the REST API spec JSON files (#28328)
  Clarifies how query_string splits textual part (#28798)
  Update outdated java version reference (#28870)
  ...
martijnvg added a commit that referenced this pull request Mar 8, 2018
* es/6.x: (48 commits)
  Update bucket-sort-aggregation.asciidoc (#28937)
  [Docs] REST high-level client: Fix code for most basic search request (#28916)
  Improved percolator's random candidate query duel test and fixed bugs that were exposed by this:
  Revert "Rescore collapsed documents (#28521)"
  Build: Fix test logger NPE when no tests are run (#28929)
  [TEST] AwaitsFix QueryRescorerIT.testRescoreAfterCollapse
  Decouple XContentType from StreamInput/Output (#28927)
  Remove BytesRef usage from XContentParser and its subclasses (#28792)
  Add doc note for -server flag on Windows service
  [DOCS] Correct typo in configuration (#28903)
  Fix incorrect datemath example (#28904)
  Add a usage example of the JLH score (#28905)
  Limit analyzed text for highlighting (improvements) (#28907)
  Wrap stream passed to createParser in try-with-resources (#28897)
  [Docs] Fix typo in composite aggregation (#28891)
  Rescore collapsed documents (#28521)
  Fix (simple)_query_string to ignore removed terms (#28871)
  Missing `timeout` parameter from the REST API spec JSON files (#28328)
  Clarifies how query_string splits textual part (#28798)
  Update outdated java version reference (#28870)
  ...
sebasjm pushed a commit to sebasjm/elasticsearch that referenced this pull request Mar 10, 2018
Labels

>bug :Search/Search Search-related issues that do not fall into other categories v6.3.0 v7.0.0-beta1

Development

Successfully merging this pull request may close these issues.

MatchNoDocsQuery from stop words with wildcards in query_string
Stop-words not removed during simple-query query-time analysis