query_string and simple_query_string cannot apply minimum should match consistently #23966

@jimczi

We apply the requested minimum_should_match to the parsed queries produced by query_string and simple_query_string. Since these queries can target multiple fields, we used to rely on the disableCoord property of the Boolean query. Now that coords are gone in Lucene 7, we can no longer distinguish queries with multiple positions, like:

"query": {
	"simple_query_string": {
		"fields": ["field1", "field2"],
		"query": "foo bar",
		"minimum_should_match": 1
	}
}

... from single-position queries on multiple fields, like:

"query": {
	"simple_query_string": {
		"fields": ["field1", "field2", "field3"],
		"query": "foo",
		"minimum_should_match": 2
	}
}

In 5.x we were able to detect that the latter query produces a single position and that minimum_should_match should not be applied.

In 6.0 this query would be rewritten to:
foo must appear in 2 out of the 3 provided fields
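
To make the difference concrete, here is a toy model in Python. It is not the Elasticsearch or Lucene code: the `Term` and `Bool` classes, the `coord_disabled` flag (a stand-in for the old disableCoord property) and the `keep_coord_flag` parameter are all hypothetical names used only for illustration, and the parser is reduced to splitting on whitespace.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Term:
    field: str
    text: str


@dataclass
class Bool:
    should: List[Union["Term", "Bool"]]
    coord_disabled: bool = False  # hypothetical stand-in for the 5.x disableCoord flag
    minimum_should_match: int = 0


def parse(fields, text, keep_coord_flag):
    """Toy parser: build one per-field group for each whitespace-separated term."""
    groups = [
        Bool([Term(f, t) for f in fields], coord_disabled=keep_coord_flag)
        for t in text.split()
    ]
    # A single-term query collapses to its per-field group.
    return groups[0] if len(groups) == 1 else Bool(groups)


def apply_msm(query, msm):
    """Apply msm only to boolean queries that are not flagged as a per-field group."""
    if isinstance(query, Bool) and not query.coord_disabled:
        query.minimum_should_match = msm
    return query


def matches(query, doc):
    """Evaluate the toy query against a dict of field name -> text."""
    if isinstance(query, Term):
        return query.text in doc.get(query.field, "").split()
    hits = sum(matches(clause, doc) for clause in query.should)
    return hits >= max(1, query.minimum_should_match)


fields = ["field1", "field2", "field3"]
doc = {"field1": "foo"}

# With the flag (5.x-like), the per-field group is recognised and msm is skipped.
with_flag = apply_msm(parse(fields, "foo", keep_coord_flag=True), 2)
# Without the flag (6.0-like), the group looks like any other boolean query.
without_flag = apply_msm(parse(fields, "foo", keep_coord_flag=False), 2)

print(matches(with_flag, doc), matches(without_flag, doc))
```

In this sketch the document containing foo only in field1 matches when the flag is available and stops matching once it is gone, because minimum_should_match: 2 is then counted across fields instead of across positions.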

This is problematic, but I can't think of a good fix for it.
Furthermore, applying minimum_should_match on query_string and simple_query_string is difficult for users to understand, because the result depends on settings like split_on_whitespace, tokenization, and so on. So I wonder if we should just remove support for this option in these query parsers.
The match and multi_match queries are not impacted, since we apply minimum_should_match per analyzer group.

Labels: :Search/Search, blocker
