Broken sort on multiple-level nested documents #32130
Comments
Pinging @elastic/es-search-aggs
@JulienColin thanks for raising this here, and thanks for the great reproduction. I was able to see similar behaviour locally on 6.3.0.
What is indeed odd is that in the case of bulk indexing, the sort value for document "2" seems to get picked up from the smallest "strength" value in document "1". If I e.g. change this to
This might be related to the problem under discussion in #31554. Not quite the same (no missing fields), but similar symptoms: wrong sort values getting picked up.
Hello, thank you @cbuescher for your help, and @polyfractal for pointing out the similarities.
The parent filter for nested sort should always match **all** parents regardless of the child queries. It is used to find the boundaries of a single parent and we use the child query to match all the filters set in the nested tree so there is no need to repeat the nested filters. With this change we ensure that we build bitset filters only to find the root docs (or the docs at the level where the sort applies) that can be reused among queries. Closes #31554 Closes #32130 Closes #31783 Co-authored-by: Dominic Bevacqua <bev@treatwell.com>
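To illustrate why the parent filter has to match all parents, here is a minimal Python sketch of the boundary lookup described above. It is a simplified model and not Elasticsearch or Lucene code: it assumes nested children are stored immediately before their parent in the same segment, it flattens the example to two levels, and the names `SEGMENT`, `min_child_value`, `all_parents` and `only_matching` are hypothetical.

```python
from typing import Optional, Set

# Simplified model of one segment: each nested child sits right before its
# parent, so a parent owns every doc between the previous parent and itself.
# Each entry is (kind, family_id, strength); strength is None for parents.
SEGMENT = [
    ("child", "1", 10), ("child", "1", 15), ("child", "1", 20), ("parent", "1", None),
    ("child", "2", 30), ("child", "2", 40), ("child", "2", 60), ("parent", "2", None),
]


def min_child_value(parent_pos: int, parent_bits: Set[int]) -> Optional[int]:
    """Smallest strength among the docs between the previous set parent bit
    and parent_pos (the children this model attributes to that parent)."""
    prev = max((p for p in parent_bits if p < parent_pos), default=-1)
    values = [
        SEGMENT[i][2]
        for i in range(prev + 1, parent_pos)
        if SEGMENT[i][0] == "child"
    ]
    return min(values) if values else None


# Bitset that marks every parent in the segment (positions 3 and 7).
all_parents = {i for i, doc in enumerate(SEGMENT) if doc[0] == "parent"}

# Hypothetical narrowed bitset: only the parent that matches the query.
only_matching = {7}

print(min_child_value(7, all_parents))    # 30
print(min_child_value(7, only_matching))  # 10
```

With the narrowed bitset, the lookup for family "2" no longer sees the boundary at family "1", so the children of family "1" leak into its sort value. In this model that gives the same symptom as the report (10 instead of 30), though the actual code path in Elasticsearch may differ.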
ES 6.3.2 and below (including 6.2, 6.3.0 and 6.3.1) have a bug that prevents double-nested sorts from working: elastic/elasticsearch#32130. In our case, DistanceSearchSortBaseIT and FieldSearchSortBaseIT were failing with parameter IndexFieldLocation.IN_NESTED_TWICE.
Elasticsearch version : 6.3.1 and below
JVM version : 1.8.0_171
OS version : Ubuntu 16.04 LTS
Expected behaviour :
A correct sort. The family with id=2 should get a sort value of 30 in the example below.
Problem description :
The sort is wrong when querying a model with three levels of nested objects and sorting the parent objects on a field from the lowest level. In the example below, the family with id=2 gets a sort value of 10 while it should be 30 (the value 10 doesn't even appear in the document with id=2).
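As a cross-check of the expected value, the following Python sketch recomputes it by hand from the sample data used in the reproduction below. It assumes the sort takes the minimum matching value, which is the default mode for an ascending sort; the helper name `expected_sort_value` is hypothetical and nothing here calls Elasticsearch.

```python
from typing import Optional

# Family with id=2, copied from the bulk request in the reproduction steps.
family_2 = {
    "name": "Simpson",
    "members": [
        {"firstName": "Homer", "color": "brown", "levels": {"strength": 30}},
        {"firstName": "Lisa", "color": "brown", "levels": {"strength": 40}},
        {"firstName": "Marge", "color": "brown", "levels": {"strength": 60}},
    ],
}


def expected_sort_value(family: dict, color: str) -> Optional[int]:
    """Smallest strength among the members of the given color."""
    strengths = []
    for member in family["members"]:
        if member["color"] != color:
            continue
        levels = member["levels"]
        # A nested field may hold a single object or a list of objects.
        if isinstance(levels, dict):
            levels = [levels]
        strengths.extend(level["strength"] for level in levels)
    return min(strengths) if strengths else None


print(expected_sort_value(family_2, "brown"))  # 30
```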
Steps to reproduce:
Create index
PUT tree
{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 }
}
Put mapping
PUT tree/family/_mapping
{
  "properties": {
    "name": { "type": "keyword" },
    "members": {
      "type": "nested",
      "properties": {
        "firstname": { "type": "keyword" },
        "color": { "type": "keyword" },
        "levels": {
          "type": "nested",
          "properties": {
            "strength": { "type": "integer" }
          }
        }
      }
    }
  }
}
Insert data (bulk index API)
POST _bulk
{ "index" : { "_index" : "tree", "_type" : "family", "_id" : "1" } }
{"name":"Doe","members":[{"firstName":"John","color":"brown","levels":{"strength":10}},{"firstName":"Serge","color":"brown","levels":{"strength":15}},{"firstName":"Marie","color":"brown","levels":{"strength":20}}]}
{ "index" : { "_index" : "tree", "_type" : "family", "_id" : "2" } }
{"name":"Simpson","members":[{"firstName":"Homer","color":"brown","levels":{"strength":30}},{"firstName":"Lisa","color":"brown","levels":{"strength":40}},{"firstName":"Marge","color":"brown","levels":{"strength":60}}]}
{ "index" : { "_index" : "tree", "_type" : "family", "_id" : "3" } }
{"name":"Simpson","members":[{"firstName":"Bart","color":"yellow","levels":{"strength":70}},{"firstName":"Snowball","color":"yellow","levels":{"strength":80}},{"firstName":"Maggie","color":"yellow","levels":{"strength":90}},{"firstName":"Gandpa","color":"brown","levels":{"strength":95}}]}
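The bulk API expects newline-delimited JSON: one action line, then one source line, with a newline after the last line. For anyone scripting this reproduction, a small Python sketch for building such a body could look like the following. The helper name `bulk_body` is hypothetical, the sample documents are shortened, and sending the request is left out.

```python
import json


def bulk_body(index: str, doc_type: str, docs: dict) -> str:
    """Build a newline-delimited bulk body from a mapping of id -> source."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: tells the bulk API where the next source line goes.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(source))
    # The body has to end with a newline.
    return "\n".join(lines) + "\n"


# Shortened sample documents; the full ones are in the request above.
docs = {
    "1": {"name": "Doe", "members": [{"color": "brown", "levels": {"strength": 10}}]},
    "2": {"name": "Simpson", "members": [{"color": "brown", "levels": {"strength": 30}}]},
}

body = bulk_body("tree", "family", docs)
print(body, end="")
```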
Query
GET tree/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "name": { "value": "Simpson" } } },
        {
          "nested": {
            "path": "members",
            "query": {
              "bool": {
                "filter": [
                  { "term": { "members.color": { "value": "brown" } } }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "members.levels.strength": {
        "order": "asc",
        "nested": {
          "path": "members",
          "filter": { "term": { "members.color": { "value": "brown" } } },
          "nested": { "path": "members.levels" }
        }
      }
    }
  ]
}
Results
{
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "tree",
        "_type": "family",
        "_id": "2",
        "_score": null,
        "_source": {
          "name": "Simpson",
          "members": [
            { "firstName": "Homer", "color": "brown", "levels": { "strength": 30 } },
            { "firstName": "Lisa", "color": "brown", "levels": { "strength": 40 } },
            { "firstName": "Marge", "color": "brown", "levels": { "strength": 60 } }
          ]
        },
        "sort": [ 10 ]
      },
      ...
    ]
  }
}
Note that the result of the query above is correct if the documents are indexed with the index API instead of the bulk API, using the commands below :
POST tree/family
{"name":"Doe","members":[{"firstName":"John","color":"brown","levels":{"strength":10}},{"firstName":"Serge","color":"brown","levels":{"strength":15}},{"firstName":"Marie","color":"brown","levels":{"strength":20}}]}
POST tree/family
{"name":"Simpson","members":[{"firstName":"Homer","color":"brown","levels":{"strength":30}},{"firstName":"Lisa","color":"brown","levels":{"strength":40}},{"firstName":"Marge","color":"brown","levels":{"strength":60}}]}
POST tree/family
{"name":"Simpson","members":[{"firstName":"Bart","color":"yellow","levels":{"strength":70}},{"firstName":"Snowball","color":"yellow","levels":{"strength":80}},{"firstName":"Maggie","color":"yellow","levels":{"strength":90}},{"firstName":"Gandpa","color":"brown","levels":{"strength":95}}]}
See following discussion : https://discuss.elastic.co/t/issue-sorting-nested-documents-indexed-via-bulk/139164