[BUG] Wildcard sort error when doc_value enabled and size less than total docs #18461

@HUSTERGS

Describe the bug

The current wildcard field implementation uses SortedSet as its doc value type, which triggers dynamic pruning in Lucene when the query contains a sort and the requested size is less than the total number of docs.

Specifically, Lucene expects the term dictionary to also contain the doc values, but this is not true for the wildcard field: it indexes N-gram terms while storing the full value in doc values. This mismatch causes an IllegalStateException.
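To make the mismatch concrete, here is a minimal sketch in plain Python. It is a toy model, not Lucene or OpenSearch code: the 3-gram split, the `ngrams` and `seek_exact` helpers, and the sample values are all assumptions chosen for illustration. It only shows why looking up a full doc value in a term dictionary built from N-grams cannot succeed.

```python
def ngrams(value, n=3):
    """Split a value into overlapping n-grams (a stand-in for the wildcard tokenizer)."""
    if len(value) <= n:
        return {value}
    return {value[i:i + n] for i in range(len(value) - n + 1)}


def seek_exact(term_dict, term):
    """Mimic a term dictionary lookup that the pruning path assumes will succeed."""
    if term not in term_dict:
        raise RuntimeError(f"term {term!r} is in doc values but not in the term dictionary")
    return term


values = ["opensearch", "lucene"]                       # hypothetical field values
term_dict = set().union(*(ngrams(v) for v in values))   # indexed terms: n-grams only
doc_values = sorted(values)                             # doc values: full strings

try:
    seek_exact(term_dict, doc_values[0])
    failed = False
except RuntimeError:
    failed = True  # the full value was never indexed as a term
```

In this model the lookup fails for every value longer than the gram size, which mirrors the situation described above.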

https://github.com/apache/lucene/blob/485141dd34ea866ad9dc59843770969d1b0c8fa2/lucene/core/src/java/org/apache/lucene/search/comparators/TermOrdValComparator.java#L569-L572

Related component

Search

To Reproduce

I added a YAML test case in my fork: HUSTERGS@8c7cb9b
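The linked commit holds the actual test case. As a rough sketch of the scenario in the title, the request bodies might look like the following Python dictionaries. The index name, field name, documents, and the match_all query are hypothetical; only the combination of a wildcard field with doc values enabled, a sort on that field, and a size smaller than the number of documents comes from this report.

```python
INDEX = "wildcard-sort-test"  # hypothetical index name

# Mapping: a wildcard field with doc values enabled (field name is an assumption).
mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "wildcard", "doc_values": True}
        }
    }
}

# A handful of documents so that the total doc count exceeds the requested size.
docs = [{"name": f"value-{i}"} for i in range(5)]

# Search: sort on the wildcard field and ask for fewer hits than there are docs.
search_body = {
    "size": 3,
    "query": {"match_all": {}},
    "sort": [{"name": {"order": "asc"}}],
}
```

These bodies would be sent with any HTTP client to the index creation, indexing, and search endpoints; the sketch stops short of sending them.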

Expected behavior

Sorting on a wildcard field should work correctly.

Additional Details

Additional context
I'm thinking about backward compatibility. The simplest fix is to change SortedSetDocValue to BinaryDocValue, but data that has already been written might have problems when OpenSearch is upgraded; a version check could handle this. Another option is to explicitly add the full value to the term dictionary in WildcardFieldTokenizer, which may also address the problem.
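The two options can be sketched in the same toy style. Nothing here is OpenSearch code: `FIX_VERSION`, `doc_values_type`, and `tokenize_with_full_value` are hypothetical names, the version tuple is a placeholder, and the 3-gram split is an assumption.

```python
FIX_VERSION = (9, 9, 9)  # placeholder; the real cutover version is not stated in this issue


def doc_values_type(index_created_version, fix_version=FIX_VERSION):
    """Option 1: use binary doc values only for indices created at or after the fix.

    Older indices keep sorted-set doc values so already written data stays readable.
    """
    return "binary" if index_created_version >= fix_version else "sorted_set"


def ngrams(value, n=3):
    """Split a value into overlapping n-grams (a stand-in for the wildcard tokenizer)."""
    if len(value) <= n:
        return {value}
    return {value[i:i + n] for i in range(len(value) - n + 1)}


def tokenize_with_full_value(value, n=3):
    """Option 2: emit the n-grams plus the full value, so doc values resolve as terms."""
    return ngrams(value, n) | {value}


old_index = doc_values_type((1, 0, 0))
new_index = doc_values_type(FIX_VERSION)
terms = tokenize_with_full_value("lucene")
```

Option 1 avoids the term dictionary lookup for new indices, while option 2 keeps the doc value type and makes the lookup succeed at the cost of one extra term per value.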

Labels

Search, bug, lucene, v3.3.0

Projects

Status

✅ Done
