Add index_prefix option to text fields #28222

Closed
wants to merge 5 commits into from

romseygeek
Contributor

This adds the ability to index term prefixes into a hidden subfield, so that prefix queries can run without MultiTermQuery rewrites. The subfield reuses the analysis chain of its parent text field and appends an EdgeNGramTokenFilter. It can be configured with minimum and maximum ngram lengths. Query terms whose length falls outside this min-max range fall back to a prefix query against the parent text field.

The mapping looks like this:

"my_text_field" : {
    "type" : "text",
    "analyzer" : "english",
    "index_prefix" : { "min_chars" : 1, "max_chars" : 10 }
}
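
To make the behaviour described above concrete, here is a minimal sketch in plain Java. It is an illustration only, not the code in this PR and not Lucene's EdgeNGramTokenFilter: the class `PrefixFieldSketch` and the helpers `edgeNGrams` and `route` are hypothetical names, the query is represented as a string rather than a real Query object, and lengths are counted in UTF-16 chars for brevity. The `._prefix` suffix is the one used in the work-in-progress snippet in this conversation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Elasticsearch or Lucene.
final class PrefixFieldSketch {

    // Emulates what an edge n-gram filter does to one analyzed token:
    // emit every leading substring whose length lies in [minChars, maxChars].
    static List<String> edgeNGrams(String token, int minChars, int maxChars) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxChars, token.length());
        for (int len = minChars; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Mirrors the length check described above: an in-range prefix can be
    // answered by an exact term lookup on the hidden subfield, anything else
    // falls back to a regular prefix query on the parent field.
    // The returned string is only a readable stand-in for a query object.
    static String route(String field, String prefix, int minChars, int maxChars) {
        int len = prefix.length();
        if (len < minChars || len > maxChars) {
            return "prefix(" + field + ", " + prefix + ")";
        }
        return "term(" + field + "._prefix, " + prefix + ")";
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("quick", 1, 3));             // [q, qu, qui]
        System.out.println(route("my_text_field", "qui", 1, 3));   // term(my_text_field._prefix, qui)
        System.out.println(route("my_text_field", "quic", 1, 3));  // prefix(my_text_field, quic)
    }
}
```

The trade-off this illustrates: a larger `max_chars` lets more prefix lengths be served by a single term lookup, at the cost of indexing more terms per token.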

@romseygeek added the >enhancement, :Search/Search, v7.0.0, and v6.3.0 labels Jan 15, 2018
@romseygeek self-assigned this Jan 15, 2018
@romseygeek requested review from jpountz and jimczi January 15, 2018 14:05
@romseygeek
Contributor Author

This is still a work-in-progress, and needs more comprehensive tests + docs, but I'd like to get some feedback on whether or not this is a sensible implementation.

if (prefixAnalyzer == null || prefixAnalyzer.accept(value.length()) == false) {
    return super.prefixQuery(value, method, context);
}
TermQuery q = new TermQuery(new Term(name() + "._prefix", indexedValueForSearch(value)));
Member

I don't think anything prevents a user from creating an explicit field with the same name?

Contributor Author

Not yet, no. Do we have a way of reserving field names elsewhere?

@jpountz
Contributor

jpountz commented Jan 16, 2018

I think you are on the right track.

@rjernst raises a good point that there could be conflicts if a user configures a multi-field that also has _prefix as a name.

> Do we have a way of reserving field names elsewhere?

I don't think we do. We only reserve fields that start with _ on the top level. I think the only restriction that we put on inner levels is that fields cannot contain a dot. Thinking out loud: would calling the field ${field_name}..prefix be a viable option? Such a field name should be illegal for regular fields.
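
The naming concern can be sketched as follows. This is a hedged illustration that takes the rule stated in the comment above at face value (inner field names cannot contain a dot); `SubFieldNameSketch`, `isLegalUserSubField`, and `fullPath` are hypothetical names and do not reflect Elasticsearch's actual mapping validation.

```java
// Hypothetical illustration class; not Elasticsearch's real validation logic.
final class SubFieldNameSketch {

    // Assumed rule from the discussion: a user-defined inner field name
    // must be non-empty and must not contain a dot.
    static boolean isLegalUserSubField(String name) {
        return !name.isEmpty() && !name.contains(".");
    }

    // Full path of a subfield: parent name, a dot, then the subfield name.
    static String fullPath(String parent, String subField) {
        return parent + "." + subField;
    }

    public static void main(String[] args) {
        // A user multi-field named "_prefix" passes the assumed rule, so its
        // path is identical to the hidden field's path: a collision.
        System.out.println(isLegalUserSubField("_prefix"));  // true
        System.out.println(fullPath("title", "_prefix"));    // title._prefix

        // A subfield named ".prefix" fails the assumed rule, so no user
        // mapping could produce the path "title..prefix".
        System.out.println(isLegalUserSubField(".prefix"));  // false
        System.out.println(fullPath("title", ".prefix"));    // title..prefix
    }
}
```

Under that assumption, a double-dot name works as a reserved namespace without needing a separate reservation mechanism.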

@romseygeek
Contributor Author

Closing in favour of #28290

@romseygeek closed this Jan 29, 2018
@romseygeek deleted the topic/27049-prefix-index-field branch January 29, 2018 10:01
@jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019