Simplify query-time analysis #370

Merged
merged 2 commits into from
Jun 19, 2019


missinglink
Member

@missinglink missinglink commented Jun 10, 2019

We have two 'query-time' analyzers right now:

  • peliasQueryPartialToken
  • peliasQueryFullToken

The two analyzers differ only in how they handle synonym substitution.

Some time ago we changed how synonyms are applied in the schema: instead of 'expanding' or 'contracting' tokens as we used to, we now do pure synonym substitution, so any variant of a token will match.

This means query-time synonym generation is no longer required, since all variants are already available in the index.
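
To illustrate the reasoning above, here is a toy model in plain Python (not the actual schema code, and not Elasticsearch itself). The synonym groups and helper names are made up for the example; the point is only that once every variant is stored at index time, the query side can skip synonyms entirely and still match.

```python
# Hypothetical synonym groups; the real lists live in the schema's synonym files.
SYNONYM_GROUPS = [{"street", "st"}, {"avenue", "ave"}]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def index_analyze(text):
    """Index-time analysis: store every synonym variant of each token."""
    indexed = set()
    for token in tokenize(text):
        indexed.add(token)
        for group in SYNONYM_GROUPS:
            if token in group:
                indexed |= group
    return indexed


def query_analyze(text):
    """Query-time analysis: tokenize only, with no synonym step."""
    return tokenize(text)


def matches(query, document):
    """A document matches when every query token is present in its indexed tokens."""
    indexed = index_analyze(document)
    return all(token in indexed for token in query_analyze(query))
```

With this model, a query for "main st" matches a document indexed as "Main Street" and vice versa, even though the query analyzer never consults the synonym groups.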

I started doing the work to remove the synonyms filter and realised quickly that peliasQueryPartialToken and peliasQueryFullToken would now do exactly the same thing, albeit under a different name.

I would like to remove both of those analyzers: they are confusing and serve no functional purpose, and in their current form they also have a negative performance impact.

Removing or renaming the analyzers would be a breaking change for users who are using the old analyzer names in their defaults configs.

So this PR adds a new analyzer (ironic, I know!) which we can migrate the API code to use; later, after a courtesy window, we can remove the old analyzers.
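
One way to picture the courtesy window is as analysis settings in which the old names stay defined alongside the new one until they are dropped. This is only a sketch: `exampleQueryAnalyzer` is a placeholder name (the actual name is not stated here), the filter chain is a stand-in built from standard Elasticsearch components rather than the real one, and whether the old names share the new definition is an assumption.

```python
import copy

# Placeholder definition; the real tokenizer and filter chain in the schema differ.
BASE_QUERY_ANALYZER = {
    "type": "custom",
    "tokenizer": "standard",
    "filter": ["lowercase", "asciifolding"],
}

# Placeholder name for the new analyzer (an assumption for this example).
NEW_ANALYZER = "exampleQueryAnalyzer"

# The two old names from this PR, kept only during the courtesy window.
DEPRECATED_ANALYZERS = ("peliasQueryPartialToken", "peliasQueryFullToken")


def build_analyzers(include_deprecated=True):
    """Return the 'analyzer' section of the index analysis settings."""
    analyzers = {NEW_ANALYZER: copy.deepcopy(BASE_QUERY_ANALYZER)}
    if include_deprecated:
        for old_name in DEPRECATED_ANALYZERS:
            # Old names keep working so existing configs do not break.
            analyzers[old_name] = copy.deepcopy(BASE_QUERY_ANALYZER)
    return analyzers
```

Calling `build_analyzers(include_deprecated=False)` models the state after the courtesy window, when only the new name remains.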

Note: I wouldn't merge this until after the ampersand PR is merged, because that is the last synonyms file that still does expansion/contraction.

@missinglink
Member Author

It might also be nice to set the search-analyzer for each field; this would remove the need to specify one at query time.
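
As a rough sketch of that idea: Elasticsearch field mappings accept a `search_analyzer` parameter next to `analyzer`, so a mapping could carry the query-time analyzer itself. The field name, the analyzer names, and the `text` field type below are placeholders chosen for the example, not the actual schema.

```python
import json

# Placeholder analyzer names (assumptions for this example).
INDEX_ANALYZER = "exampleIndexAnalyzer"
QUERY_ANALYZER = "exampleQueryAnalyzer"


def text_field(index_analyzer=INDEX_ANALYZER, search_analyzer=QUERY_ANALYZER):
    """Build a field mapping that declares its own query-time analyzer."""
    return {
        "type": "text",
        "analyzer": index_analyzer,          # applied when documents are indexed
        "search_analyzer": search_analyzer,  # applied to query strings by default
    }


# Hypothetical mapping with a single field.
mapping = {"properties": {"name": text_field()}}

if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
```

With the analyzer declared in the mapping, a query against `name` would not need to pass an explicit analyzer, although a query can still override it where needed.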

@missinglink
Member Author

missinglink commented Jun 14, 2019

Now that #369 has been merged we can also remove the ampersand substitution at query time.

Edit: actually no changes are required, I already removed that token filter.

@missinglink missinglink merged commit e97a012 into master Jun 19, 2019
@Joxit Joxit deleted the peliasQueryAnalyzer branch June 21, 2019 15:38
orangejulius added a commit to pelias/acceptance-tests that referenced this pull request Aug 5, 2019