Use exponential scoring and consistent scale values for focus.point #1209

Merged
merged 3 commits into from
Nov 8, 2018

orangejulius
Member

@orangejulius orangejulius commented Oct 15, 2018

The scale values for the center_point query that Elasticsearch uses to score our focus.point queries were inconsistent between search and autocomplete: 50km for search, 250km for autocomplete.

This PR changes autocomplete to 50km, to be consistent with search. It's tough to judge which is the better value, but here's my reasoning:

  • A smaller scale means scores drop off faster, so only very close records would have a high enough distance score to outweigh a faraway record with high importance or population.
  • We have tested the search endpoint much more in general, so this 50km value might be better tested.
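
To make the effect of scale concrete, here is a minimal sketch of an Elasticsearch-style exponential decay curve evaluated at both scale values. It assumes a decay of 0.5 (the Elasticsearch default) and an offset of 0; the function name `exp_decay_score` is made up for illustration and is not part of Pelias.

```python
import math

DECAY = 0.5  # assumption: Elasticsearch's default decay value


def exp_decay_score(distance_km: float, scale_km: float, decay: float = DECAY) -> float:
    """Exponential decay score with offset 0.

    The score equals `decay` when the distance equals the scale.
    """
    lam = math.log(decay) / scale_km
    return math.exp(lam * max(0.0, distance_km))


# Compare how quickly the score falls off for the two scale values.
for distance_km in (10, 50, 100, 250):
    s50 = exp_decay_score(distance_km, 50)
    s250 = exp_decay_score(distance_km, 250)
    print(f"{distance_km:>4} km  scale=50km: {s50:.3f}  scale=250km: {s250:.3f}")
```

Under these assumptions, a record 100km from the focus point scores 0.25 with a 50km scale but about 0.76 with a 250km scale, so the smaller scale leaves less room for a distant record to win on distance alone.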

Additionally, this PR changes the decay function from linear to exponential. I don't recall why we settled on linear, as it was very long ago, but I suspect it might have been an attempt to prevent very far away records from being scored at all. In that case, our understanding of Elasticsearch at the time was incorrect.

By using exponential scoring, records at any distance will receive a non-zero score for the distance query. This means we can differentiate between otherwise identically-scoring records that are very far away from the focus point. This is a huge help when searching for administrative areas like localities, as well as postal codes.
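
The difference between the two decay shapes can be sketched as follows. This is an illustration based on the decay formulas in the Elasticsearch documentation, assuming a decay of 0.5 and an offset of 0; the function names and the example distances are hypothetical.

```python
import math


def linear_decay_score(distance_km: float, scale_km: float, decay: float = 0.5) -> float:
    """Linear decay: reaches exactly 0 at scale / (1 - decay) and stays there."""
    s = scale_km / (1.0 - decay)
    return max((s - max(0.0, distance_km)) / s, 0.0)


def exp_decay_score(distance_km: float, scale_km: float, decay: float = 0.5) -> float:
    """Exponential decay: approaches 0 but stays positive."""
    return math.exp(math.log(decay) / scale_km * max(0.0, distance_km))


# Two otherwise identical records, both far from the focus point.
records = [("record_a", 2000.0), ("record_b", 500.0)]

# With linear decay and a 50km scale, both scores are 0, so distance cannot break the tie.
linear_scores = {name: linear_decay_score(d, 50) for name, d in records}

# With exponential decay the scores are tiny but distinct, so the nearer record sorts first.
by_exp = sorted(records, key=lambda r: exp_decay_score(r[1], 50), reverse=True)

print(linear_scores)
print([name for name, _ in by_exp])
```

With a 50km scale and a decay of 0.5, the linear score hits zero at 100km, which is why distant records could not be told apart before this change.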

Looking at the acceptance tests, I could not find any cases where these changes cause a failure. However, some newly added acceptance tests now pass.

connects #1206

@orangejulius orangejulius changed the title Use exponential scoring and consistent scale values for focus.point Use exponential scoring and consistent scale values for focus.point Oct 15, 2018
@missinglink
Member

I think this is great. My only concern is a potential performance regression; however, the other PRs we discussed today regarding performance will probably more than make up for it.

@orangejulius
Member Author

orangejulius commented Oct 29, 2018

Our testing shows a small performance hit from this change: one query's average time went from around 700ms to around 800ms. Since some of our shorter text length autocomplete queries are already quite slow, we're going to look at some performance improvements like #1219 or #1215 before merging this.

@orangejulius orangejulius force-pushed the reset-focus-point-settings branch from dc890df to a9da289 Compare October 29, 2018 12:08
@orangejulius
Member Author

We made great improvements to slow autocomplete queries in #1219, so I think this can be merged now.

@orangejulius orangejulius self-assigned this Nov 5, 2018
@orangejulius orangejulius force-pushed the reset-focus-point-settings branch 2 times, most recently from 88a981a to 982221f Compare November 7, 2018 14:57
Linear scoring, by design, gives all records the same score past a
certain point.

This has the disadvantage that identical records that are very far away
cannot be sorted by distance.

By using exponential scoring, we can achieve decent sorting of even very
far away records. This is very helpful for cities and postalcodes.

Connects #1206
The `scale` parameter controls how quickly scores decrease from the
maximum as the distance from the `center_point` to the record in
question increases.

Set this to 50km, which is the same as search.

Connects #1206
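
As a rough illustration of what the two commits above amount to, the sketch below builds an Elasticsearch `function_score` fragment that applies an `exp` decay function to `center_point` with a 50km scale. The helper name `focus_point_function`, the example coordinates, and the `offset` and `decay` values are assumptions for illustration; this is not the query Pelias actually generates.

```python
import json


def focus_point_function(lat: float, lon: float, scale: str = "50km") -> dict:
    """Build an exponential decay function clause centered on the focus point."""
    return {
        "exp": {
            "center_point": {
                "origin": {"lat": lat, "lon": lon},
                "scale": scale,    # from this PR: 50km for both search and autocomplete
                "offset": "0km",   # assumption: no flat region around the origin
                "decay": 0.5,      # assumption: Elasticsearch's default decay
            }
        }
    }


# Hypothetical focus point; a real request would take these from focus.point.lat/lon.
query = {
    "function_score": {
        "query": {"match_all": {}},
        "functions": [focus_point_function(40.7, -74.0)],
    }
}

print(json.dumps(query, indent=2))
```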
@orangejulius orangejulius force-pushed the reset-focus-point-settings branch from 982221f to 6d9e511 Compare November 8, 2018 16:21
@orangejulius orangejulius merged commit 68861d4 into master Nov 8, 2018
@orangejulius orangejulius deleted the reset-focus-point-settings branch November 8, 2018 17:38
orangejulius added a commit to pelias/acceptance-tests that referenced this pull request Nov 8, 2018
2 participants