Use exponential scoring and consistent scale values for `focus.point` #1209

orangejulius · 2018-10-15T14:58:43Z

The scale values for the `center_point` query, which Elasticsearch uses to score our `focus.point` queries, were inconsistent between search and autocomplete (50km for search, 250km for autocomplete).

This PR changes autocomplete to 50km, to be consistent with search. It's tough to judge which is the better value, but here's my reasoning:

- A smaller scale means scores drop off faster, so only very close records get a distance score high enough to outweigh a faraway record with high importance or population.
- We have tested the search endpoint much more in general, so the 50km value is probably better tested.
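To make the effect of `scale` concrete, here is a rough sketch that evaluates Elasticsearch's documented exponential decay formula at both scale values. The helper name `exp_decay` is made up for illustration, and `decay=0.5` (the Elasticsearch default) and a zero offset are assumptions, not values taken from the Pelias configuration.

```python
import math


def exp_decay(distance_km: float, scale_km: float, decay: float = 0.5) -> float:
    """Exponential decay as documented by Elasticsearch, assuming offset = 0.

    The score is 1.0 at the origin and equals `decay` at `scale_km`.
    """
    lam = math.log(decay) / scale_km
    return math.exp(lam * max(0.0, distance_km))


# Compare the old autocomplete scale (250km) with the search scale (50km).
for distance in (10, 50, 100, 250):
    print(distance, round(exp_decay(distance, 50), 3), round(exp_decay(distance, 250), 3))
```

Under these assumptions, a record 100km from the focus point scores 0.25 with a 50km scale but about 0.76 with a 250km scale, so the smaller scale leaves much less room for a distant record to compete on distance alone.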

Additionally, this PR changes the decay function from linear to exponential. I don't recall why we settled on linear, as it was very long ago, but I suspect it might have been an attempt to prevent very far away records from being scored at all. In that case, our understanding of Elasticsearch at the time was incorrect.

By using exponential scoring, records at any distance will receive a non-zero score for the distance query. This means we can differentiate between otherwise identically-scoring records that are very far away from the focus point. This is a huge help when searching for administrative areas like localities, as well as postal codes.
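The difference can be sketched numerically with the linear and exponential decay formulas from the Elasticsearch documentation. The function names are hypothetical, and `decay=0.5` with a zero offset is an assumption rather than the confirmed Pelias setting.

```python
import math


def linear_decay(distance_km: float, scale_km: float, decay: float = 0.5) -> float:
    """Linear decay (offset assumed 0): reaches exactly 0 at scale / (1 - decay)."""
    s = scale_km / (1.0 - decay)
    return max((s - distance_km) / s, 0.0)


def exp_decay(distance_km: float, scale_km: float, decay: float = 0.5) -> float:
    """Exponential decay (offset assumed 0): approaches 0 but stays positive."""
    lam = math.log(decay) / scale_km
    return math.exp(lam * max(0.0, distance_km))


# With a 50km scale, linear scores are all 0 beyond 100km,
# while exponential scores still decrease with distance.
for distance in (50, 100, 500, 1000):
    print(distance, linear_decay(distance, 50), exp_decay(distance, 50))
```

With these assumed parameters, two otherwise identical records at 500km and 1000km both get a linear score of 0, whereas their exponential scores remain distinct and ordered by distance.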

Looking at the acceptance tests, I could not find any cases where these changes cause a failure. However, some newly added acceptance tests now pass.

connects #1206

missinglink · 2018-10-24T14:44:42Z

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

missinglink · 2018-10-24T15:26:43Z

I think this is great. My only concern is a potential performance regression, but the other performance PRs we discussed today will probably more than make up for it.

orangejulius · 2018-10-29T12:07:05Z

Our testing shows a small performance hit from this change (one query's average time went from around 700ms to around 800ms). Since some of our shorter autocomplete queries are already quite slow, we're going to look at performance improvements like #1219 or #1215 before merging this.

orangejulius · 2018-11-05T17:35:05Z

We have made great improvements to slow autocomplete queries in #1219, so I think this can be merged now.

Linear scoring, by design, gives all records the same score past a certain point. This has the disadvantage that identical records that are very far away cannot be sorted by distance. By using exponential scoring, we can achieve decent sorting of even very far away records. This is very helpful for cities and postal codes. Connects #1206
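For reference, a decay clause of this kind in an Elasticsearch `function_score` query has roughly the following shape. This is a minimal sketch built as a Python dict: the helper name `focus_point_decay_clause`, the `decay` value, the `match_all` query, and the example coordinates are illustrative assumptions, and the actual query Pelias generates contains more than this.

```python
def focus_point_decay_clause(lat: float, lon: float, scale_km: int = 50,
                             decay: float = 0.5, function: str = "exp") -> dict:
    """Build a decay function clause on the `center_point` geo field.

    `function` is one of Elasticsearch's decay functions: "exp", "linear", "gauss".
    """
    return {
        function: {
            "center_point": {
                "origin": {"lat": lat, "lon": lon},
                "scale": f"{scale_km}km",
                "decay": decay,  # assumed value (the Elasticsearch default)
            }
        }
    }


# Hypothetical wrapper query; a real query would use its own text-matching clause.
query = {
    "function_score": {
        "query": {"match_all": {}},
        "functions": [focus_point_decay_clause(40.7, -74.0)],
    }
}
print(query)
```

Switching between linear and exponential scoring in a clause like this only changes the key naming the decay function, while the `scale` string carries the distance at which the score has dropped to `decay`.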


orangejulius requested a review from missinglink October 15, 2018 14:58

orangejulius changed the title ~~Use exponential scoring and consistent scale values for focus.point~~ Use exponential scoring and consistent scale values for focus.point Oct 15, 2018

orangejulius force-pushed the reset-focus-point-settings branch from dc890df to a9da289 Compare October 29, 2018 12:08

orangejulius self-assigned this Nov 5, 2018

orangejulius force-pushed the reset-focus-point-settings branch 2 times, most recently from 88a981a to 982221f Compare November 7, 2018 14:57

orangejulius added 3 commits November 8, 2018 11:20

fix(autocomplete focus.point): Use 50km scale parameter

1fa50de

The `scale` parameter controls how quickly scores decrease from the maximum as the distance from the `center_point` to the record in question increases. Set this to 50km, which is the same as search. Connects #1206

Use test description consistent with fixture file

6d9e511

orangejulius force-pushed the reset-focus-point-settings branch from 982221f to 6d9e511 Compare November 8, 2018 16:21

orangejulius merged commit 68861d4 into master Nov 8, 2018

orangejulius deleted the reset-focus-point-settings branch November 8, 2018 17:38

orangejulius added a commit to pelias/acceptance-tests that referenced this pull request Nov 8, 2018

Mark search focus.point test passing!

8a03bd1

Passes as of pelias/api#1209
