use libpostal parses for venue queries where available #1380
base: master
The only reason I can think of for the existing behaviour is if libpostal erroneously identifies things as |
d1d1208
to
d74b893
I rebased this and put it up on dev today. It fixes the "vanity addresses" issue we've been discussing: cc/ @blackmad |
linked pelias/acceptance-tests#533 |
I ran the full acceptance test suite on this today and there were actually quite a few improvements, but at the same time it highlighted some issues. diff of changes vs. production: https://www.diffchecker.com/5Faotyih (ignore any errors related to screenshots of some issues inherited from |
Yeah, I suspect there are two reasons why this was never implemented in the past:
The first reason is obviously not a good one, but I imagine the hard part of actually merging this will be ensuring there aren't too many cases where, for example, something that is very much not a venue query, like one for an admin area or address, will be made worse. |
Right, so the question is "which parser does a better job of venues?" and the answer is "no" 😆 |
We're currently not using `libpostal` parses for venues: if we see a venue parse, we fall back to the native parser. I don't remember the history of this, but it seems wrong to me 🤷‍♂️
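To make the difference concrete, here is a minimal sketch of the two behaviours. Every name in it (`isVenueOnlyParse`, `chooseParserCurrent`, `chooseParserProposed`) is hypothetical and not taken from the Pelias codebase, and the definition of a "venue parse" (a venue name with no street or house number) is an assumption made for illustration.

```javascript
// Hypothetical sketch: none of these names exist in the Pelias codebase.
// A "parse" is a plain object of already-mapped fields, e.g. { query: '...' }.

// Assumption: a venue-only parse has a venue name (`query`) and
// neither a street nor a house number.
function isVenueOnlyParse(parse) {
  return Boolean(parse.query) && !parse.street && !parse.number;
}

// Behaviour described above: a venue parse is discarded,
// so the request falls back to the native parser.
function chooseParserCurrent(parse) {
  return isVenueOnlyParse(parse) ? 'native' : 'libpostal';
}

// Behaviour this PR argues for: keep the libpostal parse whenever
// it produced any fields at all, venues included.
function chooseParserProposed(parse) {
  return Object.keys(parse).length > 0 ? 'libpostal' : 'native';
}
```

With a parse such as `{ query: 'café pelias' }`, the first function picks the native parser and the second keeps libpostal.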
I noticed this when looking into some bug reports, one example being "Café Pelias".
There are two things currently going wrong with this query:
No query to call ES with. Skipping
So regarding the first point, I don't see why we would throw away the venue parse here from libpostal:
The `query` label has actually been mapped from the libpostal `house` field in `controller/libpostal`, but this field indicates a venue name.
As you can see from the PR edits, we don't currently consider these venue parses for query generation, and I'm not sure why. I believe that libpostal is still superior to the native parser when it comes to venue queries, and it has always been better than addressit was?
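For readers unfamiliar with the mapping step, a rough sketch of how libpostal labels might be translated into parse fields is shown below. Only the `house` to `query` entry is taken from the description above; the other two entries, and the names `FIELD_MAPPING` and `mapLabels`, are assumptions for illustration, so the real table in `controller/libpostal` may differ.

```javascript
// Assumed subset of the label mapping. Only `house: 'query'` is stated
// in this PR; the other entries are illustrative guesses.
const FIELD_MAPPING = {
  house: 'query',          // libpostal's label for a venue or building name
  house_number: 'number',
  road: 'street'
};

// Convert libpostal-style [{ label, value }] output into a keyed parse,
// ignoring any label that is not in the mapping.
function mapLabels(components) {
  const parse = {};
  for (const { label, value } of components) {
    const field = FIELD_MAPPING[label];
    if (field) {
      parse[field] = value;
    }
  }
  return parse;
}

const parsed = mapLabels([{ label: 'house', value: 'café pelias' }]);
console.log(parsed); // { query: 'café pelias' }
```

The point of the sketch is that a `query` field in the mapped parse is really a venue name, which is why discarding it loses the most useful part of the libpostal result.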
Thoughts?