No matches for a pair you'd expect (?) to find matches for #6

thisisaaronland · 2018-04-17T18:01:54Z

Passing this along, FYI. These two venues were in the same set of candidates to dedupe but lieu didn't seem to pick up on them.

Isamu Noguchi Museum And Garden
32 37 Vernon Blvd Long Island City, Ny 11106
40.76713, -73.93773

Isamu Noguchi Foundation And Garden Museum
3237 Vernon Blvd Long Is City, Ny 11106-4926
40.76713, -73.93773


albarrentine · 2018-04-17T19:23:12Z

House number doesn't match because of the space. We don't currently touch house numbers except for basic normalization and removing phrases like "#", "House No.", etc.

There are too many variants and edge cases around the world to use a one-size-fits-all approach for spaces/hyphens without losing information. If you know in advance the data set comes from one country or place, removing spaces and hyphens in preprocessing is an option.
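A minimal sketch of that preprocessing step, assuming the data set is known to come from a single place where spaces and hyphens inside a house number carry no meaning, and that the house number is already available as its own field. The function name `normalize_house_number` is hypothetical; it is not part of lieu or libpostal.

```python
import re


def normalize_house_number(house_number: str) -> str:
    """Collapse spaces and hyphens inside a house number.

    Hypothetical helper for a "prepare" step. Only safe when the whole
    data set comes from a place where "32 37", "32-37" and "3237" are
    known to refer to the same house number.
    """
    # Trim the ends, then drop every run of whitespace or hyphens.
    return re.sub(r"[\s\-]+", "", house_number.strip())


print(normalize_house_number("32 37"))
print(normalize_house_number("3237"))
```

With this applied before deduping, the two Vernon Blvd records above would carry the same house number string. The trade-off is the one already described: in places where the separator is meaningful, this throws information away.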

thisisaaronland · 2018-04-17T19:43:38Z

That's good to know. I can update the "prepare" tool accordingly. Are there any other similar rules / gotchas to be aware of or account for?

albarrentine · 2018-04-17T20:21:40Z

In data sets without lat/lons, postcodes are often useful, so with voter files, for instance, I've been normalizing to the ZIP5 in the US. Similarly, in GB/CA you might want to strip spaces, as postcodes there show up in a variety of formats.

Some of that we may be able to implement on the libpostal side at some point if the postcode format is unique to a country (GB/CA are definitely unique, and I don't think anywhere else in the world has DDDDD-DDDD, so it might be fine to assume it's the US and match on the first 5).
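The postcode rules described above could be sketched as a preprocessing function along these lines. This assumes the country of each record is already known; the name `normalize_postcode` is hypothetical, and upper-casing is an extra assumption not mentioned in the thread. None of this is existing libpostal or lieu behavior.

```python
import re

# Five digits, optionally followed by a four-digit extension (ZIP+4).
US_ZIP = re.compile(r"^(\d{5})(?:-?\d{4})?$")


def normalize_postcode(postcode: str, country: str) -> str:
    """Hypothetical helper: reduce a postcode to a comparable form."""
    # Upper-casing is an assumption added here, not from the thread.
    cleaned = postcode.strip().upper()
    if country == "US":
        # Keep only the ZIP5; leave unrecognized values untouched.
        match = US_ZIP.match(cleaned)
        return match.group(1) if match else cleaned
    if country in ("GB", "CA"):
        # Strip internal spaces so differently spaced forms compare equal.
        return re.sub(r"\s+", "", cleaned)
    return cleaned


print(normalize_postcode("11106-4926", "US"))
print(normalize_postcode("SW1A 1AA", "GB"))
```

Values that do not fit the expected US pattern are returned unchanged rather than truncated, so malformed input is not silently turned into a plausible-looking ZIP5.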

thisisaaronland mentioned this issue Apr 17, 2018

Normalize house numbers in prepare tool(s) whosonfirst/go-whosonfirst-lieu#1

Closed
