Matching of non-accented letters to accented letters #17

jonathanheron · 2011-11-21T12:16:50Z

Some countries' alternative spellings (Éire, Österreich etc) include accented characters that are not matched by their non-accented equivalents.

For example, entering 'eire' for Ireland does not result in a match for Ireland.

Rather than amending the data-alternative-spellings to account for accent variations, would it be possible to have broad string matching that works with or without accents?

jamieholst · 2011-11-21T16:01:49Z

Theoretically, yes. For performance reasons this would probably have to be done on initialization. We'd also need a good list of mappings.

Have you seen a similar implementation in another front-end framework? Maybe there's some experience and some approaches we can draw inspiration from.

timcooper · 2012-03-27T10:18:43Z

I've implemented this, the only difference being my options are French wines instead of countries, so I only cover French diacritics.

Basically, I run the original label/alternative_spellings through my convert function and then add the result to the 'matches' string on initialization. I'm not sure I do the converting in the best way, but it's just a bunch of chained .replace()s.

I can open a pull request if you would like, but I imagine you would want to add an option for it and come up with a better convert function that covers a broader range of accents.

The way it works is that if you type e.g. "Le" it will match all the "le"s and "lé"s, but if you type "Lé" it will only match the "lé"s.
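To make the approach above concrete, here is a minimal sketch of what it might look like. The function names (stripFrenchDiacritics, buildMatches, matchesQuery) and the exact set of replaced characters are assumptions for illustration, not the code actually used; only the idea of chained .replace() calls run once at initialization comes from the description.

```javascript
// Hypothetical convert function: a handful of French diacritics,
// stripped with chained .replace() calls as described above.
function stripFrenchDiacritics(text) {
  return text
    .replace(/[àâä]/g, 'a')
    .replace(/ç/g, 'c')
    .replace(/[éèêë]/g, 'e')
    .replace(/[îï]/g, 'i')
    .replace(/[ôö]/g, 'o')
    .replace(/[ùûü]/g, 'u');
}

// Build the searchable 'matches' string once, at initialization.
// The accent-free copy is appended only if it differs from the original.
function buildMatches(label, alternativeSpellings) {
  const original = (label + ' ' + alternativeSpellings).trim().toLowerCase();
  const stripped = stripFrenchDiacritics(original);
  return stripped === original ? original : original + ' ' + stripped;
}

// Plain case-insensitive substring test against the prepared string.
function matchesQuery(matches, query) {
  return matches.indexOf(query.toLowerCase()) !== -1;
}
```

Because only the stored string is expanded and the query is left untouched, an unaccented query finds both spellings while an accented query finds only accented entries, which is the one-directional behaviour described above.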

jamieholst · 2012-04-05T09:22:16Z

I'd like this feature to be some sort of map so the user can control the behavior. The user could pass in an 'accented-letters' option, which would be an array of character mappings. Each entry in this array could be either an array or a hash. An array would be a bi-directional mapping, so any character in the sub-array would match any of the other characters in that sub-array. A hash would be a one-directional mapping, so the key would match all the values but the values would not match the key. Example:

'accented-letters': [
  ['ss', 'ß'],
  { 'e': ['é', 'è'] }
]

In this scenario, ss and ß would be interchangeable – it doesn't matter what you use when searching. Furthermore, when searching for Le you would get results for Le, Lé and Lè, whereas if you searched for Lé you would only get results for Lé (that is, neither Le nor Lè would match because it is a one-directional mapping).

Finally, if the 'accented-letters' has a value of false then the feature should be completely disabled for performance gains.
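One way the proposed option could be interpreted is sketched below. Nothing here exists in the plugin: createQueryCompiler, buildExpansions and escapeRegExp are hypothetical names, and the sketch assumes mapping tokens are given in lowercase and that matching is done with a case-insensitive regex built from the typed query.

```javascript
// Escape regex metacharacters so typed text is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn the option into a lookup: typed token -> set of tokens it may match.
function buildExpansions(mappings) {
  const expansions = new Map();
  const add = (from, to) => {
    if (!expansions.has(from)) expansions.set(from, new Set([from]));
    expansions.get(from).add(to);
  };
  for (const entry of mappings) {
    if (Array.isArray(entry)) {
      // Array: bi-directional, every member matches every other member.
      for (const a of entry) for (const b of entry) add(a, b);
    } else {
      // Hash: one-directional, the key matches its values but not vice versa.
      for (const key of Object.keys(entry)) {
        for (const value of entry[key]) add(key, value);
      }
    }
  }
  return expansions;
}

// Do the expensive work once (at initialization) and return a function
// that compiles a typed query into a RegExp.
function createQueryCompiler(mappings) {
  if (mappings === false) {
    // Feature disabled: plain case-insensitive literal match.
    return (query) => new RegExp(escapeRegExp(query), 'i');
  }
  const expansions = buildExpansions(mappings);
  // Longest tokens first so that 'ss' is tried before a single character.
  const tokens = Array.from(expansions.keys()).sort((a, b) => b.length - a.length);
  return (query) => {
    const lower = query.toLowerCase();
    let source = '';
    let i = 0;
    while (i < lower.length) {
      const token = tokens.find((t) => lower.startsWith(t, i));
      if (token) {
        const alternatives = Array.from(expansions.get(token)).map(escapeRegExp);
        source += '(?:' + alternatives.join('|') + ')';
        i += token.length;
      } else {
        source += escapeRegExp(lower[i]);
        i += 1;
      }
    }
    return new RegExp(source, 'i');
  };
}
```

With the example option above, createQueryCompiler(option)('Le') would produce a pattern that accepts Le, Lé and Lè, while the pattern for 'Lé' accepts only Lé. This sketch expands the query rather than the stored labels; expanding the labels at initialization instead would be an equally plausible design.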

Sjord · 2015-02-21T10:39:43Z

The correct way to do this in JavaScript is to use Intl.Collator with a sensitivity of 'base'. This provides a compare function to compare two strings, where any diacritics are ignored in the comparison.

E.g.
new Intl.Collator('de', { sensitivity: 'base' }).compare('Österreich', 'Osterreich') === 0

Caniuse.com has info on browser support for this.

The compare function can only compare two whole strings, and in selectToAutocomplete we want to match parts of country names, so it is not so easy to replace the current regexes with this function. However, I think letting a user specify a locale and sensitivity is a better way than using a hardcoded character map to convert accented characters into ASCII.
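One possible way around the whole-string limitation is to slide a window of the query's length over the label and compare each slice with the collator. The sketch below is only an illustration of that idea, not what the proof of concept does; findIgnoringAccents is a hypothetical name, and the fixed-length window means it cannot find matches whose length differs from the query (for example 'ss' typed against 'ß').

```javascript
// Return the index of the first slice of `haystack` that the collator
// considers equal to `needle` at base sensitivity, or -1 if there is none.
function findIgnoringAccents(haystack, needle, locale) {
  // Base sensitivity ignores both accents and case in the comparison.
  const collator = new Intl.Collator(locale, { sensitivity: 'base' });
  for (let start = 0; start + needle.length <= haystack.length; start += 1) {
    const candidate = haystack.slice(start, start + needle.length);
    if (collator.compare(candidate, needle) === 0) return start;
  }
  return -1;
}
```

For instance, findIgnoringAccents('Österreich', 'oster', 'de') compares 'Öster' with 'oster' on the first step. Creating the collator once and reusing it across labels would likely matter for performance in a real list.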

Sjord · 2015-02-21T11:33:56Z

I made a proof of concept with PR #84.

jamieholst removed the Being debated label Aug 12, 2014

brandoncarl added a commit to brandoncarl/country-selector that referenced this issue Apr 7, 2015

Adds Intl.Collator support w/fallback (fixes jamieholst#17)

7535b3c
