Option to ignore accents (diacritics) [feature] #415

Closed
DocBouCou opened this issue May 6, 2020 · 4 comments

Comments

@DocBouCou

DocBouCou commented May 6, 2020

I am using fuse.js for fuzzy search in French. French (like other languages) has accented characters (é, è, ê, alongside plain e, for instance).
Some users search with no accents, others with the wrong ones (a very common misspelling), and others with the exact correct spelling, and each gets a different search score. I would like, if possible, an option such as "isAccentSensitive" so I can choose whether the score should stay the same even when the accents are wrong.
Thank you. It would really help if this were included in fuse by default :) .
(Edit: sorry, this is close to #181.)

@mathieutu

mathieutu commented May 27, 2020

Hello!
My two cents on this topic:
Here is a good method I use to remove accents:

export const removeAccents = str => str.normalize('NFD').replace(/[\u0300-\u036F]/g, '')
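
To make the effect of that one-liner concrete, here is a small sketch (the sample words are illustrative, not from the original post). `normalize('NFD')` splits an accented letter into its base letter plus a combining mark, and the regular expression then drops the marks in the U+0300 to U+036F range. Letters that have no canonical decomposition, such as `œ`, pass through unchanged, so this approach does not fold ligatures.

```javascript
// Same approach as the one-liner above, written as a plain function.
function removeAccents(str) {
  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Correct spelling, wrong accent, and no accent all fold to the same string.
const spellings = ['élève', 'éléve', 'eleve'];
const folded = spellings.map(removeAccents);
console.log(folded); // → [ 'eleve', 'eleve', 'eleve' ]

console.log(removeAccents('garçon')); // → garcon (the cedilla is a combining mark too)
console.log(removeAccents('cœur'));   // → cœur (no decomposition, left unchanged)
```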

@mjbcopland

You could use a custom getFn which wraps the existing implementation. This will work for any method of normalising strings, so it's more flexible than a built-in isAccentSensitive option, but there is already a built-in isCaseSensitive so it could be worth including too if there's high demand.

https://fusejs.io/api/options.html#getfn
https://fusejs.io/api/config.html

import Fuse from 'fuse.js';
import diacritics from 'diacritics';

function getFn() {
  return diacritics.remove(Fuse.config.getFn.apply(this, arguments));
}

You'll probably also want to remove diacritics from the search query.

fuse.search(diacritics.remove(query))

@marceloverdijk

I had some issues using the above code with undefined values and string arrays, for example:

const list = [
  {
    title: "Old Man's War",
    author: 'John Scalzi',
    tags: ['fiction']
  },
  {
    title: 'The Lock Artist',
    // no author defined but in keys
    tags: ['thriller']
  }
]

So I did it as below.

function removeAccents(obj) {
  if (typeof obj === 'string' || obj instanceof String) {
    return obj.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  }
  return obj
}

function getFn(obj, path) {
  var value = Fuse.config.getFn(obj, path);
  if (Array.isArray(value)) {
    return value.map(el => removeAccents(el));
  }
  return removeAccents(value);
}

Maybe this is helpful for others.
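
Pulling the pieces of this thread together, the following is a minimal sketch of a reusable wrapper, not an official Fuse.js feature. `withoutAccents` and `demoGetFn` are hypothetical names introduced here. The wrapper takes any getter with the `(obj, path)` signature, so in real use one would presumably pass `Fuse.config.getFn`; a stand-in getter is used below only so the snippet runs without Fuse installed.

```javascript
// Strip combining marks from strings; pass every other value through unchanged
// (undefined, numbers, ...), which is what avoids the errors described above.
function removeAccents(value) {
  return typeof value === 'string'
    ? value.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
    : value;
}

// Hypothetical helper: wraps any getter that has the (obj, path) signature.
function withoutAccents(baseGetFn) {
  return (obj, path) => {
    const value = baseGetFn(obj, path);
    return Array.isArray(value) ? value.map(removeAccents) : removeAccents(value);
  };
}

// Stand-in for Fuse.config.getFn so the sketch is self-contained.
// Assumption: it only handles top-level keys, given as a string or a one-element array.
const demoGetFn = (obj, path) => obj[Array.isArray(path) ? path[0] : path];

const getFn = withoutAccents(demoGetFn);
console.log(getFn({ title: 'Été' }, 'title'));              // → Ete
console.log(getFn({ tags: ['théâtre', 'polar'] }, 'tags')); // → [ 'theatre', 'polar' ]
console.log(getFn({ title: 'Été' }, 'author'));             // → undefined

// With Fuse itself, the wiring would presumably look like this (untested here):
//   const fuse = new Fuse(list, {
//     keys: ['title', 'author', 'tags'],
//     getFn: withoutAccents(Fuse.config.getFn),
//   });
//   fuse.search(removeAccents(query));
```

Normalising the query with the same function as the indexed values keeps both sides of the comparison consistent.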

@craPkit

craPkit commented Sep 22, 2023

One problem with the proposed workarounds is that you lose diacritic characters from the original value in the returned matches array (when using option includeMatches).
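
One possible way around this, sketched under assumptions rather than taken from Fuse.js itself: keep the original strings alongside the index, and translate the match ranges reported on the stripped value back onto the original. The helpers `stripWithMap` and `toOriginalRange` are hypothetical, and the sketch assumes match ranges are inclusive `[start, end]` pairs on the stripped string. It strips accents one code point at a time, which should give the same result as the whole-string approach for typical text but may differ for unusual combining-mark sequences.

```javascript
// Strip accents while recording, for each output character, where it came from.
function stripWithMap(original) {
  let stripped = '';
  const map = []; // map[i] = index in `original` of the character that produced stripped[i]
  let offset = 0;
  for (const ch of original) { // iterates by code point
    const base = ch.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
    for (let i = 0; i < base.length; i++) map.push(offset);
    stripped += base;
    offset += ch.length;
  }
  return { stripped, map };
}

// Translate an inclusive [start, end] range on the stripped string
// into an inclusive range on the original string.
function toOriginalRange([start, end], map, original) {
  const from = map[start];
  // Skip entries that came from the same original character as stripped[end].
  let next = end + 1;
  while (next < map.length && map[next] === map[end]) next++;
  // End just before the next source character, so trailing combining marks are included.
  const to = next < map.length ? map[next] - 1 : original.length - 1;
  return [from, to];
}

const text = 'Café au lait';
const { stripped, map } = stripWithMap(text);
const [from, to] = toOriginalRange([0, 3], map, text);
console.log(stripped);                 // → Cafe au lait
console.log(text.slice(from, to + 1)); // → Café
```

Each range in a match could then be mapped this way before highlighting, with the original string looked up from the source list rather than from the value in the match.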


6 participants