Higher Levenshtein Distance than 2 #72

Open
fwollatz opened this issue Aug 19, 2020 · 2 comments

Comments

@fwollatz

It is not possible to correct words that have a Levenshtein distance higher than 2 (at least in German).

A parameter to change this would be much appreciated.

@barrust
Owner

barrust commented Aug 22, 2020

Currently I limit the Levenshtein distance to 2 for all languages, including German. The reason is that for longer words (say, 10 characters or more), going past 2 creates a very long list of candidates to check, and the system slows down. It is feasible to allow larger Levenshtein distances, and I would appreciate any help or pull requests that make that possible.
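
To give a rough feel for how quickly the candidate list grows, here is a minimal sketch. It assumes candidates are generated by applying every single-character edit (delete, transpose, replace, insert) repeatedly; the names `edits1` and `edits_within` and the plain a-z alphabet are illustrative assumptions, not necessarily what the library does internally.

```python
import string

# Assumption: plain a-z. A German alphabet would also need ä, ö, ü and ß,
# which makes every count below larger.
ALPHABET = string.ascii_lowercase


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def edits_within(word, distance):
    """All strings reachable from word with at most `distance` edits."""
    seen = {word}
    frontier = {word}
    for _ in range(distance):
        # Expand only the strings discovered in the previous round.
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
    return seen


if __name__ == "__main__":
    word = "bibliothek"  # a 10-character example word
    for distance in (1, 2):
        print(distance, len(edits_within(word, distance)))
```

Each round applies up to 54n + 25 edits (for a word of length n and 26 letters) to every string from the previous round, so the candidate set grows by a factor of several hundred per extra unit of distance. That is why distance 3 is deliberately left out of the loop above.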

@maayanorner

I think it may be necessary to shift from generating all candidates to an O(n^2) loop over the vocabulary, using some efficient variant of edit distance (mbleven, for example).
Then the complexity will be (roughly) a function that depends more on the vocabulary size than on the edit distance.

Though, from experience, I have to say it does not scale well (but better than creating all candidates, at least for long words).
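
A minimal sketch of the vocabulary-scan idea, under a few assumptions: it uses a plain dynamic-programming Levenshtein with an early cutoff rather than mbleven itself, and the names `bounded_levenshtein` and `candidates` are hypothetical helpers, not part of any existing API.

```python
def bounded_levenshtein(a, b, max_dist):
    """Return the Levenshtein distance between a and b if it is <= max_dist, else None."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > max_dist:
        return None
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        # The row minimum never decreases, so we can stop as soon as it exceeds the limit.
        if min(cur) > max_dist:
            return None
        prev = cur
    return prev[-1] if prev[-1] <= max_dist else None


def candidates(word, vocabulary, max_dist=3):
    """Scan the whole vocabulary and return entries within max_dist, closest first."""
    found = []
    for entry in vocabulary:
        dist = bounded_levenshtein(word, entry, max_dist)
        if dist is not None:
            found.append((dist, entry))
    return [entry for dist, entry in sorted(found)]


if __name__ == "__main__":
    vocab = ["sitting", "kitchen", "mitten", "dog"]
    print(candidates("kitten", vocab, max_dist=3))
```

The cost here is one bounded comparison per vocabulary entry, so it grows with the size of the vocabulary rather than with the number of possible edits, which matches the trade-off described above.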
