Higher Levenshtein Distance than 2 #72

fwollatz · 2020-08-19T12:50:07Z

It is not possible to correct words that have a Levenshtein distance greater than 2 (at least in German).

A parameter to change this would be much appreciated.

barrust · 2020-08-22T12:42:53Z

Currently I limit the Levenshtein distance to 2 for all languages, including German. The reason is that for longer words (say 10 characters or more), going past 2 creates a very long list of candidates to check, and the system slows down. It is feasible to allow larger Levenshtein distances, and I would appreciate any help or pull requests that make that possible.
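To illustrate the growth described above, here is a minimal sketch that assumes Norvig-style candidate generation (deletes, transposes, replaces, inserts over a lowercase ASCII alphabet). The names `edits1` and `edits_within` are hypothetical and are not necessarily how the library implements this.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string one edit away from `word` (assumed Norvig-style)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def edits_within(word, distance):
    """Return every string reachable from `word` in at most `distance` edits."""
    seen = {word}
    frontier = {word}
    for _ in range(distance):
        # Expand only the strings discovered in the previous round.
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
    return seen


if __name__ == "__main__":
    # Kept to small distances on purpose: each extra edit multiplies the
    # candidate set by roughly the size of one edits1() expansion.
    for d in (1, 2):
        print(d, len(edits_within("rechtschreibung", d)))
```

Before deduplication, a word of length n yields 54n + 25 single-edit strings with a 26-letter alphabet, and each additional edit expands every one of those again, which is why distance 3 becomes expensive for long words.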

maayanorner · 2020-08-29T14:17:58Z

I think it may be necessary to shift from generating all candidates to an O(n^2) loop over the vocabulary, using some efficient variant of edit distance (mbleven, for example).
The complexity would then depend (roughly) on the vocabulary size rather than on the edit distance.

Though, from experience, I have to say it does not scale well (but better than creating all candidates, at least for long words).
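A rough sketch of the vocabulary-scan idea, for illustration only. It uses a plain dynamic-programming Levenshtein with an early cutoff rather than mbleven, and the names `levenshtein_within` and `candidates` are hypothetical, not part of the library.

```python
def levenshtein_within(a, b, max_distance):
    """Return the Levenshtein distance of a and b, or None if it exceeds max_distance."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > max_distance:
        return None
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        # Row minimums never decrease, so the whole row exceeding the
        # limit means the final distance must exceed it too.
        if min(current) > max_distance:
            return None
        previous = current
    return previous[-1] if previous[-1] <= max_distance else None


def candidates(word, vocabulary, max_distance=3):
    """Scan the vocabulary once and return known words within max_distance, closest first."""
    scored = []
    for known in vocabulary:
        d = levenshtein_within(word, known, max_distance)
        if d is not None:
            scored.append((d, known))
    return [known for _, known in sorted(scored)]


if __name__ == "__main__":
    vocab = ["levenshtein", "distance", "leven"]
    print(candidates("levenstein", vocab, max_distance=3))
```

The cost here is one bounded distance computation per vocabulary entry, so raising `max_distance` mainly weakens the early cutoffs instead of multiplying the number of generated strings.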
