Releases: marijnkoolen/fuzzy-search
v2.4.4
Improve tokenizers (introduced in 2.4.1)
- Improve efficiency of default tokenizer
- Add option to RegExTokenizer to use split_pattern (the pattern that separates tokens and will be removed) or token_pattern (the pattern that matches tokens and will be retained)
- Make boundary tokens have length zero so that char indexes of text tokens correspond to the original text
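The difference between the two pattern options, and why zero-length boundary tokens keep character indexes aligned, can be sketched with the standard `re` module. This is an illustration of the idea only, not the RegExTokenizer implementation: the function names `tokenize_by_split`, `tokenize_by_token` and `add_boundaries`, the default patterns, and the `(token, start)` tuple representation are all assumptions made for the example.

```python
import re


def tokenize_by_split(text, split_pattern=r"\s+"):
    """Hypothetical helper: text matching split_pattern is removed,
    everything between separators is kept as a (token, start) pair."""
    tokens = []
    pos = 0
    for sep in re.finditer(split_pattern, text):
        if sep.start() > pos:
            tokens.append((text[pos:sep.start()], pos))
        pos = sep.end()
    if pos < len(text):
        tokens.append((text[pos:], pos))
    return tokens


def tokenize_by_token(text, token_pattern=r"\w+"):
    """Hypothetical helper: text matching token_pattern is retained,
    everything else is dropped."""
    return [(m.group(), m.start()) for m in re.finditer(token_pattern, text)]


def add_boundaries(tokens, text):
    """Add start and end boundary markers that cover no characters,
    so the offsets of the real tokens still point into the original text."""
    return [("", 0)] + tokens + [("", len(text))]


text = "Hello, fuzzy  world!"
print(tokenize_by_split(text))   # punctuation stays attached to the tokens
print(tokenize_by_token(text))   # punctuation is dropped
print(add_boundaries(tokenize_by_token(text), text))
```

With a split pattern, punctuation stays attached to the neighbouring word unless the pattern also matches it; with a token pattern, anything the pattern does not match is discarded. Because the boundary markers are empty, `text[start:start + len(token)]` still returns each token.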
Full Changelog: 2.4.3...v2.4.4
v2.4.0
- Add a vocabulary to allow setting distractor pairs for common text terms matching phrase terms, to do early pruning of pairs of text tokens and phrase tokens.
- Switch to using python-Levenshtein for faster Levenshtein computation and early stopping.
- Add an option to pad text and phrase tokens with boundary characters (#) to increase matches when one of the beginning or ending characters matches, or for very short words (shorter than ngram size).
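The effect of boundary padding on character n-grams can be sketched as follows. This is a minimal illustration, not the library's code: the function name `char_ngrams`, its parameters, and the choice to pad each side with `ngram_size - 1` boundary characters are assumptions made for the example.

```python
def char_ngrams(term, ngram_size=2, pad_char="#", pad=True):
    """Hypothetical helper: return the character n-grams of a term,
    optionally padding both ends with boundary characters first."""
    if pad:
        padding = pad_char * (ngram_size - 1)
        term = padding + term + padding
    return [term[i:i + ngram_size] for i in range(len(term) - ngram_size + 1)]


# A word shorter than the n-gram size has no n-grams without padding.
print(char_ngrams("of", ngram_size=3, pad=False))
print(char_ngrams("of", ngram_size=3))

# Padding adds an n-gram for the shared first character of two terms.
shared_plain = set(char_ngrams("cat", pad=False)) & set(char_ngrams("cab", pad=False))
shared_padded = set(char_ngrams("cat")) & set(char_ngrams("cab"))
print(shared_plain, shared_padded)
```

Under these assumptions, padding gives short words at least a few n-grams to match on, and terms that share a first or last character gain one extra shared n-gram.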
Full Changelog: v2.3.0...v2.4.0
v2.3.0
What's Changed
- Update FuzzyTokenSearcher to be more exhaustive by @marijnkoolen in #1
Full Changelog: 1.4.3...v2.3.0
fuzzy-search 1.4.3
Various bug fixes.
1.0.0
This release adds fuzzy search templates and template searching, as well as numerous bug fixes and improved documentation.
0.2.0
This release contains a complete rewrite of the fuzzy search code, with a cleaner API, proper documentation and unit tests.
Initial release
This is the first release with the basic fuzzy string searching functionality in place.