Releases · gandersen101/spaczz

12 Mar 01:16

gandersen101

v0.6.1

0b774b6

v0.6.1 Regex[Searcher/Matcher] Bugfix (Latest)

What’s Changed

🪲 Fixes

Updating readthedocs config (#86) @gandersen101
Partial regex matcher doesn't work if the found token has index 0 (#82) @adinowi
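
The notes do not say what caused the index-0 failure in #82. One common way this class of bug arises in Python is a truthiness check on an index, since 0 is falsy. The sketch below illustrates that pattern only; it is not taken from the spaczz source, and all function names are hypothetical.

```python
from typing import List, Optional


def first_index(tokens: List[str], target: str) -> Optional[int]:
    """Return the index of the first token equal to target, or None."""
    for i, token in enumerate(tokens):
        if token == target:
            return i
    return None


def found_buggy(tokens: List[str], target: str) -> bool:
    # Truthiness check: index 0 is falsy, so a hit on the first token is lost.
    return bool(first_index(tokens, target))


def found_fixed(tokens: List[str], target: str) -> bool:
    # Explicit None check: index 0 counts as a valid hit.
    return first_index(tokens, target) is not None


tokens = ["spaczz", "is", "fast"]
print(found_buggy(tokens, "spaczz"))  # False
print(found_fixed(tokens, "spaczz"))  # True
```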

🚨 Testing

Adding Test for Partial Regex Search at 0 Index (#85) @gandersen101
Updating Dependencies to Test Against (#83) @gandersen101

👷 Continuous Integration

Updating GH action versions (#84) @gandersen101
Updating Dependencies to Test Against (#83) @gandersen101

📚 Documentation

Updating readthedocs config (#86) @gandersen101

Contributors

adinowi and gandersen101

01 May 13:05

github-actions

v0.6.0

5bb2ad5

v0.6.0 Returning Patterns, Consistency and Support Updates

All matchers now return the matching pattern. This is a breaking change: matches are now tuples of length 5 instead of 4.
Regex and token matches now return match ratios.
Support for python<=3.11,>=3.7, along with rapidfuzz>=1.0.0.
Dropped support for spaCy v2. Sorry to do this without a deprecation cycle, but I stepped away from this project for a long time.
Removed support for "spaczz_" prepended optional SpaczzRuler init arguments. Also, sorry to do this without a deprecation cycle.
Matcher.pipe methods, which were deprecated, are now removed.
spaczz_span custom attribute, which was deprecated, is now removed.
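
Code that unpacks matches needs updating for the longer tuple. The sketch below assumes the order is (label, start, end, ratio, pattern), which the notes above do not spell out, and uses made-up sample tuples rather than real matcher output. The `describe` helper is hypothetical and accepts both the old and new shapes.

```python
from typing import List, Tuple

# Assumed v0.6.0 match shape: (label, start, end, ratio, pattern).
Match = Tuple[str, int, int, int, str]

# Made-up sample matches standing in for real matcher output.
matches: List[Match] = [
    ("NAME", 0, 2, 91, "Grant Andersen"),
    ("GPE", 5, 6, 100, "Nashville"),
]


def describe(match: Tuple) -> str:
    """Format a match, tolerating both the old 4-tuple and the new 5-tuple."""
    label, start, end, ratio, *rest = match
    pattern = rest[0] if rest else None  # absent before v0.6.0
    return f"{label} [{start}:{end}] ratio={ratio} pattern={pattern!r}"


for match in matches:
    print(describe(match))
```

Starred unpacking keeps downstream code working during a migration, but code that only targets v0.6.0 and later can unpack all five fields directly.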

23 Dec 19:29

github-actions

v0.5.4

1b6f797

v0.5.4 RegexSearcher Bugfix

What’s Changed

BugFix for German Combination words for RegexSearcher (#66) @JonasHablitzel
Including flake8 plugins in pre-commit (#63) @gandersen101

📚 Documentation

Updating available fuzzyfuncs in docs (#62) @gandersen101

Contributors

gandersen101 and JonasHablitzel

22 May 21:39

gandersen101

v0.5.3

c2895f6

v0.5.3 Bugfix: TokenMatcher Match Order

Fixed a "bug" in the TokenMatcher. Spaczz expects token matches returned in order of ascending match start, then descending match length. However, spaCy's Matcher does not return matches in this order by default. Added a sort in the TokenMatcher to ensure this.
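
The ordering described above (ascending match start, then descending match length) can be sketched with a composite sort key. This is a minimal illustration on plain (label, start, end) tuples, not the TokenMatcher's actual code; the sample data and the `sort_matches` name are hypothetical.

```python
from typing import List, Tuple

Match = Tuple[str, int, int]  # (label, start, end), end exclusive


def sort_matches(matches: List[Match]) -> List[Match]:
    """Sort by ascending start, then by descending length (end - start)."""
    return sorted(matches, key=lambda m: (m[1], -(m[2] - m[1])))


unsorted = [("A", 3, 5), ("B", 0, 1), ("C", 0, 4), ("D", 3, 4)]
print(sort_matches(unsorted))
# [('C', 0, 4), ('B', 0, 1), ('A', 3, 5), ('D', 3, 4)]
```

Negating the length in the key gives a descending secondary order without a second sort pass.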

04 May 15:34

gandersen101

v0.5.2

13fced9

v0.5.2 CI/Dev Updates

Minor updates to pre-commits and noxfile.

25 Apr 15:52

gandersen101

v0.5.1

d2161b5

v0.5.1 Dependency and Typing Updates

Minor updates to allowed dependency versions and CI.
Switched back to using typing types (e.g. typing.List) instead of built-in generic types (e.g. list[int]) because spaCy v3 uses Pydantic and Pydantic does not support built-in generic types in Python < 3.9. I don't know if this would actually cause any issues, but I am playing it safe. More changes may follow so that spaczz plays nicely with Pydantic.
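
For illustration, annotations written with the typing module work on every Python version spaczz targeted at the time, whereas subscripting built-ins such as list[int] raises a TypeError when evaluated on Python < 3.9. The function below is a hypothetical example, not spaczz code.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]


def group_spans(matches: List[Tuple[str, int, int]]) -> Dict[str, List[Span]]:
    """Group (label, start, end) matches into {label: [(start, end), ...]}."""
    grouped: Dict[str, List[Span]] = {}
    for label, start, end in matches:
        grouped.setdefault(label, []).append((start, end))
    return grouped


print(group_spans([("NAME", 0, 2), ("GPE", 5, 6), ("NAME", 8, 9)]))
# {'NAME': [(0, 2), (8, 9)], 'GPE': [(5, 6)]}
```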

01 Mar 19:22

github-actions

v0.5.0

de34205

v0.5.0 spaCy v3 Support

What’s Changed

🚀 Features

Enhancement spacy3 support (#52) @gandersen101
- Support for spaCy v3.
- If using spaCy v3, the SpaczzRuler optional arguments no longer need to be prepended with "spaczz_". This will still work in most cases offering some backwards compatibility. However, optional arguments prepended with "spaczz_" will not work with spaCy v3's new spacy.load and nlp.add_pipe config driven APIs. It is therefore recommended that users move away from using the prepended versions if using spaCy v3. It should be noted however that the prepended arguments are still necessary if using spaczz with spaCy v2.
- Matcher.pipe methods are now deprecated in accordance with spaCy v3.
- spaczz_span custom attribute is deprecated in favor of spaczz_ent. They both have the same functionality but the spaczz_ent name makes more sense.
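
When moving existing settings off the prepended argument names, a small helper can rename the keys in one pass. The sketch below is a hypothetical convenience function, not part of spaczz. The key spaczz_token_defaults is used because it appears in these notes; the values are placeholders.

```python
from typing import Any, Dict

PREFIX = "spaczz_"


def strip_spaczz_prefix(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with a leading "spaczz_" removed from each key."""
    cleaned: Dict[str, Any] = {}
    for key, value in config.items():
        new_key = key[len(PREFIX):] if key.startswith(PREFIX) else key
        if new_key in cleaned:
            # Both "spaczz_x" and "x" were given; refuse to pick one silently.
            raise ValueError(f"conflicting settings for {new_key!r}")
        cleaned[new_key] = value
    return cleaned


old_style = {"spaczz_token_defaults": {}, "overwrite_ents": True}
print(strip_spaczz_prefix(old_style))
# {'token_defaults': {}, 'overwrite_ents': True}
```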

25 Feb 03:44

gandersen101

0.4.2

e184570

v0.4.2 SpaczzRuler Bug Fixes

Fixed a bug where TokenMatcher callbacks did nothing.
Fixed a bug where spaczz_token_defaults in the SpaczzRuler did nothing.
Fixed a bug where defaults would not be added to their respective matchers when loading from bytes/disk in the SpaczzRuler.
Fixed some inconsistencies in the SpaczzRuler which will be particularly noticeable with ent_ids. See the "Known Issues" section below for more details.
Small tweaks to spaczz custom attributes.
Available fuzzy matching functions have changed in RapidFuzz and have changed in spaczz accordingly.
Preparing for spaCy v3 updates.

31 Jan 00:04

gandersen101

0.4.1

cbd2b19

v0.4.1 Phrasesearch Performance Improvements

Spaczz's phrase searching algorithm has been further optimized so both the FuzzyMatcher and SimilarityMatcher should run considerably faster.
The FuzzyMatcher and SimilarityMatcher now include a thresh parameter that defaults to 100. When matching, if flex > 0 and the match ratio is >= thresh during the initial scan of the document, no optimization will be attempted. By default perfect matches don't need to be run through match optimization.
flex now defaults to len(pattern) // 2. This creates a more meaningful difference between "default" and "max" with longer patterns.
PEP 585 code updates.
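
The flex default and the thresh shortcut can be sketched as two small functions. This is a reading of the notes above, not spaczz's implementation: the function names are hypothetical, pattern length is assumed to be counted in tokens, and treating flex == 0 as "nothing to optimize" is an inference.

```python
def default_flex(pattern_len: int) -> int:
    """Default flex: half the pattern length, rounded down."""
    return pattern_len // 2


def needs_optimization(ratio: int, flex: int, thresh: int = 100) -> bool:
    """Optimize only if boundaries may flex and the initial ratio is below thresh."""
    return flex > 0 and ratio < thresh


print(default_flex(5))                       # 2
print(needs_optimization(100, flex=2))       # False: perfect match, skipped
print(needs_optimization(90, flex=2))        # True
print(needs_optimization(90, flex=2, thresh=85))  # False: already good enough
```

Lowering thresh trades some match-boundary precision for speed, since more initial matches skip the optimization step.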

20 Jan 19:24

gandersen101

v0.4.0

1622623

v0.4.0 TokenMatcher

Adds the TokenMatcher to spaczz and integrates it with the SpaczzRuler. Also overhauls spaczz's custom attributes and includes some quality of life improvements and bug fixes.
