Releases: MinishLab/model2vec
v0.3.9
What's Changed
- docs: Added new model results by @Pringled in #167
- docs: Update plot by @Pringled in #169
- feat: add trust-remote-code option by @stephantul in #173
- feat: Add SIF-like coef by @stephantul in #174
- increase version by @stephantul in #176
Full Changelog: v0.3.8...v0.3.9
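The "SIF-like coef" entry (#174) is not described further in these notes. As a rough sketch, assuming it follows smooth inverse frequency (SIF) weighting, each token vector would be scaled by a / (a + p), where p is an estimate of the token's probability and a is a small coefficient. The function names below, the default a=1e-3, and the use of a Zipf (1/rank) estimate for p are illustrative assumptions, not the library's API.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of model2vec. They show the generic SIF weighting formula.

def zipf_probabilities(vocab_size):
    """Estimate token probabilities from rank alone (assumed Zipf, p ~ 1/rank)."""
    inverse_ranks = [1.0 / rank for rank in range(1, vocab_size + 1)]
    total = sum(inverse_ranks)
    return [value / total for value in inverse_ranks]


def sif_weights(probabilities, a=1e-3):
    """Return the SIF-style weight a / (a + p) for each token probability p."""
    if a <= 0:
        raise ValueError("a must be positive")
    return [a / (a + p) for p in probabilities]


if __name__ == "__main__":
    # Frequent (low-rank) tokens are down-weighted more than rare ones.
    for rank, weight in enumerate(sif_weights(zipf_probabilities(5)), start=1):
        print(rank, round(weight, 6))
```

Under these assumptions, a smaller coefficient suppresses frequent tokens more strongly, while a large one leaves all weights close to 1.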
v0.3.8
What's Changed
- docs: fix docstrings in distill by @stephantul in #157
- remove unnecessary import by @stephantul in #161
- remove deduplication tutorial by @stephantul in #159
- fix: issue with modernbert tokenizer, add token pattern to _distill by @stephantul in #158
- fix: fix typing issue by @stephantul in #162
- feat: float pca dims by @stephantul in #163
- feat: Add optional embedding normalization to StaticModel loading by @davidberenstein1957 in #164
- feat: Improve distill for modernBERT by @stephantul in #165
- increase version by @stephantul in #166
New Contributors
- @davidberenstein1957 made their first contribution in #164
Full Changelog: v0.3.7...v0.3.8
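The "float pca dims" entry (#163) does not say how a float is interpreted. One plausible reading, borrowed from scikit-learn's PCA, is that a float between 0 and 1 is a target fraction of explained variance rather than a component count. The sketch below shows that selection rule in isolation; the function name and the example ratios are made up for illustration.

```python
# Illustrative sketch only: components_for_variance is a hypothetical helper,
# not a model2vec function. It assumes a float PCA setting means
# "keep enough components to explain this fraction of the variance".

def components_for_variance(explained_variance_ratios, target):
    """Return the smallest number of leading components whose cumulative
    explained variance ratio reaches the target fraction."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    cumulative = 0.0
    for count, ratio in enumerate(explained_variance_ratios, start=1):
        cumulative += ratio
        if cumulative >= target:
            return count
    # The target was never reached, so keep every component.
    return len(explained_variance_ratios)


if __name__ == "__main__":
    ratios = [0.5, 0.3, 0.15, 0.05]  # made-up ratios, sorted descending
    print(components_for_variance(ratios, 0.9))
```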
v0.3.7
v0.3.6
What's Changed
- Add loading from st by @stephantul in #151
- Bump version by @Pringled in #152
Full Changelog: v0.3.5...v0.3.6
v0.3.5
v0.3.4
What's Changed
- docs: Add txtai integration docs by @Pringled in #130
- docs: Reworked documentation by @Pringled in #131
- feat: Added semantic chunking with chonkie tutorial by @Pringled in #133
- feat: Updated config values by @Pringled in #136
- feat: add support for a pattern for unused tokens by @stephantul in #138
- feat: Add multiprocessing by @Pringled in #141 (suggested by davidmezzetti in #139)
- feat: Added multiprocessing threshold parameter by @Pringled in #142
- docs: Add langchain example by @Pringled in #143
- fix: Removed unneeded tokenize call by @Pringled in #144
- docs: update README.md by @eltociear in #145
- Bump version by @Pringled in #146
New Contributors
- @eltociear made their first contribution in #145
Full Changelog: v0.3.3...v0.3.4
v0.3.3
What's Changed
- feat: Added onnx and tokenizer files support script by @Pringled in #119
- docs: Update readme by @Pringled in #122
- fix: Fixed CI by @Pringled in #124
- docs: Updated results table by @Pringled in #125
- docs: Updated slogan by @Pringled in #127
- fix: Added jinja2 requirement by @Pringled in #128
- Bumped version by @Pringled in #129
Full Changelog: v0.3.2...v0.3.3
v0.3.2
v0.3.1
What's Changed
- fix: update added tokens to be more agnostic by @stephantul in #107
- fix: don't rely on reported vocab size, log warning if inconsistent by @stephantul in #109
- docs: Fixed broken links by @Pringled in #112
- feat: make encode_batch_fast optional by @stephantul in #113
- fix: normalize would lead to NaN for empty docs by @stephantul in #114
- docs: Add tokenlearn results by @Pringled in #116
- docs: Updated plot by @Pringled in #117
- Bump version by @Pringled in #118
Full Changelog: v0.3.0...v0.3.1
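The NaN fix for empty documents (#114) is only named here, not explained. A likely cause, stated as an assumption, is that an empty document produces an all-zero vector, and dividing it by its zero L2 norm gives NaN in NumPy. A minimal guard, using a hypothetical helper in plain Python, might look like this:

```python
import math

# Illustrative sketch only: safe_normalize is a hypothetical helper, not the
# library's implementation. In plain Python a division by zero raises
# ZeroDivisionError; the equivalent NumPy array division yields NaN instead.

def safe_normalize(vector):
    """L2-normalize a vector, returning a zero-norm vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # Nothing to scale: an all-zero (or empty) vector stays as it is.
        return list(vector)
    return [x / norm for x in vector]


if __name__ == "__main__":
    print(safe_normalize([3.0, 4.0]))
    print(safe_normalize([0.0, 0.0]))
```

Returning the zero vector unchanged is one possible design choice; another is to divide by the norm plus a small epsilon, which avoids the branch at the cost of slightly shrinking every vector.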
v0.3.0
What's Changed
- fix: Fix token type ids not supported by @Pringled in #77
- docs: Add deduplication tutorial by @Pringled in #72
- Fix distill model bos and eos token by @zechengz in #78
- docs: Added Sentence Transformers example code by @Pringled in #80
- docs: Update readme by @Pringled in #81
- docs: Move results and add blogpost by @Pringled in #82
- docs: Fixed broken link by @Pringled in #84
- fix: move tensor to cpu by @stephantul in #86
- feat: Numpy inference by @stephantul in #87
- feat: local loading by @stephantul in #88
- feat: faster tokenization by @stephantul in #89
- enhancement: Add dynamic version by @stephantul in #91
- enhancement: Add explained variance messages by @stephantul in #92
- docs: Updated slogan by @Pringled in #94
- feat: Add python3.9 support by @Pringled in #96
- enh: remove CLI command by @stephantul in #98
- fix: rename show progress bar argument by @stephantul in #99
- fix: Reverted eos bos change by @Pringled in #101
- docs: Added results link by @Pringled in #102
- docs: Fix broken link by @Pringled in #103
- increment version by @stephantul in #104
New Contributors
Full Changelog: v0.2.4...v0.3.0