Releases: winkjs/wink-nlp
Fixed some type definitions
Version 2.3.1 Nov 24, 2024
🐛 Fixes
- Updated the types of some BM25Vectorizer methods to match the implementation. Thanks to @pavloDeshko ✅
Enabled more special space characters handling
Version 2.3.0 May 19, 2024
✨ Features
- Detokenization now restores em/en, third/quarter, thin/hair, and medium mathematical space characters, as well as narrow non-breaking space characters, in addition to the regular nbsp. 👏 🙌 🛰️
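To picture what restoring these characters means, here is a minimal standalone sketch. It is not winkNLP's internal implementation: the `tokenize` and `detokenize` helpers and the `precedingSpaces` field are hypothetical names used only for illustration. The idea is simply that each token remembers the exact space characters that came before it, so the original text can be rebuilt verbatim.

```javascript
// Hypothetical illustration; not winkNLP's actual tokenizer or detokenizer.
// Space characters named in the release note, keyed by Unicode code point.
const SPACES = {
  '\u0020': 'regular space',
  '\u00A0': 'no-break space (nbsp)',
  '\u2002': 'en space',
  '\u2003': 'em space',
  '\u2004': 'three-per-em (third) space',
  '\u2005': 'four-per-em (quarter) space',
  '\u2009': 'thin space',
  '\u200A': 'hair space',
  '\u202F': 'narrow no-break space',
  '\u205F': 'medium mathematical space'
};

// Split text into tokens, recording the exact spacing that preceded each one.
// Simplification: spaces after the last token are not captured in this sketch.
function tokenize(text) {
  const spaceClass = Object.keys(SPACES).join('');
  const rgx = new RegExp(`([${spaceClass}]*)([^${spaceClass}]+)`, 'g');
  const tokens = [];
  let match;
  while ((match = rgx.exec(text)) !== null) {
    tokens.push({ precedingSpaces: match[1], value: match[2] });
  }
  return tokens;
}

// Rebuild the text by putting every token back behind its original spacing.
function detokenize(tokens) {
  return tokens.map((t) => t.precedingSpaces + t.value).join('');
}

const text = '10\u202Fkm\u2003east\u00A0of\u2009here';
console.log(detokenize(tokenize(text)) === text); // true
```

A plain `tokens.join(' ')` would collapse every one of these characters into a regular space, which is the loss this release avoids.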
Improved error handling in contextual vectors
Version 2.2.2 May 08, 2024
✨ Features
- .contextualVectors() now throws an error in two cases: (a) word vectors are not loaded, and (b) with lemma: true, "pos" is missing from the NLP pipe. 🤓
🐛 Fixes
- Refined typescript definitions further. ✅
Added missing typescript definitions
Version 2.2.1 May 06, 2024
🐛 Fixes
- Added missing TypeScript definitions for word embeddings, along with a few other TypeScript fixes. ✅
Added non-breaking space handling capabilities
Version 2.2.0 April 03, 2024
✨ Features
- Detokenization restores both regular and non-breaking spaces to their original positions. 🤓
Introducing cosine similarity for word vectors
Version 2.1.0 March 24, 2024
✨ Features
- You can now use similarity.vector.cosine( vectorA, vectorB ) to compute similarity between two vectors on a scale of 0 to 1. 🤓
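For readers who want to see the underlying arithmetic, the following is a minimal standalone sketch of textbook cosine similarity. It is not the library's code: the local `cosine` function is a hypothetical stand-in, and the zero-vector handling is an assumption of this sketch. Note that the plain formula can return negative values for vectors pointing in opposite directions; how winkNLP arrives at the 0 to 1 scale is not described in this note.

```javascript
// Textbook cosine similarity: dot(a, b) / (|a| * |b|).
// Hypothetical stand-in for illustration; not winkNLP's implementation.
function cosine(vectorA, vectorB) {
  if (vectorA.length !== vectorB.length) {
    throw new Error('Vectors must have the same length.');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < vectorA.length; i += 1) {
    dot += vectorA[i] * vectorB[i];
    normA += vectorA[i] * vectorA[i];
    normB += vectorB[i] * vectorB[i];
  }
  // Assumption of this sketch: a zero vector is treated as similar to nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosine([1, 0], [1, 0])); // 1 (same direction)
console.log(cosine([1, 0], [0, 1])); // 0 (orthogonal)
```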
Word embeddings have arrived!
Version 2.0.0 March 24, 2024
✨ Features
- Seamless word embedding integration enhances winkNLP's semantic capabilities. 🎉 👏 🙌
- Pre-trained 100-dimensional word embeddings for over 350,000 English words released: wink-embeddings-sg-100d. 💯
- API remains unchanged; no code updates needed for existing projects. The new APIs include: 🤩
  - Obtain vector for a token: Use the .vectorOf( token ) API.
  - Compute sentence/document embeddings: Employ the as.vector helper: use .out( its.lemma, as.vector ) on tokens of a sentence or document. You can also use its.value or its.normal. Tokens can be pre-processed to remove stop words etc. using the .filter() API. Note, the as.vector helper uses an averaging technique.
  - Generate contextual vectors: Leverage the .contextualVectors() method on a document. Useful for pure browser-side applications! Generate custom vectors contextually relevant to your corpus and use them in place of larger pre-trained wink embeddings.
- Comprehensive documentation along with interesting examples is coming up shortly. Stay tuned for updates! 😎
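Until that documentation lands, the averaging technique mentioned for the as.vector helper can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors, the `toyVectors` lookup table, the stop word list, and the `averageVectors` helper are hypothetical and are not part of winkNLP, whose released embeddings are 100-dimensional.

```javascript
// Hypothetical toy data; real wink embeddings are 100-dimensional.
const toyVectors = {
  quick: [1, 0, 2],
  brown: [3, 2, 0],
  fox: [2, 4, 4]
};
const stopWords = new Set(['the', 'a', 'an']);

// Element-wise mean of equally sized vectors; returns [] for no input.
function averageVectors(vectors) {
  if (vectors.length === 0) return [];
  const sum = new Array(vectors[0].length).fill(0);
  for (const vector of vectors) {
    for (let i = 0; i < sum.length; i += 1) sum[i] += vector[i];
  }
  return sum.map((total) => total / vectors.length);
}

// Drop stop words first (the role .filter() plays in the note above),
// then average the vectors of the remaining tokens.
const tokens = ['the', 'quick', 'brown', 'fox'];
const kept = tokens.filter((t) => !stopWords.has(t) && t in toyVectors);
const sentenceVector = averageVectors(kept.map((t) => toyVectors[t]));

console.log(sentenceVector); // [ 2, 2, 2 ]
```

Averaging keeps the sentence vector in the same space and dimension as the token vectors, so the two can be compared with the same similarity measure.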
Added Deno example
Version 1.14.3 July 21, 2023
✨ Features
- Added a live example for how to run winkNLP on Deno. 👍
Fixed a bug
Version 1.14.2 July 1, 2023
🐛 Fixes
- Parameters in markup() are now optional in TS code; squashed a TypeScript declaration bug. 🙌