Release Note Finetuner 0.8.0

This release covers Finetuner version 0.8.0, including its dependency finetuner-core 0.13.9.

This release contains 1 new feature and 1 refactoring.

🆕 Features

Add Jina embeddings suite (#757)

We have released three pre-trained embedding models to the open-source community:

jina-embedding-s-en-v1: 35 million parameter compact embedding model.
jina-embedding-b-en-v1: 110 million parameter standard-sized embedding model.
jina-embedding-l-en-v1: 330 million parameter large embedding model.

We trained all three models on Jina AI's Linnaeus-Clean dataset, which consists of 380 million query-document sentence pairs. These pairs were curated from a variety of domains in the Linnaeus-Full dataset, which contains 1.6 billion sentence pairs, through a thorough cleaning process.

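To use these embeddings with Finetuner, install the package and encode your text as shown below. The first line uses notebook syntax; in a shell, run pip install finetuner without the leading "!".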

!pip install finetuner
import finetuner

model = finetuner.build_model('jinaai/jina-embedding-s-en-v1')
embeddings = finetuner.encode(
    model=model,
    data=['how is the weather today', 'What is the current weather like today?']
)
print(finetuner.cos_sim(embeddings[0], embeddings[1]))
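
To make the printed score easier to interpret, here is a minimal pure-Python sketch of the standard cosine similarity formula. It assumes that finetuner.cos_sim computes this same quantity; the function name cosine_similarity and the toy vectors are illustrative and not part of Finetuner.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # parallel vectors
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal vectors
```

Scores range from -1 to 1. Values close to 1 mean the two vectors point in nearly the same direction, which for sentence embeddings is usually read as the sentences being semantically similar.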

⚙ Refactoring

Change installation behavior (#757)

Starting with Finetuner 0.8.0, pip install finetuner automatically installs the necessary torch-related dependencies, so the default installation can be used directly as a provider of embedding models. If you intend to fine-tune an embedding model, install Finetuner with all of its additional dependencies by running pip install "finetuner[full]".
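
After installing, a quick check like the following can confirm which packages are visible in the current environment. This is a sketch using only the Python standard library; the helper is_installed is hypothetical, and the import names finetuner and torch are assumed to match the installed packages. It only reports whether a module can be found, not whether the "finetuner[full]" extras were installed.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# 'finetuner' and 'torch' are the import names assumed for this check.
for name in ("finetuner", "torch"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```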

🤟 Contributors

We would like to thank all contributors to this release:

Wang Bo (@bwanglzu)
Louis Milliken (@LMMilliken)
Michael Günther (@guenthermi)
George Mastrapas (@gmastrapas)
Scott Martens (@scott-martens)
Jonathan Geuter (@j-geuter)
