This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

v0.8.0

@bwanglzu bwanglzu released this 13 Jul 13:54
· 4 commits to main since this release
82be27f

Release Note Finetuner 0.8.0

This release covers Finetuner version 0.8.0, including dependency finetuner-core 0.13.9.

This release contains 1 new feature and 1 refactoring.

🆕 Features

Add Jina embeddings suite (#757)

We have released three pre-trained embedding models to the open-source community:

  1. jina-embedding-s-en-v1: 35 million parameter compact embedding model.
  2. jina-embedding-b-en-v1: 110 million parameter standard-sized embedding model.
  3. jina-embedding-l-en-v1: 330 million parameter large embedding model.

We trained all three models on Jina AI's Linnaeus-Clean dataset, which consists of 380 million query-document sentence pairs. These pairs were curated from a variety of domains in the Linnaeus-Full dataset, which contains 1.6 billion sentence pairs, through a thorough cleaning process.

To use these models with Finetuner, install the package, build a model, and encode your text:

!pip install finetuner
import finetuner

model = finetuner.build_model('jinaai/jina-embedding-s-en-v1')
embeddings = finetuner.encode(
    model=model,
    data=['how is the weather today', 'What is the current weather like today?']
)
print(finetuner.cos_sim(embeddings[0], embeddings[1]))
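For reference, the score printed above is a cosine similarity: the dot product of two vectors divided by the product of their lengths. The following is a minimal standalone sketch of that formula in plain Python. It is not Finetuner's implementation, and the helper name cosine_similarity is hypothetical; the toy vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 2-dimensional vectors, not real model output.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Values close to 1 indicate that two sentences point in nearly the same direction in embedding space, while values near 0 indicate little similarity.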

⚙ Refactoring

Change installation behavior (#757)

Starting with Finetuner 0.8.0, pip install finetuner automatically installs the required torch-related dependencies, so Finetuner can serve as a provider of embedding models out of the box. If you intend to fine-tune an embedding model, install Finetuner with all additional dependencies by running pip install "finetuner[full]".
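To confirm what a given installation pulled in, one option is to check whether the relevant packages are importable. This is a minimal sketch using only the standard library; the helper name is_installed is hypothetical, and the list of names checked is an assumption based on the torch-related dependencies mentioned above, not an official list of what either install variant provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumed names to check; adjust to the packages you care about.
for name in ("finetuner", "torch"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```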

🤟 Contributors

We would like to thank all contributors to this release: