15 Feb 15:50

5ee6aad

v1.7.0-rc.1 🐇 Pre-release

Pre-release

⚠️ Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.

What's Changed since previous RC

Make several indexing optimizations by @ManyTheFish in #4350
Update charabia by @ManyTheFish in #4365
Implement the experimental log mode cli flag and log level updates at runtime by @irevoire in #4410
Output logs to stderr by @irevoire in #4418

Contributors

ManyTheFish and irevoire

Assets 7

13 Feb 16:04

dureuill

v1.6.2

1a083d5

v1.6.2 🦊

Fixes 🪲

Disable incremental facet update as a stop-gap by @dureuill in #4408

Contributors

dureuill

Assets 8

12 Feb 11:38

irevoire

v1.7.0-rc.0

15dafde

v1.7.0-rc.0 🐇 Pre-release

Pre-release

⚠️ Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.

Meilisearch v1.7.0 mostly focuses on improving v1.6.0 features, indexing speed and hybrid search. GPU computing is now supported.

New features and improvements 🔥

Improve AI with Meilisearch (experimental feature)

🗣️ AI work is still experimental, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

To use it, you need to enable vectorSearch through the /experimental-features route.

💡 More documentation about AI search with Meilisearch here.

Add new OpenAI embedding models & ability to override their models dimensions

When using OpenAi as source in your embedders index settings (an example here), you can now specify two new models:

text-embedding-3-small with a default dimension of 1536.
text-embedding-3-large with a default dimension of 3072.

The new models:

are cheaper
produce more relevant results in standardized tests
allow to set up the dimensions of the embeddings to control the trade-off between accuracy and performance (including storage)

It means that it is now possible to pass the dimensions field when using the OpenAi source. This was previously only available for the userProvided source in the previous releases.

There are some rules, though, which we detail with these examples:

"embedders": {
  "large": {
    "source": "openAi",
    "model": "text-embedding-3-large",
    "dimensions": 512 // must be >0, must be <= 3072 for "text-embedding-3-large"
  },
  "small": {
    "source": "openAi",
    "model": "text-embedding-3-small",
    "dimensions": 1024 // must be >0, must be <= 1536 for "text-embedding-3-small"
  },
  "legacy": {
    "source": "openAi",
    "model": "text-embedding-ada-002",
    "dimensions": 1536 // must =1536  for "text-embedding-ada-002"
  },
  "omitted_dimensions": { // uses the default dimension
    "source": "openAi",
    "model": "text-embedding-ada-002",
  }
}

Done in #4375 by @Gosti.

Add GPU support to compute embeddings

Enabling the CUDA feature allows using an available GPU to compute embeddings with a huggingFace embedder.
On an AWS Graviton 2, this yields a x3 - x5 improvement on indexing time.

👇 How to enable GPU support through CUDA for HuggingFace embedding generation:

Prerequisites

Linux distribution with a compatible CUDA version
NVidia GPU with CUDA support
A recent Rust compiler to compile Meilisearch from source

Steps

Follow the guide to install the CUDA dependencies
Clone Meilisearch: git clone https://github.com/meilisearch/meilisearch.git
Compile Meilisearch with the cuda feature: cargo build --release --package meilisearch --features cuda
In the freshly compiled Meilisearch, enable the vector store experimental feature:

❯ curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
--data-binary '{ "vectorStore": true }'

Add an HuggingFace embedder to the settings:

curl \
-X PATCH 'http://localhost:7700/indexes/your_index/settings/embedders' \
-H 'Content-Type: application/json' --data-binary \
'{ "default": { "source": "huggingFace" } }'

Done by @dureuill in #4304.

Improve indexing speed & reduce memory crashes

Auto-batch the task deletions to reduce indexing time (#4316) @irevoire
Improve indexing speed for vector store (makes the Hybrid search experimental feature indexing time more than 10 times faster) (#4332) @Kerollmops @irevoire
Reduce memory usage, so reduce the memory crashes, by capping the maximum memory of the grenad sorters (#4388) @Kerollmops

Stabilize `scoreDetails` feature

In v1.3.0, we introduced the experimental feature scoreDetails. We got enough positive feedback on the feature, and we now stabilize it, making this feature enabled by default.

View detailed scores per ranking rule for each document with the showRankingScoreDetails search parameter:

curl \
  -X POST 'http://localhost:7700/indexes/movies/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "Batman Returns", "showRankingScoreDetails": true }'

When showRankingScoreDetails is set to true, returned documents include a _rankingScoreDetails field. This field contains score values for each ranking rule.

"_rankingScoreDetails": {
  "words": {
    "order": 0,
    "matchingWords": 1,
    "maxMatchingWords": 1,
    "score": 1.0
  },
  "typo": {
    "order": 1,
    "typoCount": 0,
    "maxTypoCount": 1,
    "score": 1.0
  },
  "proximity": {
    "order": 2,
    "score": 1.0
  },
  "attribute": {
    "order": 3,
    "attributes_ranking_order": 0.8,
    "attributes_query_word_order": 0.6363636363636364,
    "score": 0.7272727272727273
  },
  "exactness": {
    "order": 4,
    "matchType": "noExactMatch",
    "matchingWords": 0,
    "maxMatchingWords": 1,
    "score": 0.3333333333333333
  }
}

Done by @dureuill in #4389.

Logs improvements

We made some changes regarding our logs to help with debugging and bug reporting.

Done by @irevoire in #4391

Log format change

⚠️ If you did any automation based on Meilisearch logs, be aware of the changes. More information here.

The default log format evolved slightly from this:

[2024-02-06T14:54:11Z INFO  actix_server::builder] starting 10 workers

To this:

2024-02-06T13:58:14.710803Z  INFO actix_server::builder: 200: starting 10 workers

Experimental: new routes to manage logs

This new version of Meilisearch introduces 3 new experimental routes

POST /logs/stream: streams the log happening in real-time. Requires two parameters:
- target: selects what logs you’re interested in. It takes the form of code_part=log_level. For example, index_scheduler=info
- mode: selects in what format of log you want. Two options are available: human (basic logs) or profile (a way more complex trace)
DELETE /logs/stream: stops the listener from the meilisearch perspective. Does not require any parameters.

💡 More information in the New experimental routes section of this file.

⚠️ Some remarks on this POST /logs/stream route:

You can have only one listener at a time
Listening to the route doesn’t seem to work with xh or httpie for the moment
When killing the listener, it may stay installed on Meilisearch for some time, and you will need to call the DELETE /logs/stream route to get rid of it.

🗣️ This feature is experimental, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

⚠️ Experimental features may be incompatible between Meilisearch versions.

Other improvements

Related to the Prometheus experimental feature: add job variable to Grafana dashboard (#4330) @capJavert

Misc

Dependencies upgrade
- Bump rustls-webpki from 0.101.3 to 0.101.7 (#4263)
- Bump h2 from 0.3.20 to 0.3.24 (#4345)
- Update the dependencies (#4332) @Kerollmops
CIs and tests
- Update SDK test dependencies (#4293) @curquiza
Documentation
- Add Setting API reminder in issue template (#4325) @ManyTheFish
- Update README (#4319) @codesmith-emmy
Misc
- Fix compilation warnings (#4295) @irevoire

❤️ Thanks again to our external contributors:

Meilisearch: @capJavert, @codesmith-emmy and @Gosti

Contributors

Kerollmops, Gosti, and 6 other contributors

Assets 7

31 Jan 15:45

curquiza

v1.6.1

b26ddfc

v1.6.1 🦊

Fixes 🪲

Restore highlighting when possible for hybrid search by @dureuill in #4352
Fix geo error message by @irevoire in #4366
Fixes embedder issues by @dureuill in #4371
Update mini-dashboard to v0.2.13: reduce the number of requests, by @mdubus in #4378

Thanks @carlosbaraza for the report ❤️

Contributors

carlosbaraza, irevoire, and 2 other contributors

Assets 8

15 Jan 10:43

curquiza

v1.6.0

a6fa0b9

v1.6.0 🦊

Meilisearch v1.6 focuses on improving indexing performance. This new release also adds hybrid search and simplifies the process of generating embeddings for semantic search.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 to 48 hours after a new version becomes available.

Some SDKs might not include all new features—consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and improvements 🔥

Experimental: Automated embeddings generation for vector search

With v1.6, you can configure Meilisearch so it automatically generates embeddings using either OpenAI or HuggingFace. If neither of these third-party options suit your application, you may provide your own embeddings manually:

openAI: Meilisearch uses the OpenAI API to auto-embed your documents. You must supply an OpenAPI key to use this embedder
huggingFace: Meilisearch automatically downloads the specified model from HuggingFace and generates embeddings locally. This will use your CPU and may impact indexing performance
userProvided: Compute embeddings manually and supply document vectors to Meilisearch. You may be familiar with this approach if you have used vector search in a previous Meilisearch release. Read further for details on breaking changes for user provided embeddings usage

Usage

Use the embedders index setting to configure embedders. You may set multiple embedders for an index. This example defines 3 embedders named default, image and translation:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "default": {
        "source":  "openAi",
        "apiKey": "<your-OpenAI-API-key>",
        "model": "text-embedding-ada-002",
        "documentTemplate": "A movie titled \'{{doc.title}}\' whose description starts with {{doc.overview|truncatewords: 20}}"
      },
      "image": {
        "source": "userProvided",
        "dimensions": 512,
      },
      "translation": {
        "source": "huggingFace",
        "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
        "documentTemplate": "A movie titled \'{{doc.title}}\' whose description starts with {{doc.overview|truncatewords: 20}}"
      }
    }
  }'

documentTemplate is a view of your document that will serve as the base for computing the embedding. This field is a JSON string in the Liquid format
model is the model OpenAI or HuggingFace should use when generating document embeddings

Refer to the documentation for more vector search usage instructions.

⚠️ Vector search breaking changes

If you have used vector search between v1.3.0 and v1.5.0, API usage has changed with v1.6:

When providing both the q and vector parameters for a single query, you must provide the hybrid parameter
Define a model in your embedder settings is now mandatory:

"embedders": {
    "default": {
      "source": "userProvided",
      "dimensions": 512
    }
}

Vectors should be JSON objects instead of arrays:

"_vectors": { "image2text": [0.0, 0.1, …] } # ✅
"_vectors": [ [0.0, 0.1] ] # ❌

Done in #4226 by @dureuill, @irevoire, @Kerollmops and @ManyTheFish.

Experimental: Hybrid search

This release introduces hybrid search functionality. Hybrid search allows users to mix keyword and semantic search at search time.

Use the hybrid search parameter to perform a hybrid search:

curl \
  -X POST 'http://localhost:7700/indexes/movies/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "Plumbers and dinosaurs",
    "hybrid": {
        "semanticRatio": 0.9,
        "embedder": "default"
    }
  }'

embedder is the embedder you choose to perform the search among the ones you defined in your settings
semanticRatio is a number between 0 and 1. The default value is 0.5. 1 corresponds to a full semantic search and 0 corresponds to keyword search

Tip

The new vector search functionality uses Arroy, a Rust library developed by the Meilisearch engine team. Check out @Kerollmops blog post describing the whole process.

Done in #4226 by @dureuill, @irevoire, @Kerollmops and @ManyTheFish.

Improve indexing speed

This version introduces significant indexing performance improvements. Meilisearch v1.6 has been optimized to:

store and pre-compute less data than in previous versions
re-index and delete only the necessary data when updating a document. For example, when you update one document field, Meilisearch will no longer re-index the whole document

On an e-commerce dataset of 2.5Gb of documents, these changes led to more than a 50% time reduction when adding documents for the first time. When updating documents frequently and partially, re-indexing performance hovers between 50% and 75%.

⚠️ Performance improvements depend on your dataset, your machine, and how you're indexing documents.

Done in #4090 by @ManyTheFish, @dureuill and @Kerollmops.

Disk space usage reduction

Meilisearch now stores less internal data. This leads to smaller database disk sizes.

With a ~15Mb dataset, the created database is 40% and 50% smaller. Additionally, the database size has become more stable and will display more modest growth with new document additions.

Proximity precision and performance

You can now customize the accuracy of the proximity ranking rule.

Computing this ranking rule uses a significant amount of resources and may lead to increased indexing times. Lowering its precision may lead to significant performance gains. In a minority of use cases, lower proximity precision may also impact relevancy for queries using multiple search terms.

Usage

curl \
  -X PATCH 'http://localhost:7700/indexes/books/settings/proximity-precision' \
  -H 'Content-Type: application/json'  \
  --data-binary '{
    "proximityPrecision": "byAttribute"
  }'

proximityPrecision accepts either byWord or byAttribute:

byWord calculates the exact distance between words. This is the default setting.
byAttribute only determines whether words are present in the same attribute. It is less accurate, but provides better performance.

Done in #4225 by @ManyTheFish.

Experimental: Limit the number of batched tasks

Meilisearch may occasionally batch too many tasks together, which may lead to system instability. Relaunch Meilisearch with the --experimental-max-number-of-batched-tasks configuration option to address this issue:

./meilisearch --experimental-max-number-of-batched-tasks 100

You may also configure --experimental-max-number-of-batched-tasks as an environment variable or directly in the config file with MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS.

Done in #4249 by @Kerollmops

Task queue webhook

This release introduces a configurable webhook url that will be called whenever Meilisearch finishes processing a task.

Relaunch Meilisearch using --task-webhook-url and --task-webhook-authorization-header to use the webhook:

./meilisearch \
  --task-webhook-url=https://example.com/example-webhook?foo=bar&number=8 \
  --task-webhook-authorization-header=Bearer aSampleAPISearchKey

You may also define the webhook URL and header with environment variables or in the configuration file with MEILI_TASK_WEBHOOK_URL and MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER.

Done by @irevoire in #4238

Fixes 🐞

Fix document formatting performances during search (#4313) @ManyTheFish
The dump tasks are now cancellable (#4208) @irevoire
Fix: the payload size limit is now also applied to all routes, not only routes to add and update documents (#4231) @Karribalu
Fix: typo tolerance is ineffective for attributes with similar content (related issue: #4256)
Fix: the geosort is no longer ignored after the first bucket of a preceding sort ranking rule (#4226)
Fix hang on /indexes and /stats routes (#4308) @dureuill
Limit the number of values returned by the facet search based on maxValuePerFacet setting (#4311) @Kerollmops

Misc

Dependencies upgrade
- Updating CI dependencies
- Update to heed 0.20 (#4223) @Kerollmops
- Set rust toolchain to 1.71.1 in Dockerfile (#4261) @dureuill
- Update mini-dashboard to v0.2.12 (#4277) @mdubus
Documentation
- Remove banner (#4191) @curquiza
Misc
- Extract the creation and last updated timestamp from v2 dumps (#4132) @vivek-26
- Fix puffin in the index scheduler (#4234) @irevoire
- Remove the actix-web dependency from (#4239) @Kerollmops

❤️ Thanks again to our external contributors: @Karribalu, and @vivek-26

Contributors

Kerollmops, ManyTheFish, and 6 other contributors

Assets 8

0 Join discussion

11 Jan 15:50

ManyTheFish

v1.6.0-rc.8

e93d36d

v1.6.0-rc.8 🦊 Pre-release

Pre-release

Fixes

Fix proximity precision telemetry by @ManyTheFish in #4314
Fix document formatting performances by @ManyTheFish in #4313

Contributors

ManyTheFish

Assets 7

11 Jan 11:59

Kerollmops

v1.6.0-rc.7

1f5e8fc

v1.6.0-rc.7 🦊 Pre-release

Pre-release

What's Changed

Limit the number of values returned by the facet search by @Kerollmops in #4311

Full Changelog: v1.6.0-rc.6...v1.6.0-rc.7

Contributors

Kerollmops

Assets 7

10 Jan 16:08

curquiza

v1.6.0-rc.6

93363b0

v1.6.0-rc.6 🦊 Pre-release

Pre-release

Fix

Fix hang on /indexes and /stats routes by @dureuill in #4308

Contributors

dureuill

Assets 7

08 Jan 15:19

curquiza

v1.6.0-rc.5

5ee1378

v1.6.0-rc.5 🦊 Pre-release

Pre-release

Fix

Display default value when proximityPrecision is not set (byWord and not null) #4303 @ManyTheFish

Contributors

ManyTheFish

Assets 7

04 Jan 11:51

curquiza

v1.6.0-rc.4

6203f4a

v1.6.0-rc.4 🦊 Pre-release

Pre-release

Fixes

Fix issue with hybrid/vector search (#4296) @dureuill
Fix CI issue (#4294) @irevoire

Contributors

irevoire and dureuill

Assets 7

Releases: meilisearch/meilisearch

v1.7.0-rc.1 🐇

What's Changed since previous RC

Contributors

v1.6.2 🦊

Fixes 🪲

Contributors

v1.7.0-rc.0 🐇

New features and improvements 🔥

Improve AI with Meilisearch (experimental feature)

Add new OpenAI embedding models & ability to override their models dimensions

Add GPU support to compute embeddings

Improve indexing speed & reduce memory crashes

Stabilize scoreDetails feature

Logs improvements

Log format change

Experimental: new routes to manage logs

Other improvements

Misc

Contributors

v1.6.1 🦊

Fixes 🪲

Contributors

v1.6.0 🦊

New features and improvements 🔥

Experimental: Automated embeddings generation for vector search

Usage

⚠️ Vector search breaking changes

Experimental: Hybrid search

Improve indexing speed

Disk space usage reduction

Proximity precision and performance

Usage

Experimental: Limit the number of batched tasks

Task queue webhook

Fixes 🐞

Misc

Contributors

v1.6.0-rc.8 🦊

Fixes

Contributors

v1.6.0-rc.7 🦊

What's Changed

Contributors

v1.6.0-rc.6 🦊

Fix

Contributors

v1.6.0-rc.5 🦊

Fix

Contributors

v1.6.0-rc.4 🦊

Fixes

Contributors

Stabilize `scoreDetails` feature