Commit c7efc24

Apply suggestions from code review
Co-authored-by: gui machiavelli <gui@meilisearch.com>
1 parent 0f8ad04 commit c7efc24

1 file changed

+13
-13
lines changed

learn/indexing/optimize_indexing_performance.mdx

Lines changed: 13 additions & 13 deletions
@@ -1,17 +1,17 @@
11
---
2-
title: Optimize indexing performance by analyzing batch statistics
2+
title: Optimize indexing performance with batch statistics
33
description: Learn how to analyze the `progressTrace` to identify and resolve indexing bottlenecks in Meilisearch.
44
---
55

66
# Optimize indexing performance by analyzing batch statistics
77

88
Indexing performance can vary significantly depending on your dataset, index settings, and hardware. The [batch object](/reference/api/batches) provides information about the progress of asynchronous indexing operations.
99

10-
The `progressTrace` field within the batch object offers a detailed breakdown of where time is spent during the indexing process. By analyzing this data, you can identify bottlenecks and adjust configuration settings to improve indexing speed.
10+
The `progressTrace` field within the batch object offers a detailed breakdown of where time is spent during the indexing process. Use this data to identify bottlenecks and improve indexing speed.
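As a rough illustration of using this data, the sketch below ranks the entries of a `progressTrace` object by duration. It is not part of the Meilisearch SDK: `toSeconds`, `slowestSteps`, and `sampleTrace` are hypothetical names, the sample keys are partly invented, and the code assumes each duration is a string with an `s`, `ms`, `µs`, or `ns` suffix (only `s` values appear in this guide). It also assumes parent entries include the time of their children, so it keeps leaf entries only.

```javascript
// Assumed duration suffixes and their value in seconds.
const UNIT_SECONDS = { ns: 1e-9, '\u00b5s': 1e-6, us: 1e-6, ms: 1e-3, s: 1 };

// Convert a duration string such as "33.71s" into seconds (NaN if unrecognized).
function toSeconds(duration) {
  const match = /^([\d.]+)\s*(ns|\u00b5s|us|ms|s)$/.exec(String(duration).trim());
  return match ? parseFloat(match[1]) * UNIT_SECONDS[match[2]] : NaN;
}

// Return the slowest leaf steps of a progressTrace, longest first.
function slowestSteps(progressTrace, limit = 5) {
  const keys = Object.keys(progressTrace);
  return keys
    // Drop parent entries: a key is a parent if another key extends it with " > ".
    .filter((key) => !keys.some((other) => other.startsWith(key + ' > ')))
    .map((key) => ({ step: key, seconds: toSeconds(progressTrace[key]) }))
    .filter(({ seconds }) => Number.isFinite(seconds))
    .sort((a, b) => b.seconds - a.seconds)
    .slice(0, limit);
}

// Hypothetical trace: key names and most durations are illustrative only.
const sampleTrace = {
  'processing tasks': '1800.00s',
  'processing tasks > indexing': '1799.00s',
  'processing tasks > indexing > post processing facets': '1763.06s',
  'processing tasks > indexing > post processing facets > facet search': '1763.06s',
  'processing tasks > indexing > extracting word proximity': '33.71s',
};

console.log(slowestSteps(sampleTrace, 3));
```

In practice, you would pass the `progressTrace` field of a batch object retrieved from the batches API instead of `sampleTrace`.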
1111

1212
## Understanding the `progressTrace`
1313

14-
The `progressTrace` is a hierarchical trace showing each phase of indexing and how long it took.
14+
`progressTrace` is a hierarchical trace showing each phase of indexing and how long it took.
1515
Each entry follows the structure:
1616

1717
```json
@@ -24,15 +24,15 @@ This means:
2424
- The subtask was **extracting word proximity**.
2525
- It took **33.71 seconds**.
2626

27-
Your goal is to focus on the **longest-running steps** and understand which index settings or data characteristics influence them.
27+
Focus on the **longest-running steps** and investigate which index settings or data characteristics influence them.
2828

2929
## Key phases and how to optimize them
3030

3131
### Document processing
3232

3333
| Trace key | Description | Optimization |
3434
|------------|--------------|--------------|
35-
| `computing document changes`, `extracting documents` | Meilisearch compares incoming documents to existing ones. | No direct optimization possible. The duration scales with the number and size of incoming documents.|
35+
| `computing document changes`, `extracting documents` | Meilisearch compares incoming documents to existing ones. | No direct optimization possible. Process duration scales with the number and size of incoming documents.|
3636

3737
### Filterable attributes
3838

@@ -44,40 +44,40 @@ Your goal is to focus on the **longest-running steps** and understand which inde
4444

4545
| Trace key | Description | Optimization |
4646
|------------|--------------|--------------|
47-
| `extracting words`, `merging word caches` | Tokenizes text and builds the inverted index. | - Ensure the [**searchable attributes**](/reference/api/settings#searchable-attributes) list includes only the fields you want to be checked for query word matches. |
47+
| `extracting words`, `merging word caches` | Tokenizes text and builds the inverted index. | Ensure the [searchable attributes](/reference/api/settings#searchable-attributes) list only includes the fields you want to be checked for query word matches. |
4848

4949
### Proximity precision
5050

5151
| Trace key | Description | Optimization |
5252
|------------|--------------|--------------|
53-
| `extracting word proximity`, `merging word proximity` | Builds the data structures for phrase and attribute ranking. | Lower the precision of this operation by setting [proximity precision](/reference/api/settings#proximity-precision) to `byAttribute` instead of the default `byWord`|
53+
| `extracting word proximity`, `merging word proximity` | Builds data structures for phrase and attribute ranking. | Lower the precision of this operation by setting [proximity precision](/reference/api/settings#proximity-precision) to `byAttribute` |
5454

5555
### Disk I/O and hardware bottlenecks
5656

5757
| Trace key | Description | Optimization |
5858
|------------|--------------|--------------|
59-
| `waiting for database writes` | Time spent writing data to disk. | No direct optimization possible. Either the disk is slow, either the quantity of data to write is big. Avoid HDDs (Hard Disk Drives). |
59+
| `waiting for database writes` | Time spent writing data to disk. | No direct optimization possible. Either the disk is too slow or you are writing too much data in a single operation. Avoid HDDs (Hard Disk Drives). |
6060
| `waiting for extractors` | Time spent waiting for CPU-bound extraction. | No direct optimization possible. Indicates a CPU bottleneck. Use more cores or scale horizontally with [sharding](/learn/advanced/sharding). |
6161

6262
### Facets and filterable attributes
6363

6464
| Trace key | Description | Optimization |
6565
|------------|--------------|--------------|
66-
| `post processing facets > strings bulk` / `numbers bulk` | Processes equality or comparison filters. | - Disable unused [**filter features**](/reference/api/settings#features), such as comparison operators on string values. <br/>- Keep [**sortable attributes**](reference/api/settings#sortable-attributes) to the minimum required. |
66+
| `post processing facets > strings bulk` / `numbers bulk` | Processes equality or comparison filters. | - Disable unused [**filter features**](/reference/api/settings#features), such as comparison operators on string values. <br /> - Reduce the number of [**sortable attributes**](/reference/api/settings#sortable-attributes). |
6767
| `post processing facets > facet search` | Builds structures for the [facet search API](/reference/api/facet_search). | If you don’t use the facet search API, [disable it](/reference/api/settings#update-facet-search-settings).|
6868

6969
### Embeddings
7070

7171
| Trace key | Description | Optimization |
7272
|------------|--------------|--------------|
73-
| `writing embeddings to database` | Time spent saving vector embeddings. | - Use smaller embedding vectors when possible. <br/>- You can avoid recomputing embeddings on document update by [disabling embedding regeneration](/reference/api/documents#vectors). <br/>- Consider enabling [binary quantization](/reference/api/settings#binaryquantized) for your embedders. |
73+
| `writing embeddings to database` | Time spent saving vector embeddings. | - Use embedding vectors with fewer dimensions. <br/>- [Disable embedding regeneration on document update](/reference/api/documents#vectors). <br/>- Consider enabling [binary quantization](/reference/api/settings#binaryquantized). |
7474

7575
### Word prefixes and post-processing
7676

7777
| Trace key | Description | Optimization |
7878
|------------|--------------|--------------|
79-
| `post processing words > word prefix *` | Builds prefix data for autocomplete. Allows to match documents that begin with a specific query term, instead of only exact matches.| Disable [**prefix search**](/reference/api/settings#prefix-search) (`prefixSearch: disabled`) if not required. <br/> Note that this can severely impact search result relevancy. |
80-
| `post processing words > word fst` | Builds the word FST (finite state transducer). | No direct action possible, as it depends on the number of different words in the database. Fewer searchable words can improve speed. |
79+
| `post processing words > word prefix *` | Builds prefix data for autocomplete. Allows matching documents that begin with a specific query term, instead of only exact matches.| Disable [**prefix search**](/reference/api/settings#prefix-search) (`prefixSearch: disabled`). *This can severely impact search result relevancy.* |
80+
| `post processing words > word fst` | Builds the word FST (finite state transducer). | No direct action possible, as FST size reflects the number of different words in the database. Using documents with fewer searchable words may improve operation speed. |
8181

8282
## Example analysis
8383

@@ -87,7 +87,7 @@ If you see:
8787
"processing tasks > indexing > post processing facets > facet search": "1763.06s"
8888
```
8989

90-
The [facet search feature](/learn/filtering_and_sorting/search_with_facet_filters#searching-facet-values) is consuming significant time. If your application doesn’t use it, disable it:
90+
[Facet searching](/learn/filtering_and_sorting/search_with_facet_filters#searching-facet-values) is taking significant indexing time. If your application doesn’t use facet search, disable the feature:
9191

9292
```
9393
client.index('INDEX_NAME').updateFacetSearch(false);
