Commit a35e814

fix(specs): clarify decompounding limitations (#3227)
1 parent 3f622e5 commit a35e814

1 file changed: +5 -1 lines changed

specs/common/schemas/IndexSettings.yml

@@ -144,6 +144,8 @@ baseIndexSettings:
 You can specify different lists for different languages.
 Decompounding is supported for these languages:
 Dutch (`nl`), German (`de`), Finnish (`fi`), Danish (`da`), Swedish (`sv`), and Norwegian (`no`).
+Decompounding doesn't work for words with [non-spacing mark Unicode characters](https://www.charactercodes.net/category/non-spacing_mark).
+For example, `Gartenstühle` won't be decompounded if the `ü` consists of `u` (U+0075) and `◌̈` (U+0308).
 default: {}
 x-categories:
 - Languages
@@ -527,10 +529,12 @@ indexSettingsAsSearchParams:
 decompoundQuery:
 type: boolean
 description: |
-Whether to split compound words into their building blocks.
+Whether to split compound words in the query into their building blocks.
 
 For more information, see [Word segmentation](https://www.algolia.com/doc/guides/managing-results/optimize-search-results/handling-natural-languages-nlp/in-depth/language-specific-configurations/#splitting-compound-words).
 Word segmentation is supported for these languages: German, Dutch, Finnish, Swedish, and Norwegian.
+Decompounding doesn't work for words with [non-spacing mark Unicode characters](https://www.charactercodes.net/category/non-spacing_mark).
+For example, `Gartenstühle` won't be decompounded if the `ü` consists of `u` (U+0075) and `◌̈` (U+0308).
 default: true
 x-categories:
 - Languages
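
The limitation documented in this change can be checked on the client side. The following is a minimal sketch, not part of the commit: it uses Python's standard `unicodedata` module to detect non-spacing marks (Unicode category `Mn`) and to compose `u` + U+0308 into the single code point `ü` (U+00FC) with NFC normalization. The helper names `has_nonspacing_marks` and `compose` are hypothetical. Whether composed text is then decompounded as expected is an assumption based on the example in the description; the spec itself only states that the decomposed form isn't handled.

```python
import unicodedata


def has_nonspacing_marks(text: str) -> bool:
    """Return True if any character is a non-spacing mark (category Mn)."""
    return any(unicodedata.category(ch) == "Mn" for ch in text)


def compose(text: str) -> str:
    """Apply NFC so base letter + combining mark pairs become one code point
    wherever a precomposed form exists."""
    return unicodedata.normalize("NFC", text)


# "Gartenstühle" with the ü written as u (U+0075) followed by U+0308.
decomposed = "Gartenstu\u0308hle"
# After NFC, the ü is the single code point U+00FC.
composed = compose(decomposed)

print(has_nonspacing_marks(decomposed))  # True
print(has_nonspacing_marks(composed))    # False
print(len(decomposed), len(composed))    # 13 12
```

NFC cannot compose every sequence: a base letter and mark with no precomposed code point keep their combining mark, so running `has_nonspacing_marks` after normalizing is still useful for flagging words that may not be decompounded.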
