@@ -92,12 +92,6 @@ The following configuration options are supported:
9292 part of this list by automatically falling back to the stemmer yielding the
9393 best result.
9494
95- !!! tip "Chinese search support – 中文搜索支持"
96-
97- Material for MkDocs recently added __experimental language support for
98- Chinese__ as part of [Insiders]. [Read the blog article][chinese search]
99- to learn how to set up search for Chinese in a matter of minutes.
100-
10195 `separator`{ #search-separator }
10296
10397 :   :octicons-milestone-24: Default: _automatically set_ – The separator for
@@ -112,10 +106,9 @@ The following configuration options are supported:
112106 ` ` `
113107
114108 1. Tokenization itself is carried out by [lunr's default tokenizer], which
115- doesn't allow for lookahead or separators spanning multiple characters.
116-
117- For more finegrained control over the tokenization process, see the
118- section on [tokenizer lookahead].
109+ doesn't allow for lookahead or multi-character separators. For more
110+ finegrained control over the tokenization process, see the section on
111+ [tokenizer lookahead].
119112
120113<div class="mdx-deprecated" markdown>
121114
@@ -142,28 +135,82 @@ The following configuration options are supported:
142135
143136</div>
144137
145- The other configuration options of this plugin are not officially supported
146- by Material for MkDocs, which is why they may yield unexpected results. Use
147- them at your own risk.
148-
149138 [search support]: https://github.com/squidfunk/mkdocs-material/releases/tag/0.1.0
150139 [lunr]: https://lunrjs.com
151140 [lunr-languages]: https://github.com/MihaiValentin/lunr-languages
152- [chinese search]: ../blog/2022/chinese-search-support.md
153141 [lunr's default tokenizer]: https://github.com/olivernn/lunr.js/blob/aa5a878f62a6bba1e8e5b95714899e17e8150b38/lunr.js#L413-L456
154142 [site language]: changing-the-language.md#site-language
155143 [tokenizer lookahead]: #tokenizer-lookahead
156144 [prebuilt index support]: https://github.com/squidfunk/mkdocs-material/releases/tag/5.0.0
157145 [prebuilt index]: https://www.mkdocs.org/user-guide/configuration/#prebuild_index
158146 [50% smaller]: ../blog/2021/search-better-faster-smaller.md#benchmarks
159147
148+ #### Chinese language support
149+
150+ [:octicons-heart-fill-24:{ .mdx-heart } Sponsors only][Insiders]{ .mdx-insiders } ·
151+ [:octicons-tag-24: insiders-4.14.0][Insiders] ·
152+ :octicons-beaker-24: Experimental
153+
154+ [Insiders] adds search support for the Chinese language (see our [blog article]
155+ [chinese search] from May 2022) by integrating with the text segmentation
156+ library [jieba], which can be installed with `pip`.
157+
158+ ``` sh
159+ pip install jieba
160+ ```
161+
162+ If [jieba] is installed, the [built-in search plugin] automatically detects
163+ Chinese characters and runs them through the segmenter. The following
164+ configuration options are available:
165+
166+ `jieba_dict`{ #jieba-dict }
167+
168+ :   [:octicons-tag-24: insiders-4.17.2][Insiders] · :octicons-milestone-24:
169+     Default: _none_ – This option allows for specifying a [custom dictionary]
170+     to be used by [jieba] for segmenting text, replacing the default dictionary:
171+
172+     ``` yaml
173+     plugins:
174+       - search:
175+           jieba_dict: dict.txt # (1)!
176+     ```
177+
178+     1.  The following alternative dictionaries are provided by [jieba]:
179+
180+         - [dict.txt.small] – 占用内存较小的词典文件
181+         - [dict.txt.big] – 支持繁体分词更好的词典文件
182+
183+ `jieba_dict_user`{ #jieba-dict-user }
184+
185+ :   [:octicons-tag-24: insiders-4.17.2][Insiders] · :octicons-milestone-24:
186+     Default: _none_ – This option allows for specifying an additional
187+     [user dictionary] to be used by [jieba] for segmenting text, augmenting the
188+     default dictionary:
189+
190+     ``` yaml
191+     plugins:
192+       - search:
193+           jieba_dict_user: user_dict.txt
194+     ```
195+
196+     User dictionaries can be used for tuning the segmenter to preserve
197+     technical terms.
198+
199+ [chinese search]: ../blog/2022/chinese-search-support.md
200+ [jieba]: https://pypi.org/project/jieba/
201+ [built-in search plugin]: #built-in-search-plugin
202+ [custom dictionary]: https://github.com/fxsjy/jieba#%E5%85%B6%E4%BB%96%E8%AF%8D%E5%85%B8
203+ [dict.txt.small]: https://github.com/fxsjy/jieba/raw/master/extra_dict/dict.txt.small
204+ [dict.txt.big]: https://github.com/fxsjy/jieba/raw/master/extra_dict/dict.txt.big
205+ [user dictionary]: https://github.com/fxsjy/jieba#%E8%BD%BD%E5%85%A5%E8%AF%8D%E5%85%B8
206+
160207 ### Rich search previews
161208
162209 [:octicons-heart-fill-24:{ .mdx-heart } Sponsors only][Insiders]{ .mdx-insiders } ·
163210 [:octicons-tag-24: insiders-3.0.0][Insiders] ·
164211 :octicons-beaker-24: Experimental
165212
166- Insiders ships rich search previews as part of the [new search plugin], which
213+ [Insiders] ships rich search previews as part of the [new search plugin], which
167214 will render code blocks directly in the search result, and highlight all
168215 occurrences inside those blocks:
169216
@@ -186,7 +233,7 @@ occurrences inside those blocks:
186233 [:octicons-tag-24: insiders-3.0.0][Insiders] ·
187234 :octicons-beaker-24: Experimental
188235
189- Insiders allows for more complex configurations of the [`separator`][separator]
236+ [Insiders] allows for more complex configurations of the [`separator`][separator]
190237 setting as part of the [new search plugin], yielding more influence on the way
191238 documents are tokenized:
192239
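The `jieba_dict_user` option added in this diff is described as a way to tune the segmenter so that technical terms are preserved. The sketch below illustrates that general idea with a greedy forward maximum matching segmenter in plain Python. It is not jieba's algorithm and does not call jieba's API; `segment`, `base_dict`, and `user_dict` are hypothetical names chosen only for illustration.

```python
def segment(text, dictionary):
    """Split text by greedy forward maximum matching.

    At each position the longest dictionary word is taken; if no word
    matches, a single character is emitted. This is a simplified stand-in
    for a real segmenter, used here only to show the effect of adding
    entries to the dictionary.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical default dictionary with only general-purpose words.
base_dict = {"中文", "搜索", "支持"}

# Hypothetical user dictionary entry added on top of the defaults,
# so that the compound term is kept as one token.
user_dict = base_dict | {"中文搜索"}

default_tokens = segment("中文搜索支持", base_dict)
tuned_tokens = segment("中文搜索支持", user_dict)
```

With only the base entries, the compound term is split into its parts; once the extra entry is present, it survives as a single token, which is the kind of effect a user dictionary is meant to have on the search index.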