
Can the tokenizer be configured not to tokenize numbers? #150

Open
Jun90925 opened this issue May 20, 2024 · 1 comment

Comments

@Jun90925

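I need to store a business record's id in the FTS table. If that id column is declared UNINDEXED, then deleting a business record together with its index entry means looking the entry up by an unindexed id, which is slow and could block writes to the database.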

Deletion is a scenario that cannot be ignored; in practice it happens fairly often.

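If the business id is purely numeric (tentatively the business record's rowid), I could drop UNINDEXED when creating the table so that the id is indexed as well. The lookup would then be fast, although certainly not as fast as an index on a regular table.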

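I looked through the tokenizer's parameters and there does not seem to be an option to leave numbers untokenized. Could you consider adding one? Or is such an option even necessary?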
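
For reference, the indexed-id approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the table and column names (`docs`, `biz_id`, `body`) are hypothetical, and SQLite's built-in `unicode61` tokenizer stands in for this project's tokenizer, whose handling of digits is not verified here. The sketch finds rows with an FTS5 column filter on the id and then deletes them by rowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: biz_id is an ordinary (indexed) FTS5 column.
# unicode61 is SQLite's built-in tokenizer, used here as a stand-in.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(biz_id, body, tokenize='unicode61')"
)
conn.executemany(
    "INSERT INTO docs (biz_id, body) VALUES (?, ?)",
    [("41", "first note"), ("42", "second note 42"), ("420", "third note")],
)


def delete_by_biz_id(conn, biz_id):
    """Delete FTS rows whose biz_id column contains the given numeric token."""
    # Column filter: only tokens in the biz_id column are considered.
    query = 'biz_id : "%d"' % int(biz_id)
    rowids = [
        row[0]
        for row in conn.execute(
            "SELECT rowid FROM docs WHERE docs MATCH ?", (query,)
        )
    ]
    conn.executemany(
        "DELETE FROM docs WHERE rowid = ?", [(rowid,) for rowid in rowids]
    )
    return len(rowids)


deleted = delete_by_biz_id(conn, 42)
remaining = [row[0] for row in conn.execute("SELECT biz_id FROM docs ORDER BY rowid")]
```

With `unicode61`, a run of digits is kept as a single token, so the filter for `42` does not match the row whose id is `420`. Whether another tokenizer behaves the same way would need to be checked separately.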

@wangfenjin
Owner

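The problem you describe has nothing to do with the tokenizer. Whether a given column is indexed is decided when the table is created; the tokenizer only splits the text it is handed into tokens.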

https://www.wangfenjin.com/posts/simple-jieba-tokenizer/
Take a look at this article; it explains how to organize the data structures.
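
One possible way to organize the data, sketched here as an assumption and not taken from the linked article, is to reuse the business table's rowid as the FTS5 table's rowid. The FTS table then needs no separate id column at all, and deletion becomes a direct rowid lookup. The names `notes` and `notes_fts` are hypothetical, and the built-in `unicode61` tokenizer is used in place of this project's tokenizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical business table plus an FTS5 table that shares its rowids.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE VIRTUAL TABLE notes_fts USING fts5(body, tokenize='unicode61')"
)


def add_note(conn, title, body):
    """Insert a business row and index its text under the same rowid."""
    cur = conn.execute("INSERT INTO notes (title) VALUES (?)", (title,))
    note_id = cur.lastrowid
    conn.execute(
        "INSERT INTO notes_fts (rowid, body) VALUES (?, ?)", (note_id, body)
    )
    return note_id


def delete_note(conn, note_id):
    """Delete the business row and its FTS entry by rowid."""
    conn.execute("DELETE FROM notes WHERE id = ?", (note_id,))
    conn.execute("DELETE FROM notes_fts WHERE rowid = ?", (note_id,))


first = add_note(conn, "a", "hello world")
second = add_note(conn, "b", "hello sqlite")
delete_note(conn, first)

hits = [
    row[0]
    for row in conn.execute(
        "SELECT rowid FROM notes_fts WHERE notes_fts MATCH ?", ("hello",)
    )
]
```

In this layout the tokenizer never sees the id, so how it treats digits does not matter for deletion.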
