feat(search): Add task queue for Meilisearch to prevent race conditions #1423
Conversation
Pull Request Overview
Introduce a Meilisearch task queue to serialize and batch index updates, preventing duplicate async tasks and reducing race conditions; plus minor fixes and batching improvements.
- Add TaskQueueManager with 30s consumption window, diffing against current index state before submitting Meilisearch tasks
- Route Update() to enqueue for Meilisearch, and batch index add/delete operations
- Fix buildSearchDocumentFromResults size assignment from JSON results
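As a rough orientation only (this is not the PR's actual code; `updateTask`, `Enqueue`, and the `flush` callback are illustrative names), the dedup-and-flush pattern described above boils down to something like:

```go
// Package taskqueue is a stand-alone sketch; the PR's real implementation
// lives in internal/search/meilisearch/task_queue.go.
package taskqueue

import (
	"sync"
	"time"
)

// updateTask is a queued snapshot of one folder's contents.
type updateTask struct {
	parent     string
	objs       []string // object names under parent at enqueue time
	enqueuedAt time.Time
}

// TaskQueueManager collapses repeated updates for the same parent path and
// flushes the surviving snapshots on a fixed interval.
type TaskQueueManager struct {
	mu    sync.Mutex
	tasks map[string]updateTask // keyed by parent path, newest snapshot wins
	stop  chan struct{}
}

func NewTaskQueueManager() *TaskQueueManager {
	return &TaskQueueManager{
		tasks: make(map[string]updateTask),
		stop:  make(chan struct{}),
	}
}

// Enqueue keeps only the most recent snapshot for a given parent, so many
// /api/fs/list calls against the same folder collapse into one pending task.
func (m *TaskQueueManager) Enqueue(t updateTask) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if old, ok := m.tasks[t.parent]; !ok || t.enqueuedAt.After(old.enqueuedAt) {
		m.tasks[t.parent] = t
	}
}

// Start drains the queue every 30 seconds. The flush callback is where the
// snapshots would be diffed against the current index state and submitted
// to Meilisearch as batched add/delete tasks.
func (m *TaskQueueManager) Start(flush func([]updateTask)) {
	go func() {
		ticker := time.NewTicker(30 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				m.mu.Lock()
				batch := make([]updateTask, 0, len(m.tasks))
				for _, t := range m.tasks {
					batch = append(batch, t)
				}
				m.tasks = make(map[string]updateTask)
				m.mu.Unlock()
				if len(batch) > 0 {
					flush(batch)
				}
			case <-m.stop:
				return
			}
		}
	}()
}

// Stop terminates the consumer goroutine.
func (m *TaskQueueManager) Stop() { close(m.stop) }
```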
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/search/meilisearch/task_queue.go | New task queue manager, consumption logic, diffing, and submission to Meilisearch with pending task tracking. |
| internal/search/build.go | Use queue for Meilisearch; batch-index additions; remove recursive BuildIndex for directories. |
| internal/search/meilisearch/init.go | Instantiate and start TaskQueueManager at init. |
| internal/search/meilisearch/search.go | Add taskQueue field, EnqueueUpdate, batch index/delete with task UIDs. |
| internal/search/meilisearch/utils.go | Fix document building: correct assignment and float64-to-int64 size conversion. |
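
The utils.go fix addresses the usual Go JSON pitfall: numbers decoded into `interface{}` come back as `float64`, so the size must be converted explicitly instead of being type-asserted to `int64`. A small stand-alone illustration (`hit` and the field names are made up for the example, not the PR's code):

```go
// Stand-alone illustration of the size-conversion pitfall fixed in utils.go:
// JSON numbers unmarshalled into interface{} are float64, so a direct int64
// type assertion fails and the size silently stays 0.
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	var hit map[string]interface{}
	_ = json.Unmarshal([]byte(`{"name":"a.txt","size":42}`), &hit)

	if _, ok := hit["size"].(int64); !ok {
		fmt.Println("int64 assertion fails: the decoded value is float64")
	}

	var size int64
	if v, ok := hit["size"].(float64); ok {
		size = int64(v) // explicit conversion, as in the fix
	}
	fmt.Println("size =", size) // size = 42
}
```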
Updated so that skipped tasks are re-added to the execution queue, provided the queue does not already contain the same task with a newer enqueue time at that moment. For the scenario where a user modifies a folder's contents, this syncs the index faster.

Please resolve all conversations; for any that do not need changes, explain why and click the resolve button directly.
- Implement TaskQueueManager for async index operations
- Queue update tasks and process them in batches every 30 seconds
- Check pending task status before executing new operations
- Optimize batch indexing and deletion logic
- Fix type assertion bug in buildSearchDocumentFromResults
When tasks are skipped due to pending dependencies, they are now re-enqueued if not already in queue. This prevents task loss while avoiding overwriting newer snapshots for the same parent.
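Building on the illustrative TaskQueueManager sketch earlier in this thread (same invented types, not the PR's actual code), the re-enqueue rule could look like:

```go
// requeueSkipped puts tasks that were skipped (e.g. because a Meilisearch
// task for their parent is still pending) back into the queue, unless a
// newer snapshot for that parent arrived in the meantime.
func (m *TaskQueueManager) requeueSkipped(skipped []updateTask) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for _, t := range skipped {
		if cur, ok := m.tasks[t.parent]; ok && cur.enqueuedAt.After(t.enqueuedAt) {
			continue // a fresher snapshot already supersedes this one
		}
		m.tasks[t.parent] = t
	}
}
```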
Force-pushed cd53fb9 to 6b6961e.

Force pushed due to rebasing onto main.
@PIKACHUIM @elysia-best LGTM
Description / 描述
New files:
- internal/search/meilisearch/task_queue.go - task queue manager implementation

Modified files:
- internal/search/build.go
- internal/search/meilisearch/init.go
- internal/search/meilisearch/search.go
- internal/search/meilisearch/utils.go

Motivation and Context / 背景
Submitting tasks to Meilisearch is asynchronous, so tasks queue up on the Meilisearch side. If the user visits a folder again while a previously queued task has not yet run, the index no longer matches the real Objs of the incoming folder, and the index task gets submitted again. In severe cases this floods the task queue.
The old logic updated the index one object at a time, which is inefficient. In addition, the automatic index update called BuildIndex whenever it encountered a folder, which causes the same problem for Meilisearch: index request tasks for the same path can be submitted repeatedly. This PR therefore treats both folders and files as toAddObjs that need updating and indexes them directly via BatchIndex.
Needs confirmation: once folders no longer trigger BuildIndex, the recursive update of subfolder contents is lost.
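For illustration only, assuming invented names (`searchNode`, `getIndexed`, `batchIndex`, and `batchDelete` are stand-ins, not the project's API), the diff-then-batch idea described above amounts to:

```go
// Stand-alone sketch of the diff-then-batch approach; the real code lives
// in build.go and search.go, and the callbacks are illustrative stand-ins.
package sketch

type searchNode struct {
	Parent string
	Name   string
	IsDir  bool
	Size   int64
}

// diffAndSubmit compares a queued folder snapshot with what the index
// currently holds and submits only the differences as batched add/delete
// operations, instead of recursing into subfolders via BuildIndex.
func diffAndSubmit(
	parent string,
	snapshot []searchNode,
	getIndexed func(parent string) map[string]searchNode,
	batchIndex func(objs []searchNode) error,
	batchDelete func(names []string) error,
) error {
	indexed := getIndexed(parent)

	var toAdd []searchNode
	seen := make(map[string]struct{}, len(snapshot))
	for _, o := range snapshot {
		seen[o.Name] = struct{}{}
		if _, ok := indexed[o.Name]; !ok {
			toAdd = append(toAdd, o) // files and folders alike
		}
	}

	var toDelete []string
	for name := range indexed {
		if _, ok := seen[name]; !ok {
			toDelete = append(toDelete, name) // gone from the real folder
		}
	}

	if len(toAdd) > 0 {
		if err := batchIndex(toAdd); err != nil {
			return err
		}
	}
	if len(toDelete) > 0 {
		return batchDelete(toDelete)
	}
	return nil
}
```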
Closes #1417
How Has This Been Tested? / 测试
TestBuild was triggered and the code compiles.
After deploying on a real machine, a Python script traversed the entire OpenList site and issued several thousand /api/fs/list requests within a very short time. Running the traversal script again afterwards, the Meilisearch task queue showed no explosive growth and kept shrinking even while /api/fs/list was still being hit continuously, which indicates that duplicate tasks are no longer being submitted and queued.
Checklist / 检查清单
- I have read the CONTRIBUTING document.
- I have formatted the submitted code with go fmt or prettier.
- I have added appropriate labels to this PR (if I lack permission or the needed label does not exist, it is noted in the description for an admin to handle later).
- Where appropriate, I have used "Request review" to ask the relevant code authors for a review.
- I have updated the related repositories accordingly (if applicable).
I could not find where to add labels in the PR submission UI; please add the enhancement label for me.