[Feature Request]: Performance optimization for full text search if there are many terms #1320

Closed
1 task done
yingfeng opened this issue Jun 13, 2024 · 1 comment
Labels
feature request New feature or request

Comments

@yingfeng
Member

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

No response

Describe the feature you'd like

Neither BMW (Block-Max WAND) nor BMM (Block-Max MaxScore) can skip efficiently when the query contains too many clauses. These scenarios need dedicated optimizations.

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

@yingfeng yingfeng added the feature request New feature or request label Jun 13, 2024
@yingfeng
Member Author

Fixed by #1342

@yangzq50 yangzq50 mentioned this issue Nov 28, 2024
2 tasks
JinHai-CN pushed a commit that referenced this issue Nov 28, 2024
### What problem does this PR solve?

Add BatchOrIterator for fulltext search, which calculates BM25 scores in
batch.

Issue link:#1320

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
@yingfeng yingfeng mentioned this issue Nov 28, 2024
79 tasks
JinHai-CN pushed a commit that referenced this issue Dec 4, 2024
### What problem does this PR solve?

Update fulltext query builder
Apply BatchOrIterator when necessary

Issue link:#1320

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
JinHai-CN pushed a commit that referenced this issue Dec 6, 2024
### What problem does this PR solve?

Update file io for fulltext search
Disable log in phrase iterator in release build

Issue link:#1320

### Type of change

- [x] Refactoring
- [x] Performance Improvement
vsian pushed a commit to vsian/infinity that referenced this issue Dec 13, 2024
### What problem does this PR solve?

Update fulltext query builder
Apply BatchOrIterator when necessary

Issue link:infiniflow#1320

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
vsian pushed a commit to vsian/infinity that referenced this issue Dec 13, 2024
### What problem does this PR solve?

Update file io for fulltext search
Disable log in phrase iterator in release build

Issue link:infiniflow#1320

### Type of change

- [x] Refactoring
- [x] Performance Improvement
JinHai-CN pushed a commit that referenced this issue Dec 13, 2024
### What problem does this PR solve?

Now BlockMaxWandIterator can accept both TermDocIterator and
PhraseDocIterator as children

Issue link:#1320

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
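
To illustrate what accepting both child types can look like, the sketch below has a WAND-style pivot selection depend only on an abstract child interface, so term and phrase iterators are interchangeable. The class names are borrowed from the commit message, but every member (`DocID`, `ScoreUpperBound`, the constructors) and the `FindPivotDoc` helper are assumptions made for illustration, not Infinity's actual interfaces.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <memory>
#include <vector>

constexpr uint32_t kInvalidDoc = std::numeric_limits<uint32_t>::max();

// Hypothetical common interface: all the parent needs from a child.
class DocIterator {
public:
    virtual ~DocIterator() = default;
    virtual uint32_t DocID() const = 0;        // current document, or kInvalidDoc
    virtual float ScoreUpperBound() const = 0; // upper bound on this child's score
};

// Stub term child: a real one would read a posting list.
class TermDocIterator final : public DocIterator {
public:
    TermDocIterator(uint32_t doc, float bound) : doc_(doc), bound_(bound) {}
    uint32_t DocID() const override { return doc_; }
    float ScoreUpperBound() const override { return bound_; }

private:
    uint32_t doc_;
    float bound_;
};

// Stub phrase child: a real one would also verify term positions.
class PhraseDocIterator final : public DocIterator {
public:
    PhraseDocIterator(uint32_t doc, float bound) : doc_(doc), bound_(bound) {}
    uint32_t DocID() const override { return doc_; }
    float ScoreUpperBound() const override { return bound_; }

private:
    uint32_t doc_;
    float bound_;
};

// WAND-style pivot selection: sort children by current doc id, then return the
// first doc id at which the accumulated upper bounds exceed the threshold.
uint32_t FindPivotDoc(std::vector<std::unique_ptr<DocIterator>> &children, float threshold) {
    std::sort(children.begin(), children.end(),
              [](const std::unique_ptr<DocIterator> &a, const std::unique_ptr<DocIterator> &b) {
                  return a->DocID() < b->DocID();
              });
    float bound_sum = 0.0f;
    for (const auto &child : children) {
        if (child->DocID() == kInvalidDoc) {
            break; // exhausted children sort last
        }
        bound_sum += child->ScoreUpperBound();
        if (bound_sum > threshold) {
            return child->DocID();
        }
    }
    return kInvalidDoc; // no document can beat the threshold
}
```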
JinHai-CN pushed a commit that referenced this issue Dec 24, 2024
### What problem does this PR solve?

Update IDF formula for PhraseDocIterator
reference:
https://lucene.apache.org/core/10_1_0/core/org/apache/lucene/search/similarities/BM25Similarity.html#idfExplain(org.apache.lucene.search.CollectionStatistics,org.apache.lucene.search.TermStatistics[])

Issue link:#1320

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Test cases
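
For context on the Lucene reference above: Lucene's `BM25Similarity` documents the IDF of a single term as `log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))`, and its phrase variant of `idfExplain` sums the IDF of every term in the phrase. A minimal sketch of that calculation follows; the function names `TermIdf` and `PhraseIdf` are hypothetical, and whether the commit applies exactly this formula is an assumption based only on the linked reference.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// IDF of one term, following the formula documented for Lucene's BM25Similarity.
inline double TermIdf(uint64_t doc_freq, uint64_t doc_count) {
    const double df = static_cast<double>(doc_freq);
    const double n = static_cast<double>(doc_count);
    return std::log(1.0 + (n - df + 0.5) / (df + 0.5));
}

// IDF of a phrase: the sum of the IDFs of its terms.
inline double PhraseIdf(const std::vector<uint64_t> &term_doc_freqs, uint64_t doc_count) {
    double idf = 0.0;
    for (uint64_t doc_freq : term_doc_freqs) {
        idf += TermIdf(doc_freq, doc_count);
    }
    return idf;
}
```

Because the single-term formula is always positive, a phrase built from more terms, or from rarer terms, receives a larger IDF under this scheme.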