feat: add configurable embedding batch size for code indexing #7464
Conversation
Reviewing my own code is like debugging in a mirror - everything looks backwards and I still missed the obvious bugs.
src/package.json (Outdated)
Is this intentional? The configuration key declared here appears to differ from the key being read in service-factory.ts. If the two don't match, the setting will never be read correctly.
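A minimal sketch of how the setting would typically be read in service-factory.ts, assuming the section/key split shown here; whatever string is used must match the key contributed in package.json, otherwise `get()` silently returns the fallback:

```typescript
import * as vscode from "vscode"

// Assumed shared default; the PR later settles on BATCH_SEGMENT_THRESHOLD = 60.
const BATCH_SEGMENT_THRESHOLD = 60

// Hypothetical helper: the section/key names here are assumptions and must
// mirror the contribution declared in package.json.
function getEmbeddingBatchSize(): number {
	return vscode.workspace
		.getConfiguration("roo-code.codeIndex")
		.get<number>("embeddingBatchSize", BATCH_SEGMENT_THRESHOLD)
}
```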
The parameter is accepted here but never forwarded to the OpenAICompatibleEmbedder, so the configured value has no effect. Shouldn't it be passed through when the embedder is constructed?
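A sketch of what the reviewer is asking for, with an assumed constructor shape since the real OpenAICompatibleEmbedder signature isn't shown in this thread:

```typescript
// Assumed constructor shape, for illustration only.
class OpenAICompatibleEmbedder {
	constructor(
		private readonly baseUrl: string,
		private readonly apiKey: string,
		private readonly modelId: string,
		private readonly batchSize?: number,
	) {}
}

// Hypothetical factory: the fix is to forward batchSize instead of dropping it.
function createEmbedder(baseUrl: string, apiKey: string, modelId: string, batchSize?: number) {
	// Before: new OpenAICompatibleEmbedder(baseUrl, apiKey, modelId), which silently ignored batchSize.
	return new OpenAICompatibleEmbedder(baseUrl, apiKey, modelId, batchSize)
}
```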
Same issue here: the batch size parameter should be passed through to the embedder as well.
And here as well: the parameter needs to be passed through.
Consider adding a test case that verifies the batch size parameter is actually used, not just that a value is passed in. This would help catch the issue where the parameter isn't being propagated.
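A minimal sketch of the kind of assertion meant here, assuming a Vitest-style suite. The real test would drive DirectoryScanner with a mocked embedder; the essential check is that the batches the embedder receives actually respect the configured size:

```typescript
import { describe, expect, it, vi } from "vitest"

// Hypothetical stand-in for the scanner's batching step.
function chunk<T>(items: T[], batchSize: number): T[][] {
	const batches: T[][] = []
	for (let i = 0; i < items.length; i += batchSize) {
		batches.push(items.slice(i, i + batchSize))
	}
	return batches
}

describe("embedding batch size", () => {
	it("never sends batches larger than the configured size", async () => {
		const embed = vi.fn().mockResolvedValue(undefined)
		const items = Array.from({ length: 25 }, (_, i) => `block-${i}`)

		for (const batch of chunk(items, 10)) {
			await embed(batch)
		}

		// With 25 items and a batch size of 10, we expect calls of size 10, 10 and 5.
		const sizes = embed.mock.calls.map(([batch]) => batch.length)
		expect(sizes).toEqual([10, 10, 5])
	})
})
```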
Consider adding a comment here explaining how the configured batch size interacts with the token limit. Future maintainers would benefit from understanding that the actual batch size is limited by both the configured size and the token limit.
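The relationship worth documenting looks roughly like the sketch below (names are illustrative; BATCH_SEGMENT_THRESHOLD comes from this PR, while the token limit constant is an assumption): a batch closes when either the configured item count or the token budget is reached, so the effective batch size is the smaller of the two constraints.

```typescript
interface Segment {
	text: string
	tokens: number
}

// Illustrative batching loop: embeddingBatchSize caps the item count per batch,
// while maxBatchTokens (hypothetical name) caps the summed token count.
function buildBatches(segments: Segment[], embeddingBatchSize: number, maxBatchTokens: number): Segment[][] {
	const batches: Segment[][] = []
	let current: Segment[] = []
	let currentTokens = 0

	for (const segment of segments) {
		const overItemLimit = current.length >= embeddingBatchSize
		const overTokenLimit = currentTokens + segment.tokens > maxBatchTokens
		if (current.length > 0 && (overItemLimit || overTokenLimit)) {
			batches.push(current)
			current = []
			currentTokens = 0
		}
		current.push(segment)
		currentTokens += segment.tokens
	}

	if (current.length > 0) {
		batches.push(current)
	}
	return batches
}
```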
- Added new VSCode setting roo-cline.codeIndex.embeddingBatchSize
- Default value is 400, configurable range 1-2048
- Updated DirectoryScanner and FileWatcher to use configurable batch size
- Updated service factory to pass batch size to processors
- Maintains backward compatibility with default value
- Fixes #7356 - allows users to configure batch size based on API provider limits
Force-pushed from 38d9b5f to da98690
really need this
- Changed default from 400 to 60 to match BATCH_SEGMENT_THRESHOLD constant
- Added proper translation key for setting description
- Added translations for all 17 supported languages
- Changed batchSize type from 'number | undefined' to 'number'
- Always provide BATCH_SEGMENT_THRESHOLD as fallback instead of undefined
- Ensures consistent behavior in test environments
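A minimal sketch of the fallback described in this commit, assuming the factory receives an optional value from configuration:

```typescript
// Shared default referenced throughout the PR.
const BATCH_SEGMENT_THRESHOLD = 60

// Always resolve to a concrete number so scanners, watchers and tests behave
// the same whether or not a VSCode setting is available.
function resolveBatchSize(configured?: number): number {
	return configured ?? BATCH_SEGMENT_THRESHOLD
}
```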
daniel-lxs
left a comment
LGTM
This PR addresses Issue #7356 by adding a configurable embedding batch size setting for the code indexing feature.
Problem
Users with API providers that have stricter batch limits (e.g., batch size limit of 10) were unable to use the code indexing feature because the system was trying to process batches of 400 items by default.
Solution
Added a new VSCode extension setting `roo-code.codeIndex.embeddingBatchSize` that allows users to configure the embedding batch size based on their API provider's limits.

Changes

- Added `roo-code.codeIndex.embeddingBatchSize` with a default of 60 and a configurable range of 1-200.

Testing
How to Use
Users can now configure the batch size in their VSCode settings:
{ "roo-code.codeIndex.embeddingBatchSize": 10 }This allows users with API providers that have stricter batch limits to successfully use the code indexing feature.
Fixes #7356
Important
Adds a configurable embedding batch size setting for code indexing, updating `FileWatcher`, `DirectoryScanner`, and the service factory to use this setting, with localization and test coverage.

- Added `roo-code.codeIndex.embeddingBatchSize` setting in `package.json` with default 60, range 1-200.
- Updated `FileWatcher` and `DirectoryScanner` to use the configurable batch size from VSCode settings.
- Updated `createDirectoryScanner()` and `createFileWatcher()` to retrieve the batch size from settings.
- Added the `embeddingBatchSize` description in multiple `package.nls.*.json` files.