[0.6] [MOD-10559] Decouple the shrinking and growing logic of large containers in Flat and HNSW #783
Purpose
This PR backports the resize logic improvements from PR #753 to the 0.6 branch, adapting the implementation to account for 0.6's specific capacity management approach.
The original change decouples shrinking and growing operations in vector index algorithms to prevent oscillating allocation/deallocation cycles during index updates at block boundaries, which was particularly problematic for large containers like hash tables and metadata vectors.
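The oscillation can be illustrated with a minimal sketch. It is not the library's code: `BlockContainer` and `resizesAtBoundary` are hypothetical names invented for this example, and the two policies are simplified readings of the conditions in this description (shrinking at a block boundary on removal versus shrinking only once two full blocks are free). The sketch counts how many reallocations each policy performs when elements are added and removed right at a block boundary.

```cpp
#include <cstddef>

// Hypothetical stand-in for a block-allocated index container.
// Only sizes are tracked; no real memory is allocated.
struct BlockContainer {
    std::size_t blockSize;
    bool decoupled;            // true: new policy, false: old coupled policy
    std::size_t size = 0;      // number of stored elements
    std::size_t capacity = 0;  // allocated slots, always a multiple of blockSize
    std::size_t resizes = 0;   // reallocations performed so far

    BlockContainer(std::size_t bs, bool dec) : blockSize(bs), decoupled(dec) {}

    void add() {
        // Grow by one block only when the container is full.
        if (size >= capacity) {
            capacity += blockSize;
            ++resizes;
        }
        ++size;
    }

    void remove() {
        if (size == 0) return;
        --size;
        if (decoupled) {
            // Shrink only when at least two full blocks are free.
            if (capacity >= size + 2 * blockSize) {
                capacity -= blockSize;
                ++resizes;
            }
        } else {
            // Coupled policy: shrink as soon as the count hits a block boundary.
            if (size % blockSize == 0 && capacity > size) {
                capacity -= blockSize;
                ++resizes;
            }
        }
    }
};

// Fill the container exactly to a block boundary, then alternate add/remove
// `cycles` times and report how many reallocations those cycles caused.
std::size_t resizesAtBoundary(bool decoupled, std::size_t blockSize, std::size_t cycles) {
    BlockContainer c(blockSize, decoupled);
    for (std::size_t i = 0; i < blockSize; ++i) c.add();
    const std::size_t before = c.resizes;
    for (std::size_t i = 0; i < cycles; ++i) {
        c.add();
        c.remove();
    }
    return c.resizes - before;
}
```

With a block size of 4 and 10 add/remove cycles, the coupled policy in this sketch reallocates on every operation (20 times), while the decoupled policy grows once and then keeps the spare block.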
Behavior Changes in 0.6 Branch
Before this PR (0.6 branch):
count % blockSize == 0 during vector removal
After this PR (0.6 branch):
(id >= capacity in BruteForce, cur_element_count >= max_elements_ in HNSW)
(indexCapacity() >= (indexSize() + 2 * blockSize))
count == 0
Key Differences from 0.8 Branch Implementation
The 0.6 branch implementation differs from the 0.8 backport (PR #777) due to different initial capacity handling and HNSW architecture:
Initial Capacity Rounding:
HNSW Architecture Differences:
Implementation Details
The key changes ensure that: