[0.6] [MOD-10559] Decouple the shrinking and growing logic of large containers in Flat and HNSW #783

meiravgri · 2025-09-18T14:56:29Z

Purpose

This PR backports the resize logic improvements from PR #753 to the 0.6 branch, adapting the implementation to account for 0.6's specific capacity management approach.
The original change decouples shrinking from growing in the vector index algorithms. This prevents repeated allocation/deallocation cycles when an index is updated back and forth across a block boundary, which was particularly problematic for large containers such as hash tables and metadata vectors.

Behavior Changes in 0.6 Branch

Before this PR (0.6 branch):

- Shrinking occurred whenever count % blockSize == 0 during vector removal
- Metadata containers were immediately resized down by exactly one block size
- No buffer zone existed between growing and shrinking operations
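
To make the oscillation concrete, the pre-PR policy described above can be modeled with a small sketch. `CoupledPolicy` and its members are hypothetical names used for illustration only, not the library's classes. The model assumes a block-aligned capacity and counts every capacity change as one reallocation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the pre-PR policy; not the library's actual code.
struct CoupledPolicy {
    std::size_t blockSize;
    std::size_t count = 0;    // number of stored vectors
    std::size_t capacity = 0; // slots allocated in the metadata containers
    std::size_t resizes = 0;  // number of capacity changes so far

    explicit CoupledPolicy(std::size_t bs) : blockSize(bs) { assert(bs > 0); }

    void add() {
        if (count >= capacity) { // no room for the next element: grow by one block
            capacity += blockSize;
            ++resizes;
        }
        ++count;
    }

    void remove() {
        if (count == 0) return; // nothing to remove
        --count;
        // Shrink as soon as the count lands on a block boundary.
        if (count % blockSize == 0 && capacity >= blockSize) {
            capacity -= blockSize;
            ++resizes;
        }
    }
};
```

In this model, with a block size of 4, filling the first block and then alternating `add()` and `remove()` at the boundary changes the capacity on every call, so ten such cycles cost twenty reallocations.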

After this PR (0.6 branch):

- Growing: Triggered when we need space for the next element (id >= capacity in BruteForce, cur_element_count >= max_elements_ in HNSW)
- Shrinking: Only when there are 2+ free blocks (indexCapacity() >= (indexSize() + 2 * blockSize))
- Buffer Zone: Maintains at least 1 block buffer to prevent oscillation
- Special handling: Always shrinks by exactly one block, with special condition for the last block when count == 0
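
The grow and shrink conditions listed above can be sketched as a small model. `DecoupledPolicy` and its members are hypothetical names, not the library's classes. The sketch assumes a block-aligned capacity. It leaves out the special case for the last block of an empty index, because the description does not spell out its exact form.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the post-PR policy; not the library's actual code.
struct DecoupledPolicy {
    std::size_t blockSize;
    std::size_t count = 0;    // number of stored vectors (indexSize)
    std::size_t capacity = 0; // allocated slots (indexCapacity)
    std::size_t resizes = 0;  // number of capacity changes so far

    explicit DecoupledPolicy(std::size_t bs) : blockSize(bs) { assert(bs > 0); }

    void add() {
        if (count >= capacity) { // grow only when the next element needs space
            capacity += blockSize;
            ++resizes;
        }
        ++count;
    }

    void remove() {
        if (count == 0) return; // nothing to remove
        --count;
        // Shrink by exactly one block, and only when 2+ blocks are free.
        if (capacity >= count + 2 * blockSize) {
            capacity -= blockSize;
            ++resizes;
        }
    }
};
```

With a block size of 4, filling the first block and then alternating `add()` and `remove()` at the boundary triggers a single grow to 8 slots and no further capacity changes in this model. The extra block is released only once the count drops far enough to leave two free blocks.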

Key Differences from 0.8 Branch Implementation

The 0.6 branch implementation differs from the 0.8 backport (PR #777) due to different initial capacity handling and HNSW architecture:

Initial Capacity Rounding:
- 0.8 branch: Initial capacity is rounded up to a multiple of the block size at index creation, so the resize logic is simpler
- 0.6 branch: Initial capacity is NOT rounded up to the block size at initialization; rounding occurs only at the first resize operation

HNSW Architecture Differences:
- 0.6 branch: Reallocates ALL containers, including vector data, during resize operations
- 0.7+ branches: Use separate vector and graph data blocks, allowing incremental block-by-block operations
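
As an illustration of deferred rounding, the sketch below shows one way a grow step could align an unaligned initial capacity. `grownCapacity` is a hypothetical helper, and the arithmetic is an assumption: the exact computation in the 0.6 code is not shown in this description.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: capacity after one grow step, aligning an unaligned
// capacity to a block boundary on the way. Not the library's actual code.
std::size_t grownCapacity(std::size_t capacity, std::size_t blockSize) {
    assert(blockSize > 0);
    // Drop the partial block (if any), then add one full block.
    return capacity - capacity % blockSize + blockSize;
}
```

Under this assumption, an initial capacity of 10 with a block size of 4 becomes 12 at the first grow, and every later grow adds a full block.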

Implementation Details

The key changes ensure that:

- Growing: Only occurs when we need space for the next element to be added
- Shrinking: Only occurs when there's sufficient buffer (2+ blocks free) or when the index is completely empty
- Block Alignment: Handled during resize operations rather than at initialization, maintaining 0.6's deferred rounding approach

- Refactor `resize_and_align_index` tests in both `test_bruteforce.cpp` and `test_bruteforce_multi.cpp` to improve clarity and maintainability.
- Introduce helper functions to verify index size and capacity, reducing code duplication.
- Add comprehensive checks for index size, capacity, and label counts during vector addition and deletion.
- Implement tests to ensure no oscillation in index size and capacity during repeated add/delete cycles.
- Address edge cases for initial capacity and resizing behavior, ensuring proper alignment with block sizes.

codecov · 2025-09-18T15:11:39Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.07%. Comparing base (71bd103) to head (4686be6).
⚠️ Report is 1 commit behind head on 0.6.

Additional details and impacted files

@@            Coverage Diff             @@
##              0.6     #783      +/-   ##
==========================================
+ Coverage   94.99%   95.07%   +0.08%     
==========================================
  Files          60       60              
  Lines        3434     3451      +17     
==========================================
+ Hits         3262     3281      +19     
+ Misses        172      170       -2


meiravgri changed the title ~~Enhance BruteForce index tests for resizing and alignment~~ [0.6] [MOD-10559] Decouple the shrinking and growing logic of large containers in Flat and HNSW Sep 18, 2025

meiravgri requested a review from GuyAv46 September 18, 2025 15:18

meiravgri enabled auto-merge September 18, 2025 15:18

GuyAv46 approved these changes Sep 21, 2025


meiravgri added this pull request to the merge queue Sep 21, 2025

Merged via the queue into 0.6 with commit c10d1dc Sep 21, 2025
33 checks passed

meiravgri deleted the meiravg_resize_callback branch September 21, 2025 13:11
