Collaborator

@meiravgri meiravgri commented Sep 15, 2025

Purpose

This PR backports the resize logic improvements from PR #753 to the 0.8 branch.
The original change decouples shrinking and growing operations in vector index algorithms to prevent oscillating allocation/deallocation cycles during index updates at block boundaries, which was particularly problematic for large containers like hash tables and metadata vectors.

Behavior Changes in 0.8 Branch

Before this PR (0.8 branch):

  • Shrinking occurred whenever count % blockSize == 0 during vector removal
  • Metadata containers were immediately resized down by exactly one block size
  • No buffer zone existed between growing and shrinking operations
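To make the oscillation concrete, here is a minimal sketch of the pre-PR policy as the bullets above describe it. The function name `countResizesOldPolicy` and the "grow by one block when full" rule are assumptions for illustration only; this is not code from the repository.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the pre-PR policy: count how many times the metadata
// containers are resized while the index hovers around a block boundary.
std::size_t countResizesOldPolicy(std::size_t blockSize, std::size_t cycles) {
    assert(blockSize > 0);
    std::size_t size = 0, capacity = 0, resizes = 0;

    auto insert = [&] {
        if (size == capacity) { // assumed growth rule: add one block when full
            capacity += blockSize;
            ++resizes;
        }
        ++size;
    };
    auto remove = [&] {
        --size;
        if (size % blockSize == 0) { // old rule: shrink as soon as a block empties
            capacity -= blockSize;
            ++resizes;
        }
    };

    // Fill one block plus one element, then update right at the boundary.
    for (std::size_t i = 0; i < blockSize + 1; ++i) insert();
    for (std::size_t i = 0; i < cycles; ++i) {
        remove(); // size lands on a block boundary, so the container shrinks
        insert(); // size == capacity again, so the container grows back
    }
    return resizes;
}
```

In this model every remove + insert pair at the boundary costs two resizes, which is the allocation/deallocation cycle the PR sets out to avoid.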

After this PR (0.8 branch):

  • Growing: Triggered when indexSize() == indexCapacity() (capacity is full)
  • Shrinking: Only when there are 2+ free blocks (indexCapacity() >= (indexSize() + 2 * blockSize))
  • Buffer Zone: Maintains at least 1 block buffer to prevent oscillation
  • Special handling: Always shrinks by exactly one block, with special condition for the last block
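The rules above can be sketched as a small bookkeeping struct. `ResizePolicySketch` and its members are hypothetical names, and the last-block condition is modeled here as "size is 0 and capacity is exactly one block", which is an assumption about how the special case is evaluated rather than something the description states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the capacity bookkeeping of an index.
struct ResizePolicySketch {
    std::size_t blockSize;
    std::size_t size = 0;     // plays the role of indexSize()
    std::size_t capacity = 0; // plays the role of indexCapacity()
    std::size_t resizes = 0;  // number of container resizes so far

    explicit ResizePolicySketch(std::size_t bs) : blockSize(bs) {}

    void insert() {
        if (size == capacity) { // grow only when the capacity is full
            capacity += blockSize;
            ++resizes;
        }
        ++size;
    }

    void remove() {
        assert(size > 0);
        --size;
        const bool twoFreeBlocks = capacity >= size + 2 * blockSize;
        // Assumed form of the special case: release the last remaining block.
        const bool lastBlock = (size == 0 && capacity == blockSize);
        if (twoFreeBlocks || lastBlock) { // always shrink by exactly one block
            capacity -= blockSize;
            ++resizes;
        }
    }
};
```

With a block size of 4, five inserts followed by any number of remove + insert pairs leave `resizes` unchanged in this model, because one spare block always sits between the grow and shrink thresholds.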

Key Differences from Main Branch

The main branch implementation differs from this 0.8 backport in several important ways due to initial capacity support in the 0.8 branch:

  1. Initial Capacity Support:

    • 0.8 branch: Supports initialCapacity parameter, allowing pre-allocation of index capacity
    • Main branch: No initial capacity support (deprecated)
  2. Shrinking to Zero Logic:

    • 0.8 branch: Always shrinks by block size with special condition: "when capacity equals one block size, shrink to zero"
    • Main branch: Immediately shrinks to 0 when index size becomes 0
  3. Resize Policy:

    • 0.8 branch: Maintains "always remove one block" guarantee to align with initial capacity behavior
    • Main branch: Uses simpler "shrink to 0 when size=0" logic since no initial capacity exists
  4. Block Management:

    • 0.8 branch: Can have more than two free blocks due to initial capacity pre-allocation
    • Main branch: Simpler block management without initial capacity considerations
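The difference in shrink-to-zero handling can be illustrated with two small functions that return the new capacity after a removal. Both names are hypothetical, and the main-branch variant assumes it otherwise uses the same "two free blocks" rule; neither is taken from the actual sources.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the 0.8 policy: always give back one block at a time.
std::size_t capacityAfterRemove08(std::size_t size, std::size_t capacity,
                                  std::size_t blockSize) {
    if (capacity >= size + 2 * blockSize) return capacity - blockSize;
    if (size == 0 && capacity == blockSize) return 0; // assumed last-block case
    return capacity;
}

// Sketch of the main-branch policy: drop everything once the index is empty.
std::size_t capacityAfterRemoveMain(std::size_t size, std::size_t capacity,
                                    std::size_t blockSize) {
    if (size == 0) return 0;
    if (capacity >= size + 2 * blockSize) return capacity - blockSize;
    return capacity;
}
```

For an empty index that was pre-allocated with 10 blocks of 4 elements, the 0.8 sketch returns 36 (one block released) while the main-branch sketch returns 0, which is why the 0.8 branch can hold more than two free blocks.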

⚠️ Disclaimer: Potential Heavy Resize Sequence with Large Initial Capacity

When using large initial capacity values, this implementation may still trigger frequent metadata container resizes during update-heavy workloads.

Example scenario:

Initial capacity: 10M elements (10,000 blocks)
Current size: 7M elements
Update operations (remove + insert): each removal can trigger a shrink, because capacity >= (size + 2 * blockSize) stays true for thousands of consecutive operations
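A rough way to estimate the cost of this scenario is to replay it in a small model. The function below is a sketch under the policy described in this PR; its name and parameters are hypothetical, a block size of 1,000 is inferred from "10M elements (10,000 blocks)", and the last-block special case is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>

// Counts container resizes over a run of update operations (remove + insert)
// on an index that starts with spare pre-allocated capacity.
std::size_t countResizesDuringUpdates(std::size_t capacity, std::size_t size,
                                      std::size_t blockSize, std::size_t updates) {
    assert(blockSize > 0 && size > 0 && size <= capacity);
    std::size_t resizes = 0;
    for (std::size_t i = 0; i < updates; ++i) {
        --size; // remove one vector
        if (capacity >= size + 2 * blockSize) { // two or more free blocks
            capacity -= blockSize;
            ++resizes;
        }
        if (size == capacity) { // insert: grow only when full
            capacity += blockSize;
            ++resizes;
        }
        ++size;
    }
    return resizes;
}
```

Under this model, `countResizesDuringUpdates(10000000, 7000000, 1000, 5000)` reports 2,999 single-block shrinks: one per update until only one spare block remains, after which the capacity stays put.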

shrink by blocksize
shrink to zero only if capacity is 1 blocksize.
@meiravgri meiravgri changed the title from "backport #753" to "[0.8] [MOD-10559] Decouple the shrinking and growing logic of large containers in Flat and HNSW" Sep 15, 2025

codecov bot commented Sep 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.90%. Comparing base (3a7ec14) to head (b70e4d2).
⚠️ Report is 1 commits behind head on 0.8.

Additional details and impacted files
@@            Coverage Diff             @@
##              0.8     #777      +/-   ##
==========================================
+ Coverage   96.87%   96.90%   +0.02%     
==========================================
  Files          91       91              
  Lines        5082     5131      +49     
==========================================
+ Hits         4923     4972      +49     
  Misses        159      159              

@meiravgri meiravgri requested a review from GuyAv46 September 15, 2025 11:37
jobs.size());
}

#ifdef BUILD_TESTS
Collaborator

Intentionally wrapping the public: with #ifdef? Seems like a bug. Consider wrapping the new function only

Collaborator Author

Accidentally backported from master. I'll align with the current version's approach, as you suggest.
Not sure it's a bug, because we don't have a SA object of VecSimTieredIndex.
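For readers following the thread, the suggestion amounts to keeping the access specifier unconditional and guarding only the test-only member. The class below is a hypothetical illustration of that layout; `TieredIndexSketch` and its members are invented names, not the actual VecSimTieredIndex code.

```cpp
#include <cstddef>

// Hypothetical class showing the suggested layout.
class TieredIndexSketch {
public: // the access specifier stays outside the #ifdef
    std::size_t pendingJobs() const { return jobs_; }

#ifdef BUILD_TESTS
    // Only the test-only accessor is compiled conditionally, so the
    // visibility of every other member is the same in all build modes.
    std::size_t testOnlyCapacity() const { return capacity_; }
#endif

private:
    std::size_t jobs_ = 0;
    std::size_t capacity_ = 0;
};
```

If `public:` itself sat inside the `#ifdef`, the members that follow it would silently change visibility between test and non-test builds.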

-    if (curElementCount % this->blockSize == 0) {
-        shrinkByBlock();
-    }
+    shrinkByBlock();
Collaborator

Why move the condition into the function? Consider renaming so it's clear it doesn't necessarily shrink

Collaborator Author

@meiravgri meiravgri Sep 15, 2025

I did it in main. I think it was required for the initial implementation, and then I forgot to revert it.
Should I keep it aligned with main or revert it here?

}

template <typename DataType, typename DistType>
void HNSWIndex<DataType, DistType>::resizeIndexCommon(size_t new_max_elements) {
Collaborator

Seems like now, new_max_elements is equal to maxElements already. Can we avoid passing it?

Collaborator Author

maxElements is not always aligned with the size of the metadata containers.
For example:
// insert 3 * bs vecs. maxElements: 3 * bs. metadata containers size: 3 * bs.
// remove 1 * bs. maxElements: 2 * bs. metadata containers size: 3 * bs (no resize)
// remove another bs. maxElements: 1 * bs. metadata containers size: 2 * bs (resizes)
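The trace above can be reproduced with a small model that tracks the two quantities separately. `HnswCapacitySketch` is a hypothetical struct: it assumes maxElements follows the occupied blocks (dropping a block whenever one empties) while the metadata containers follow the buffered shrink rule from this PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the two capacities discussed in this thread.
struct HnswCapacitySketch {
    std::size_t bs;               // block size
    std::size_t size = 0;         // number of stored vectors
    std::size_t maxElements = 0;  // assumed to track occupied blocks
    std::size_t metadataSize = 0; // size of the metadata containers

    explicit HnswCapacitySketch(std::size_t blockSize) : bs(blockSize) {}

    void insert() {
        if (size == maxElements) maxElements += bs;
        if (size == metadataSize) metadataSize += bs;
        ++size;
    }

    void remove() {
        assert(size > 0);
        --size;
        if (size % bs == 0) maxElements -= bs;                // a block was emptied
        if (metadataSize >= size + 2 * bs) metadataSize -= bs; // buffered shrink
    }
};
```

After inserting 3 * bs vectors and removing bs of them, this model holds maxElements at 2 * bs and the metadata size at 3 * bs; removing another bs brings them to 1 * bs and 2 * bs, matching the example.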

/******************** Implementation **************/

template <typename DataType, typename DistType>
void BruteForceIndex<DataType, DistType>::appendVector(const void *vector_data, labelType label) {
Collaborator

Were any of the changes in this function necessary?

Collaborator Author

Simplifies the resize logic (avoiding -1 offset calculations) and now also aligned with main

@meiravgri meiravgri requested a review from GuyAv46 September 15, 2025 15:02
@meiravgri meiravgri enabled auto-merge September 15, 2025 15:03
GuyAv46 previously approved these changes Sep 15, 2025
@GuyAv46 GuyAv46 disabled auto-merge September 15, 2025 15:08
@meiravgri meiravgri enabled auto-merge September 15, 2025 15:30
@meiravgri meiravgri added this pull request to the merge queue Sep 15, 2025
Merged via the queue into 0.8 with commit bcc4d67 Sep 15, 2025
28 checks passed
@meiravgri meiravgri deleted the backport-meiravg_relax_resize-0.8 branch September 15, 2025 16:31
@github-actions

Backport failed for 0.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 0.6
git worktree add -d .worktree/backport-777-to-0.6 origin/0.6
cd .worktree/backport-777-to-0.6
git switch --create backport-777-to-0.6
git cherry-pick -x 2f87813b0a0b4e08602d29816d3a92964f34e776 0ebc8b371952394af30156cc6f8caead6d58c5bd b70e4d2268280f4c9d7ea9c4eb4e4954867750b5

@github-actions

Backport failed for 0.7, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 0.7
git worktree add -d .worktree/backport-777-to-0.7 origin/0.7
cd .worktree/backport-777-to-0.7
git switch --create backport-777-to-0.7
git cherry-pick -x 2f87813b0a0b4e08602d29816d3a92964f34e776 0ebc8b371952394af30156cc6f8caead6d58c5bd b70e4d2268280f4c9d7ea9c4eb4e4954867750b5

meiravgri added a commit that referenced this pull request Sep 16, 2025
…ontainers in Flat and HNSW (#777)

* backport #753

shrink by blocksize
shrink to zero only if capacity is 1 blocksize.

* move public outside

* revert size

(cherry picked from commit bcc4d67)
github-merge-queue bot pushed a commit that referenced this pull request Sep 16, 2025
…ontainers in Flat and HNSW (#780)

[0.8] [MOD-10559] Decouple the shrinking and growing logic of large containers in Flat and HNSW (#777)

* backport #753

shrink by blocksize
shrink to zero only if capacity is 1 blocksize.

* move public outside

* revert size

(cherry picked from commit bcc4d67)