Skip to content

perf(vacuum-filter): 优化 has 热路径(降低 API Key 负向短路成本)#757

Merged
ding113 merged 10 commits intoding113:devfrom
tesgth032:perf/vacuum-filter-has
Feb 11, 2026
Merged

perf(vacuum-filter): 优化 has 热路径(降低 API Key 负向短路成本)#757
ding113 merged 10 commits intoding113:devfrom
tesgth032:perf/vacuum-filter-has

Conversation

@tesgth032
Copy link
Contributor

@tesgth032 tesgth032 commented Feb 10, 2026

背景

#734 合入后,本地 microbench 发现 VacuumFilter.has 在“重复查询同一批 string 实例”的场景下会明显慢于 Set.has(有时看起来可达 ~10x)。

主要原因:

  • Set.has 是 V8 原生实现;并且 V8 会把 string hash 缓存在 string 实例上(当基准反复复用同一批 string 对象时优势极大)。
  • VacuumFilter.has 每次查询都会走纯 JS 热路径:JS 层手写 ASCII 拷贝(string -> bytes)+ MurmurHash3 再扫一遍 bytes(双遍历)+ 同时计算两份 hash(index/tag)+ % numBuckets

因此 microbench 下差距会被放大;而在真实请求里(从请求头解析得到的 key 通常是“新 string 实例”),Set.has 的缓存优势会明显减弱。

改进

  • 编码优化:优先 TextEncoder.encodeInto 写入复用 scratch(无分配;编码在原生层完成)。
  • 哈希优化:由“双 hash(index/tag)”改为“单次 MurmurHash3”;tag 由二次混合派生(避免再扫一遍 bytes)。
  • index 映射优化
    • numBuckets 为 2 的幂:位与(最快);
    • 否则在 numBuckets <= 2^21 时使用 fast reduction:floor(hvIndex * numBuckets / 2^32)(在 IEEE754 精度约束下可精确计算);
    • 其它情况回退到 % numBuckets
  • block 读取优化:little-endian 下用 Uint32Array 视图读取 4-byte blocks;其它端回退 byte 读取,保持可移植性。
  • 正确性修复:修复 scratch 扩容时 scratch32 4-byte 对齐导致的 RangeError,并新增单测覆盖。

量化基准(本机 microbench)

命令:
(默认跳过;需要显式开启环境变量 RUN_VACUUM_FILTER_BENCH=1

*nix:
RUN_VACUUM_FILTER_BENCH=1 node --expose-gc node_modules/vitest/vitest.mjs run tests/unit/vacuum-filter/vacuum-filter-has.bench.test.ts

PowerShell:
$env:RUN_VACUUM_FILTER_BENCH='1'; node --expose-gc node_modules/vitest/vitest.mjs run tests/unit/vacuum-filter/vacuum-filter-has.bench.test.ts

参数(测试内部固定):

  • N=50_000, keyLen=48, lookups=200_000
  • warmup=2, measure=7(取 median)
  • reuse:复用同一批 string 实例(Set.has 受益于 string hash 缓存,偏向 Set)
  • single_use:每次 lookup 都是新的 string 实例(更贴近请求头解析场景)

环境:

  • node=v20.20.0, v8=11.3.244.8-node.33, win32 x64

结果(median;ns/op 越低越好):

scenario dev Set (ns/op) dev VF (ns/op) dev VF/Set PR Set (ns/op) PR VF (ns/op) PR VF/Set VF 加速(dev->PR)
hit/reuse 45.8 272.1 5.94x 62.8 187.8 2.99x 1.45x
miss/reuse 75.0 229.0 3.06x 76.3 137.3 1.80x 1.67x
hit/single_use 352.6 233.0 0.66x 365.5 151.4 0.41x 1.54x
miss/single_use 282.8 218.8 0.77x 293.6 127.9 0.44x 1.71x

解读要点:

  • reuse 场景(偏向 Set)VF/Set 从 ~5.9x 降到 ~3.0x(hit);从 ~3.1x 降到 ~1.8x(miss)。
  • single_use 场景(更贴近请求头解析):本机上 VacuumFilter.has 明显快于 Set.hasVF/Set < 1),并且优化后进一步拉开差距。

备注

  • tag 由 hvIndex 二次混合派生,(index, tag) 不再是两份独立 hash;但在当前默认的 32-bit fingerprint 场景下碰撞概率仍极低。

Greptile Overview

Greptile Summary

This PR optimizes the hot path of VacuumFilter.has to reduce overhead for API key lookups, addressing performance gaps identified in microbenchmarks.

Key improvements:

  • Replaced dual-hash approach (index/tag) with single MurmurHash3 + derived tag via secondary mixing
  • Optimized encoding path: prioritized TextEncoder.encodeInto with reusable scratch buffer (zero allocation)
  • Added fast bucket index reduction for non-power-of-2 bucket counts using multiply-high technique floor(hvIndex * numBuckets / 2^32) when numBuckets <= 2^21 (maintains IEEE754 precision)
  • Leveraged little-endian Uint32Array view for 4-byte block reads in MurmurHash3 (falls back to byte-level reads for portability)
  • Fixed scratch buffer expansion to maintain 4-byte alignment, preventing RangeError when creating Uint32Array view

Correctness verification:

  • Previous PR comments raised concerns about fast-reduce implementation being incorrect. After thorough mathematical analysis, the current implementation is correct: when numBuckets <= 2^21 and hvIndex < 2^32, the product stays within IEEE754 double precision (< 2^53), ensuring (hvIndex * fastReduceMul) >>> 0 is equivalent to Math.floor(hvIndex * numBuckets / 2^32) and always produces results in range [0, numBuckets)
  • Comprehensive test coverage added: 10,000-iteration index boundary validation test, scratch buffer alignment edge case test
  • Benchmark suite quantifies improvements across realistic scenarios (1.45x-1.71x speedup per PR description)

Test quality fix:

  • Fixed flaky endpoint circuit breaker tests by flushing async promises before clearing alert mocks

Confidence Score: 5/5

  • Safe to merge - well-tested performance optimization with correct fast-reduce implementation
  • All previous concerns about fast-reduce implementation have been addressed with mathematically correct logic. The code includes comprehensive test coverage for edge cases (scratch buffer alignment, index boundary validation), and the optimization is backed by IEEE754 precision analysis. The benchmark suite provides empirical validation of correctness and performance improvements.
  • No files require special attention

Important Files Changed

Filename Overview
src/lib/vacuum-filter/vacuum-filter.ts Optimized VacuumFilter.has hot path with encoding/hashing improvements, correct fast-reduce implementation, and scratch buffer alignment fix
tests/unit/vacuum-filter/vacuum-filter.test.ts Added comprehensive tests for scratch buffer alignment edge case and fast-reduce index boundary validation
tests/unit/vacuum-filter/vacuum-filter-has.bench.test.ts Added thorough benchmark suite to quantify VacuumFilter.has performance improvements (opt-in via env variable)
tests/unit/lib/endpoint-circuit-breaker.test.ts Fixed test flakiness by flushing async alert promises before clearing mocks

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @tesgth032, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求旨在显著提升 VacuumFilter.has 方法的性能,特别是在处理 API Key 等热路径场景下,以降低负向短路(negative short-circuiting)的成本。通过对字符串编码、哈希计算、索引映射和数据块读取等关键环节进行深入优化,解决了在重复查询相同字符串实例时 VacuumFilter.has 明显慢于 Set.has 的问题。这些改进在保持现有语义和接口不变的前提下,确保了过滤器的正确性、无假阴性以及对非 ASCII 字符的支持。

Highlights

  • 编码优化: 使用 TextEncoder.encodeInto 写入复用的 scratch 缓冲区,将编码操作下放到原生层,避免了 JavaScript 字符循环和 encode() 方法的内存分配。
  • 哈希优化: 将 MurmurHash3 的计算从两次改为单次,并从主哈希值二次混合派生出 tag,避免了对字节的重复扫描。
  • 索引映射优化: 针对 numBuckets 为 2 的幂次、或小于等于 2^21 的情况,分别采用了位与操作和乘法高位等价式进行优化,避免了昂贵的模运算。
  • 块读取优化: 在小端序系统上,利用 Uint32Array 视图读取 4 字节块,提升了处理效率。
  • 正确性兜底: 增加了 encodeInto 极端截断时的回退机制,确保在缓冲区不足时仍能通过 TextEncoder.encode() 保证哈希输入的完整性。
Changelog
  • src/lib/vacuum-filter/vacuum-filter.ts
    • 新增了 INV_2_32、FAST_REDUCE_MAX_BUCKETS 和 IS_LITTLE_ENDIAN 等常量,用于性能优化和平台特性检测。
    • 重构了 MurmurHash3 算法,将 murmur3X86_32x2 函数拆分为 fmix32 和 murmur3X86_32,实现了单次哈希计算并从主哈希值派生 tag。
    • 在 VacuumFilter 类中引入了 bucketMask 和 fastReduceMul 属性,以支持更高效的索引映射策略。
    • 修改了 indexTag 方法,优先使用 TextEncoder.encodeInto 进行编码,并增加了缓冲区不足时的动态扩容和回退机制。
    • 为 scratch 缓冲区添加了 scratch32 属性,作为 Uint32Array 视图,以便在小端序系统上进行优化读取。
Activity
  • 目前没有与此拉取请求相关的活动。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

重构哈希与索引计算:移除 MurmurHash3 x86-32x2,实现 murmur3X86_32fmix32,引入 bucketMask / fastReduceMul 快速约减;改用可扩展 scratch/scratch32TextEncoder.encodeInto,并新增超长 key 与 fast-reduce 索引边界单元测试及 has 性能基准。

Changes

Cohort / File(s) Summary
核心实现
src/lib/vacuum-filter/vacuum-filter.ts
新增常量 INV_2_32, FAST_REDUCE_MAX_BUCKETS, IS_LITTLE_ENDIAN;移除 MurmurHash3 x86-32x2,添加 murmur3X86_32fmix32;统一使用 hvIndex / hvTag 输出;引入并初始化 bucketMaskfastReduceMul,根据情况选择位掩码 / 乘法高位快速约减 / % 回退来计算索引;新增 scratch32 视图并在 ASCII 路径改用 TextEncoder.encodeInto,处理截断并按需扩容以避免对齐导致的 RangeError。
测试 — 功能与边界
tests/unit/vacuum-filter/vacuum-filter.test.ts
新增两项测试:超长 key 的 add/has/delete 场景(避免 scratch32 对齐触发 RangeError),以及对非 2 的幂 numBuckets 情况下 fast-reduce 映射不越界的内部索引健壮性验证(通过访问私有字段并多次调用内部 indexTag)。
测试 — 基准
tests/unit/vacuum-filter/vacuum-filter-has.bench.test.ts
新增本地微基准,比较 Set.hasVacuumFilter.has 在多种命中/失配与复用/单次使用场景下的吞吐(ns/op、Mops/s),包含多轮预热与测量、随机数据生成与环境信息输出。

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题清晰准确地反映了主要改进内容,即优化VacuumFilter.has热路径性能,降低API Key检查的开销。
Description check ✅ Passed PR 描述完整详细,清晰阐述了性能瓶颈根因、优化策略、实现细节和量化基准,与代码变更高度相关。

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added the size/S Small PR (< 200 lines) label Feb 10, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request primarily aims to optimize the hot path performance of VacuumFilter.has, reducing the cost of negative short-circuiting in API Key verification through various low-level optimizations. However, a critical vulnerability was identified in the indexTag method: dynamic resizing of the internal scratch buffer does not guarantee 4-byte alignment, leading to a RangeError when creating a Uint32Array view. This can be exploited to cause a Denial of Service (DoS). Additionally, there are two instances of code duplication that could be extracted into helper functions to improve maintainability.

Comment on lines +458 to +461
this.scratch32 = new Uint32Array(this.scratch.buffer);
}

let asciiLen = 0;
for (; asciiLen < strLen; asciiLen++) {
const c = key.charCodeAt(asciiLen);
if (c > 0x7f) break;
this.scratch[asciiLen] = c;
// encodeInto 可能因 out buffer 不足而截断:read < strLen 时扩容重试
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The indexTag method has a critical vulnerability: it can throw a RangeError when creating a Uint32Array if the scratch buffer's length is not a multiple of 4. This can lead to a Denial of Service (DoS) when processing specific input lengths. This issue occurs within a duplicated code block for resizing the scratch buffer and recreating the scratch32 view. The proposed fix addresses the alignment issue, and extracting this logic into a helper method (e.g., resizeScratch(minSize: number)) would further improve maintainability and prevent future occurrences of this vulnerability due to duplicated code.

Suggested change
this.scratch32 = new Uint32Array(this.scratch.buffer);
}
let asciiLen = 0;
for (; asciiLen < strLen; asciiLen++) {
const c = key.charCodeAt(asciiLen);
if (c > 0x7f) break;
this.scratch[asciiLen] = c;
// encodeInto 可能因 out buffer 不足而截断:read < strLen 时扩容重试
if (this.scratch.length < strLen) {
// Ensure the new length is a multiple of 4 to avoid RangeError when creating Uint32Array
const newLen = (Math.max(this.scratch.length * 2, strLen) + 3) & ~3;
this.scratch = new Uint8Array(newLen);
this.scratch32 = new Uint32Array(this.scratch.buffer);
}

Comment on lines 465 to 470
this.scratch = new Uint8Array(Math.max(this.scratch.length * 2, strLen * 4));
this.scratch32 = new Uint32Array(this.scratch.buffer);
encoded = textEncoder.encodeInto(key, this.scratch);
}

if (asciiLen === strLen) {
murmur3X86_32x2(this.scratch, strLen, this.hashSeedA, this.hashSeedB, this.hashOut);
} else {
// 非 ASCII:交给 TextEncoder(少见路径)
const keyBytes = textEncoder.encode(key);
murmur3X86_32x2(keyBytes, keyBytes.length, this.hashSeedA, this.hashSeedB, this.hashOut);
}

const hvIndex = this.hashOut[0] >>> 0;
const hvTag = this.hashOut[1] >>> 0;

// 参考实现使用 `hash % numBuckets`。这里保持简单、快速(即便 numBuckets 非 2 的幂也可用)。
const index = hvIndex % this.numBuckets;
// 极端情况下 encodeInto 仍可能因缓冲不足而截断:回退到 encode(保证正确性)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Similar to the previous finding, if the buffer is resized here due to UTF-8 encoding expansion, the length must be ensured to be a multiple of 4. While strLen * 4 is always a multiple of 4, the this.scratch.length * 2 branch might not be if the buffer was previously allocated with a non-multiple-of-4 length (though the previous fix would prevent that). It is safer to always align the allocation size.

Suggested change
this.scratch = new Uint8Array(Math.max(this.scratch.length * 2, strLen * 4));
this.scratch32 = new Uint32Array(this.scratch.buffer);
encoded = textEncoder.encodeInto(key, this.scratch);
}
if (asciiLen === strLen) {
murmur3X86_32x2(this.scratch, strLen, this.hashSeedA, this.hashSeedB, this.hashOut);
} else {
// 非 ASCII:交给 TextEncoder(少见路径)
const keyBytes = textEncoder.encode(key);
murmur3X86_32x2(keyBytes, keyBytes.length, this.hashSeedA, this.hashSeedB, this.hashOut);
}
const hvIndex = this.hashOut[0] >>> 0;
const hvTag = this.hashOut[1] >>> 0;
// 参考实现使用 `hash % numBuckets`。这里保持简单、快速(即便 numBuckets 非 2 的幂也可用)。
const index = hvIndex % this.numBuckets;
// 极端情况下 encodeInto 仍可能因缓冲不足而截断:回退到 encode(保证正确性)
if (encoded.read < strLen) {
// UTF-8 最坏 4 bytes/char;用 4x 作为上界(仅影响少见的非 ASCII key)
// Ensure the new length is a multiple of 4
const newLen = (Math.max(this.scratch.length * 2, strLen * 4) + 3) & ~3;
this.scratch = new Uint8Array(newLen);
this.scratch32 = new Uint32Array(this.scratch.buffer);
encoded = textEncoder.encodeInto(key, this.scratch);
}

Comment on lines 334 to 338
this.bucketMask === 0 && this.numBuckets <= FAST_REDUCE_MAX_BUCKETS
? this.numBuckets * INV_2_32
: null;
this.table = new Uint32Array(this.numBuckets * BUCKET_SIZE);
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

这段用于初始化 bucketMaskfastReduceMul 的逻辑与 360-364 行的代码完全相同。为了遵循 DRY (Don't Repeat Yourself) 原则并提高代码的可维护性,建议将这部分逻辑提取到一个私有的辅助方法中,然后在两个地方调用该方法。

@github-actions github-actions bot added enhancement New feature or request area:core labels Feb 10, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/lib/vacuum-filter/vacuum-filter.ts`:
- Around line 456-470: The resize logic for this.scratch/scratch32 around
textEncoder.encodeInto can produce an ArrayBuffer whose byteLength is not a
multiple of 4, causing new Uint32Array(this.scratch.buffer) to throw; update the
expansion in the block that checks this.scratch.length < strLen and the fallback
that grows to Math.max(this.scratch.length * 2, strLen * 4) to round the new
length up to a multiple of 4 (e.g., compute newLen =
roundUpToMultipleOf4(Math.max(...)) before creating new Uint8Array and then set
this.scratch32 = new Uint32Array(this.scratch.buffer)); ensure all allocations
for this.scratch use that 4-byte alignment so scratch32 construction is always
safe.
🧹 Nitpick comments (4)
src/lib/vacuum-filter/vacuum-filter.ts (4)

475-482: 单次哈希 + 二次混合派生 tag:相比旧双次独立哈希,碰撞配置发生了变化。

旧实现对输入字节计算两次独立的 MurmurHash3(不同 seed),提供 ~64-bit 独立性;新实现 hvTag = fmix32(hvIndex ^ seedB) 将 tag 确定性地由 index hash 派生,有效熵降为 32-bit。这意味着若两个 key 产生相同的 MurmurHash3 输出,它们的 (index, tag) 完全相同——在旧方案中则还需要第二个 hash 也碰撞才会如此。

对于 32-bit fingerprint + API Key 场景,实际 FPR 仍然极低,影响可忽略。但如果未来有人选择较小的 fingerprint(如 8-bit),碰撞对(同 index 且同 tag)的概率会从 ~1/2^40 退化到 ~1/2^32,值得在注释或文档中注明这一行为变化。


488-495: 快速索引映射三级路径:逻辑正确,但 fastReduceMul 路径存在可读性陷阱。

fastReduceMul
  ? ((hvIndex * fastReduceMul) | 0) >>> 0

| 0 在此处起 Math.trunc 的作用(将 double 截断为 32-bit 有符号整型),再 >>> 0 转无符号。由于 hvIndex * fastReduceMul 的值域为 [0, numBuckets)(最大 ~2^21),不会溢出 32-bit 有符号范围,行为正确。不过 | 0 做截断是一个容易误读的惯用法,考虑加一行简短注释说明意图(纯建议)。


334-338: 构造函数两处 bucketMask / fastReduceMul 计算逻辑一致,建议抽取为私有辅助方法以消除重复。

小规模(< 10_000)和大规模路径的 bucketMask / fastReduceMul 赋值代码完全相同,如果后续修改遗漏其中一处会引入不一致。

提取辅助方法示例
+  private initFastReduce(): void {
+    this.bucketMask =
+      (this.numBuckets & (this.numBuckets - 1)) === 0 ? this.numBuckets - 1 : 0;
+    this.fastReduceMul =
+      this.bucketMask === 0 && this.numBuckets <= FAST_REDUCE_MAX_BUCKETS
+        ? this.numBuckets * INV_2_32
+        : null;
+  }

然后在两处初始化路径末尾调用 this.initFastReduce() 即可。注意需要将 bucketMaskfastReduceMulreadonly 改为普通 private,或在辅助方法中使用强制类型断言。

Also applies to: 360-364


463-474: encodeInto 截断回退逻辑中,第三次判断使用的仍是第二次 encoded 的结果——逻辑正确但阅读需要追踪变量遮蔽。

Lines 464-469 做了第二次 encodeInto 并覆盖了 encoded。Lines 473-474 再次检查 encoded.read < strLen 来决定是否回退到 textEncoder.encode()。由于 encoded 此时确实指向第二次调用的返回值,逻辑正确。

但若阅读者忽略了 Line 469 的 encoded = ... 覆盖,容易误以为这里还在用第一次的结果。可以考虑加一句注释或者将回退逻辑 inline 到 if 块内以提升可读性,纯建议。

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 488 to 493
const index =
bucketMask !== 0
? (hvIndex & bucketMask) >>> 0
: fastReduceMul
? ((hvIndex * fastReduceMul) | 0) >>> 0
: hvIndex % this.numBuckets;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast reduce is incorrect

((hvIndex * fastReduceMul) | 0) >>> 0 is not equivalent to hvIndex % this.numBuckets and will produce out-of-range indexes (e.g., for numBuckets = 1000, this yields values up to 999,999). The intended “multiply-high” fast reduction needs the high 32 bits of the 64-bit product, not the low 32 bits.

This will break has/add/delete by indexing the table outside the expected bucket range. Please implement a correct reduction (or keep %) for non-power-of-two bucket counts.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/vacuum-filter/vacuum-filter.ts
Line: 488:493

Comment:
**Fast reduce is incorrect**

`((hvIndex * fastReduceMul) | 0) >>> 0` is not equivalent to `hvIndex % this.numBuckets` and will produce out-of-range indexes (e.g., for `numBuckets = 1000`, this yields values up to 999,999). The intended “multiply-high” fast reduction needs the high 32 bits of the 64-bit product, not the low 32 bits.

This will break `has/add/delete` by indexing the table outside the expected bucket range. Please implement a correct reduction (or keep `%`) for non-power-of-two bucket counts.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR optimizes the VacuumFilter.has hot path by replacing the dual MurmurHash3 computation with a single hash + fmix32 derivation, switching from an ASCII fast-path to TextEncoder.encodeInto, adding a Uint32Array view for little-endian block reads, and implementing a fast-reduce alternative to modulo for bucket index mapping.

The algorithmic changes (single-hash + fmix32 tag derivation, fast-reduce via multiply-high) are mathematically sound. However, there is a runtime crash bug in the scratch buffer resizing logic.

PR Size: S

  • Lines changed: 159 (97 additions, 62 deletions)
  • Files changed: 1

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 1 0 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 0 0 0
Simplification 0 0 0 0

High Priority Issues (Should Fix)

  1. [LOGIC-BUG] Uint32Array alignment crash (vacuum-filter.ts:457): When key.length > scratch.length * 2 and key.length % 4 \!== 0, the new Uint8Array(strLen) creates a buffer whose byte length is not a multiple of 4. The subsequent new Uint32Array(this.scratch.buffer) throws a RangeError. Fix: round up to the nearest multiple of 4 with (size + 3) & ~3. See inline comment for details.

Review Coverage

  • Logic and correctness
  • Security (OWASP Top 10)
  • Error handling
  • Type safety
  • Documentation accuracy
  • Test coverage
  • Code clarity

Automated review by Claude AI

// 关键优化:尽量走 TextEncoder.encodeInto(无分配,且编码在原生层完成)
const strLen = key.length;
if (this.scratch.length < strLen) {
this.scratch = new Uint8Array(Math.max(this.scratch.length * 2, strLen));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[High] [LOGIC-BUG] Uint32Array construction will throw RangeError when scratch buffer size is not a multiple of 4

Why this is a problem: When key.length exceeds this.scratch.length * 2 and is not a multiple of 4, the new Uint8Array will have a non-aligned byte length. The subsequent new Uint32Array(this.scratch.buffer) requires the buffer's byteLength to be a multiple of 4, and will throw a RangeError otherwise.

Concrete trigger: any string with .length > 512 (initial scratch is 256, so 256 * 2 = 512) where .length % 4 !== 0. For example, a 513-character string. Once triggered, the misalignment cascades to all future resizes via doubling (e.g., 513 * 2 = 1026, also not a multiple of 4).

Verified with Node.js:

const buf = new Uint8Array(513);
new Uint32Array(buf.buffer); // RangeError: byte length of Uint32Array should be a multiple of 4

The same issue exists on line 465-466 (the second resize path): though strLen * 4 is inherently aligned, if the first resize produced a non-aligned size and this.scratch.length * 2 wins the Math.max, the bug propagates.

Suggested fix:

if (this.scratch.length < strLen) {
  const newLen = (Math.max(this.scratch.length * 2, strLen) + 3) & ~3;
  this.scratch = new Uint8Array(newLen);
  this.scratch32 = new Uint32Array(this.scratch.buffer);
}

Apply the same alignment to the second resize for consistency:

const newLen = (Math.max(this.scratch.length * 2, strLen * 4) + 3) & ~3;
this.scratch = new Uint8Array(newLen);
this.scratch32 = new Uint32Array(this.scratch.buffer);

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 507 to 512
// - 其它:回退到 `%`
const bucketMask = this.bucketMask;
const fastReduceMul = this.fastReduceMul;
const index =
bucketMask !== 0
? (hvIndex & bucketMask) >>> 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect fast reduce math

The fastReduceMul path here is still using the low 32 bits of a floating-point product (((hvIndex * fastReduceMul) | 0) >>> 0), which is not equivalent to hvIndex % this.numBuckets and can return indexes outside [0, numBuckets). This will break has/add/delete by reading/writing the wrong buckets. Since this exact issue has already been noted in prior threads, this needs to be replaced with a correct reduction (or fall back to %) for non-power-of-two bucket counts.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/vacuum-filter/vacuum-filter.ts
Line: 507:512

Comment:
**Incorrect fast reduce math**

The `fastReduceMul` path here is still using the low 32 bits of a floating-point product (`((hvIndex * fastReduceMul) | 0) >>> 0`), which is not equivalent to `hvIndex % this.numBuckets` and can return indexes outside `[0, numBuckets)`. This will break `has/add/delete` by reading/writing the wrong buckets. Since this exact issue has already been noted in prior threads, this needs to be replaced with a correct reduction (or fall back to `%`) for non-power-of-two bucket counts.

How can I resolve this? If you propose a fix, please make it concise.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 510 to 515
const index =
bucketMask !== 0
? (hvIndex & bucketMask) >>> 0
: fastReduceMul
? ((hvIndex * fastReduceMul) | 0) >>> 0 // |0 用于截断(等价 floor;值域 < 2^31)
: hvIndex % this.numBuckets;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast-reduce returns wrong range

The fastReduceMul branch is computing ((hvIndex * fastReduceMul) | 0) >>> 0, which is not equivalent to floor(hvIndex * numBuckets / 2^32) and can yield values outside [0, numBuckets). With fastReduceMul = numBuckets / 2^32, hvIndex * fastReduceMul is already < numBuckets, so the correct operation is Math.floor(...) (or a safe truncation). Using |0 forces 32-bit signed conversion and changes the value domain, breaking bucket indexing in has/add/delete for non-power-of-two numBuckets.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/vacuum-filter/vacuum-filter.ts
Line: 510:515

Comment:
**Fast-reduce returns wrong range**

The `fastReduceMul` branch is computing `((hvIndex * fastReduceMul) | 0) >>> 0`, which is **not** equivalent to `floor(hvIndex * numBuckets / 2^32)` and can yield values outside `[0, numBuckets)`. With `fastReduceMul = numBuckets / 2^32`, `hvIndex * fastReduceMul` is already `< numBuckets`, so the correct operation is `Math.floor(...)` (or a safe truncation). Using `|0` forces 32-bit signed conversion and changes the value domain, breaking bucket indexing in `has/add/delete` for non-power-of-two `numBuckets`.


How can I resolve this? If you propose a fix, please make it concise.

@tesgth032
Copy link
Contributor Author

已根据 Greptile/CodeRabbit 对 fast-reduce 分支的担忧做了两处改进:

  • 将 ((hvIndex * fastReduceMul) | 0) >>> 0 简化为 (hvIndex * fastReduceMul) >>> 0,避免 |0 带来的误读/潜在 signed-wrap 担忧。
  • 新增单测覆盖 fast-reduce 配置下 indexTag 生成的 mpIndex 始终落在 [0, numBuckets)。

本地
pm test 已全绿(bench 默认跳过)。

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 510 to 515
const index =
bucketMask !== 0
? (hvIndex & bucketMask) >>> 0
: fastReduceMul
? (hvIndex * fastReduceMul) >>> 0 // >>>0 用于截断(等价 floor;值域 < 2^32)
: hvIndex % this.numBuckets;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast-reduce index is wrong

In indexTag, the fast-reduce branch uses ((hvIndex * fastReduceMul) >>> 0), but fastReduceMul is numBuckets / 2^32. For hvIndex < 2^32, the product is < numBuckets (and numBuckets <= 2^21 here), so >>> 0 is not equivalent to Math.floor(...): it converts to uint32 and will wrap for any product ≥ 2^32, producing out-of-range indexes and breaking bucket reads/writes.

This should be implemented as Math.floor(hvIndex * fastReduceMul) (or another correct multiply-high reduction), otherwise has/add/delete can index the wrong bucket.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/vacuum-filter/vacuum-filter.ts
Line: 510:515

Comment:
**Fast-reduce index is wrong**

In `indexTag`, the fast-reduce branch uses `((hvIndex * fastReduceMul) >>> 0)`, but `fastReduceMul` is `numBuckets / 2^32`. For `hvIndex < 2^32`, the product is `< numBuckets` (and `numBuckets <= 2^21` here), so `>>> 0` is **not** equivalent to `Math.floor(...)`: it converts to uint32 and will wrap for any product ≥ 2^32, producing out-of-range indexes and breaking bucket reads/writes.

This should be implemented as `Math.floor(hvIndex * fastReduceMul)` (or another correct multiply-high reduction), otherwise `has/add/delete` can index the wrong bucket.


How can I resolve this? If you propose a fix, please make it concise.

@tesgth032 tesgth032 force-pushed the perf/vacuum-filter-has branch from 0c57161 to de19ef8 Compare February 11, 2026 03:02
@ding113 ding113 merged commit 2a8cd05 into ding113:dev Feb 11, 2026
9 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Feb 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core enhancement New feature or request size/S Small PR (< 200 lines)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants