
fix(proxy): fix stuck requests (AgentPool eviction blocking) #759

Merged
ding113 merged 11 commits into ding113:dev from tesgth032:fix/hang-stuck-requesting-v2
Feb 11, 2026

Conversation


@tesgth032 tesgth032 commented Feb 10, 2026

Summary

Fixes an issue where, in some situations, the server would deadlock globally and all requests would stay in the requesting state indefinitely: AgentPool eviction no longer blocks on undici Dispatcher.close() waiting for in-flight requests, which previously stalled subsequent requests before they even issued an upstream request (which is why the issue was unrelated to response codes such as 200/403).

It also hardens the timeout/abort chain of the Gemini SSE passthrough to reduce the chance of creating "never-ending in-flight" streaming connections, and adds a response-body buffering cap to the passthrough stats task to prevent OOM/DoS.

The fix is provider-agnostic (it applies to all outbound requests that go through getGlobalAgentPool(), i.e. Codex/OpenAI/Anthropic/Gemini and others).

Problem and root cause (chain of events)

Symptom: once the service enters the bad state, requests for any provider and any response code (200/403/...) get stuck in requesting; restarting the container restores service temporarily.

Key chain:

  • ProxyForwarder obtains a dispatcher for outbound requests via getGlobalAgentPool().getAgent(...)
  • When an agent is marked unhealthy, expires, is LRU-evicted, or replaces an old entry, AgentPoolImpl.getAgent() waits on evictByKey() -> closeAgent()
  • undici's close() waits for in-flight requests to finish naturally; if a streaming/stuck in-flight connection exists, close() may not return for a long time
  • getAgent() is therefore blocked and new requests never actually issue the upstream fetch -> clients stay in requesting, and the related request records never reach a terminal state
  • Restarting resets the global pool, which is why the service recovers briefly

Solution

1) Root fix: AgentPool eviction no longer blocks

  • src/lib/proxy-agent/agent-pool.ts

    • Prefer destroy() during eviction/cleanup/shutdown so that close() waiting on in-flight requests cannot block the whole pool (see the sketch after this list)
    • Add defensive handling and logging
  • tests/unit/lib/proxy-agent/agent-pool.test.ts

    • New regression test: simulate a close() that never resolves and verify that shutdown() / unhealthy eviction do not hang and that destroy() is called preferentially
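
A minimal TypeScript sketch of the idea (function name and log shape are illustrative, not the actual closeAgent() in agent-pool.ts):

import type { Dispatcher } from "undici";

// Illustrative sketch only: fire-and-forget teardown that prefers destroy() over close().
function closeAgentNonBlocking(key: string, agent: Dispatcher | undefined): void {
  if (!agent) return; // defensive: the entry may already have been removed

  const operation = typeof agent.destroy === "function" ? "destroy" : "close";
  const teardown = operation === "destroy" ? agent.destroy() : agent.close();

  // Deliberately not awaited: close() can wait forever on a stuck in-flight stream,
  // and awaiting here would block getAgent()/eviction for the entire pool.
  teardown.catch((error) => {
    console.warn("AgentPool: error while tearing down agent", {
      key,
      operation,
      error: error instanceof Error ? error.message : String(error),
    });
  });
}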

2) Harden the Gemini SSE passthrough path (fewer trigger conditions + safer)

  • src/app/v1/_lib/proxy/response-handler.ts

    • First-byte timeout: cleared only after the first body chunk arrives
    • streamingIdleTimeoutMs: the watchdog is reset on every chunk, and mid-stream silence beyond the timeout proactively aborts the upstream (see the sketch after this list)
    • Stats buffering: adds a 10MB "tail window" cap so unbounded buffering cannot cause OOM/DoS; Redis storage is skipped when the cap is exceeded
    • More accurate abortReason attribution (CLIENT_ABORTED vs STREAM_*)
    • finally cleanup now logs warnings instead of silently swallowing errors
  • tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts

    • Covers headers-only, delayed subsequent chunks, and mid-stream silence, and strengthens the assertions that the timeout/abort actually fired
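
A rough sketch of the two watchdogs described above (names, signatures, and the abort plumbing are illustrative and simplified, not the handler's actual code):

// Assumes an upstream AbortController and a web-streams reader for the response body.
async function readWithWatchdogs(
  reader: ReadableStreamDefaultReader<Uint8Array>,
  upstreamAbort: AbortController,
  firstByteTimeoutMs: number,
  idleTimeoutMs: number
): Promise<void> {
  let firstByteTimer: ReturnType<typeof setTimeout> | null = setTimeout(
    () => upstreamAbort.abort(new Error("first_byte_timeout")),
    firstByteTimeoutMs
  );
  let idleTimer: ReturnType<typeof setTimeout> | null = null;
  const resetIdleTimer = () => {
    if (idleTimer) clearTimeout(idleTimer);
    idleTimer = setTimeout(
      () => upstreamAbort.abort(new Error("streaming_idle_timeout")),
      idleTimeoutMs
    );
  };

  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      if (firstByteTimer && value && value.byteLength > 0) {
        // Only the first non-empty body chunk clears the first-byte timeout;
        // receiving response headers alone is not enough.
        clearTimeout(firstByteTimer);
        firstByteTimer = null;
      }
      // Re-arm the idle watchdog on every read so mid-stream silence aborts the upstream.
      resetIdleTimer();
    }
  } finally {
    if (firstByteTimer) clearTimeout(firstByteTimer);
    if (idleTimer) clearTimeout(idleTimer);
  }
}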

Compatibility / Breaking Changes

None. Only internal resource management and timeout behavior become more robust.

Local verification

  • npm test
  • npm run typecheck
  • npm run build

Greptile Overview

Greptile Summary

This PR fixes a critical production issue where the proxy service would globally deadlock, leaving all requests stuck in "requesting" state indefinitely. The root cause was AgentPool eviction blocking on undici Agent.close(), which waits for in-flight streaming requests to complete—if a stream hung, eviction would block, preventing getAgent() from returning and causing all subsequent requests to hang before even reaching upstream.

Key Changes

AgentPool blocking fix (agent-pool.ts:358-390):

  • Changed closeAgent() to prefer destroy() over close() since destroy() forcefully terminates connections without waiting
  • Made eviction non-blocking by firing cleanup without await, preventing getAgent()/markUnhealthy()/shutdown() from hanging
  • Added defensive null check and detailed logging

Gemini SSE passthrough hardening (response-handler.ts:853-1200):

  • Deferred first-byte timeout clearing until first data chunk arrives (not just headers), preventing "200 + headers but no data" hangs
  • Added per-chunk idle watchdog (streamingIdleTimeoutMs) that resets on each chunk to detect mid-stream silence
  • Implemented 10MB tail-window buffering to prevent DoS/OOM from unbounded response accumulation
  • Improved abort reason attribution (CLIENT_ABORTED vs STREAM_IDLE_TIMEOUT vs STREAM_RESPONSE_TIMEOUT)
  • Enhanced finally block cleanup with defensive error handling and logging

Test coverage:

  • New regression tests for non-blocking eviction behavior (agent-pool.test.ts:238-294, 505-531)
  • New test file covering headers-only hang, delayed chunks, and idle timeout scenarios (response-handler-gemini-stream-passthrough-timeouts.test.ts)
  • Fixed test isolation issue in circuit breaker tests to prevent async alert task interference

Issues Found

The core fix is sound and addresses the deadlock root cause, but there are concerns:

  1. Test mocking ineffective: The destroy() preference test at agent-pool.test.ts:505-531 attempts to mock methods on real undici Agent instances, which likely won't work as intended—the regression may not be properly validated
  2. Zero-byte chunk bypass: At response-handler.ts:1016-1029, streams yielding only zero-length chunks won't arm the idle watchdog, allowing silent hangs
  3. Metadata loss risk: The 10MB tail-window buffering could discard early usage/cost metadata in large responses, breaking stats extraction
  4. Resource leak potential: Fire-and-forget cleanup in closeAgent() logs errors but doesn't track pending promises for graceful shutdown

The approach is well-reasoned with detailed comments explaining the trade-offs. The fix should prevent the global deadlock, though the test coverage may give false confidence and edge cases around zero-byte chunks and large responses need consideration.

Confidence Score: 3/5

  • Safe to merge with moderate risk—core fix is sound but test coverage may be ineffective and edge cases exist
  • The root cause analysis is excellent and the non-blocking eviction fix directly addresses the global deadlock. However: (1) the regression test at line 505 likely doesn't actually exercise the fix due to mocking real undici instances, (2) zero-byte chunks can bypass the idle watchdog, (3) tail-window buffering could break stats for large responses with front-loaded metadata, and (4) fire-and-forget cleanup may leak resources on error. The fix will likely resolve the production issue, but the test confidence is lower than ideal and edge cases need monitoring.
  • Pay close attention to src/app/v1/_lib/proxy/response-handler.ts (idle timer logic with zero-byte chunks) and tests/unit/lib/proxy-agent/agent-pool.test.ts (mock setup may not validate the fix)

Important Files Changed

Filename Overview
src/lib/proxy-agent/agent-pool.ts Root fix for global blocking: changed closeAgent() to prefer destroy() over close() and made eviction non-blocking by not awaiting cleanup to prevent pool-wide request hangs
src/app/v1/_lib/proxy/response-handler.ts Gemini SSE passthrough hardened with deferred first-byte timeout clearing, idle watchdog on each chunk, 10MB buffer cap to prevent OOM, and improved abort reason tracking
tests/unit/lib/proxy-agent/agent-pool.test.ts Added regression tests for non-blocking eviction and destroy() preference, but mock setup may not actually exercise undici Agent methods

Sequence Diagram

sequenceDiagram
    participant Client
    participant ProxyForwarder
    participant AgentPool
    participant Agent as undici Agent
    participant Upstream as Gemini API
    participant ResponseHandler
    participant StatsTask as Background Stats Task

    Client->>ProxyForwarder: Request with streaming
    ProxyForwarder->>AgentPool: getAgent(params)
    
    alt Agent exists & healthy
        AgentPool-->>ProxyForwarder: Return cached agent
    else Agent unhealthy/expired
        AgentPool->>Agent: destroy() (fire-and-forget)
        Note over AgentPool,Agent: Non-blocking eviction<br/>prevents pool-wide hang
        AgentPool->>Agent: Create new agent
        AgentPool-->>ProxyForwarder: Return new agent
    end
    
    ProxyForwarder->>Upstream: Forward request via agent
    Upstream-->>ProxyForwarder: 200 + SSE headers
    ProxyForwarder->>ResponseHandler: dispatch(session, response)
    
    ResponseHandler->>Client: Stream headers (passthrough)
    ResponseHandler->>StatsTask: Clone response & start background stats
    
    Note over StatsTask: First-byte timeout NOT cleared<br/>until first chunk arrives
    
    Upstream-->>ResponseHandler: First data chunk
    ResponseHandler->>Client: Forward chunk
    StatsTask->>StatsTask: recordTtfb()<br/>clearResponseTimeout()<br/>startIdleTimer()
    
    loop Streaming
        Upstream-->>ResponseHandler: Data chunk
        ResponseHandler->>Client: Forward chunk
        StatsTask->>StatsTask: pushChunk()<br/>reset idle watchdog
        
        alt Buffer exceeds 10MB
            StatsTask->>StatsTask: Evict old chunks<br/>(tail-window only)
        end
    end
    
    alt Stream completes normally
        Upstream-->>ResponseHandler: Stream end
        ResponseHandler->>Client: End stream
        StatsTask->>StatsTask: finalize stats & persist
    else Idle timeout
        StatsTask->>Upstream: abort() connection
        Note over StatsTask: Idle watchdog triggered<br/>no data for N ms
        StatsTask->>StatsTask: finalize with STREAM_IDLE_TIMEOUT
    else Client disconnect
        Client->>ResponseHandler: Abort
        StatsTask->>StatsTask: finalize with CLIENT_ABORTED
    end

tesgth032 and others added 2 commits February 10, 2026 22:34
- Gemini SSE passthrough no longer clears the first-byte timeout when only headers have been received
- Clear the first-byte timeout once the first data chunk arrives, and support streamingIdleTimeoutMs idle-timeout aborts
- When the stats task fails, still persist/finish tracking as far as possible so requests do not stay in requesting indefinitely
- Add regression tests covering no first chunk, delay after the first chunk, and mid-stream silence
@gemini-code-assist

Summary of Changes

Hello @tesgth032, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where Gemini streaming passthrough requests could hang for a long time under certain conditions. By adjusting when the first-byte timeout is cleared, introducing a streaming idle timeout, and improving the exception-handling flow, it makes streaming connections more robust, prevents client hangs and resource usage caused by abnormal upstream response behavior, and adds comprehensive regression tests to validate these improvements.

Highlights

  • First-byte timeout handling: Gemini stream passthrough no longer clears the first-byte timeout when only response headers have been received; clearing is deferred until the first data chunk is read, so an upstream that never sends data cannot leave the connection hanging.
  • New streaming idle timeout: a streamingIdleTimeoutMs watchdog proactively aborts the connection when the upstream goes silent for longer than the configured window after the first chunk, preventing stuck requests.
  • Stronger error handling and tracking: the passthrough stats task now settles and records as much as possible even on failure or abort, avoiding orphaned records stuck in the "requesting" state.
  • Regression tests: adds unit tests for first-byte timeout, delayed subsequent chunks, and mid-stream silence to validate the fix.
Changelog
  • src/app/v1/_lib/proxy/response-handler.ts
    • Adjusted the first-byte timeout clearing in Gemini stream passthrough so it triggers when the first data chunk is received.
    • Introduced the streamingIdleTimeoutMs idle timeout mechanism to detect and abort streams that go silent mid-stream.
    • Strengthened error handling in the streaming stats task so that recording and cleanup still happen on failure.
    • Added cleanup logic in the finally block, including clearing the idle timer and response timeout and releasing the reader lock.
    • Added an abortReason argument to the finalizeRequestStats call.
  • tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts
    • Added a new unit test file with three test cases covering first-byte timeout, delayed subsequent chunks, and the idle timeout mechanism.
    • Mocked the behavior of ProxyForwarder, ProxyResponseHandler, and ProxySession for the test environment.
Activity
  • There is no comment or review activity on this pull request yet.


coderabbitai bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.


Walkthrough

Introduces streaming-handling enhancements on the Gemini proxy response path: a 10MB tail-window in-memory buffer, first-byte and idle timeout watchdogs, per-chunk decoding/buffering/compaction and usage billing, abortReason propagation, and robust finalization/fallback; adds unit tests for stream timeouts; the agent pool now prefers destroy() to avoid close() hanging.

Changes

Cohort / File(s) Summary
Gemini stream/non-stream response handling
src/app/v1/_lib/proxy/response-handler.ts
Reworks passthrough stream handling: 10MB tail-window in-memory buffer, deferred first-byte timeout clearing, idle timeout watchdog, per-chunk decoding/buffering/pointer management with periodic compaction, per-chunk usage extraction with pricing/Redis billing, Codex prompt_cache_key extraction/session binding, abortReason propagation with robust finalization/fallback persistence, and comprehensive cleanup and branch cancellation in finally.
Stream timeout unit tests
tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts
Adds SSE-style tests for Gemini stream passthrough: first-byte timeout with no data, first-byte timeout clearing once the first chunk arrives, and silent idle timeout; includes dependency mocks, an SSE test server, and a read-with-timeout helper.
Agent pool implementation and tests
src/lib/proxy-agent/agent-pool.ts, tests/unit/lib/proxy-agent/agent-pool.test.ts
Adds a null check in AgentPoolImpl.closeAgent and prefers destroy() (when available) over awaiting close() to avoid blocking; the corresponding tests verify that destroy() is still called and nothing hangs when close() never returns.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately reflects the main fix in this PR: resolving stuck requests caused by AgentPool eviction blocking.
Description check ✅ Passed The PR description explains the root cause (AgentPool eviction blocking causing a global hang), the solution (prefer destroy, add an idle timeout, cap the buffer), and the relevant code locations; it is highly relevant to the changeset.


@github-actions github-actions bot added bug Something isn't working area:Google Gemini labels Feb 10, 2026
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the critical issue of Gemini streaming passthrough requests getting stuck by optimizing first-byte timeout handling and introducing a streaming idle timeout watchdog. The changes correctly defer clearing the first-byte timeout until the first data block is received, and the streamingIdleTimeoutMs mechanism ensures connections terminate correctly if the upstream provider goes silent mid-stream. Error handling in statsPromise is also enhanced to ensure request logging and tracking even in case of streaming failures, preventing "orphaned" records. However, it introduces a significant Denial of Service (DoS) vulnerability due to unbounded memory buffering of the upstream response body in the background statistics task, which could lead to Out-Of-Memory (OOM) errors and crash the proxy service. It is highly recommended to implement response size limits before deploying these changes to production.

Comment on lines 944 to 980
while (true) {
if (session.clientAbortSignal?.aborted) break;

const { done, value } = await reader.read();
if (done) {
streamEndedNormally = true;
const wasResponseControllerAborted =
sessionWithController.responseController?.signal.aborted ?? false;
const clientAborted = session.clientAbortSignal?.aborted ?? false;

// abort -> nodeStreamToWebStreamSafe may swallow the error and close(), so done=true;
// we must combine this with the abort signals to decide whether the stream ended "naturally".
if (wasResponseControllerAborted || clientAborted) {
streamEndedNormally = false;
abortReason = abortReason ?? "STREAM_RESPONSE_TIMEOUT";
} else {
streamEndedNormally = true;
}
break;
}

if (value) {
if (isFirstChunk) {
isFirstChunk = false;
session.recordTtfb();
clearResponseTimeoutOnce(value.length);
}
chunks.push(decoder.decode(value, { stream: true }));

// Only start the idle timer after the first data chunk arrives (to avoid overlapping with the first-byte timeout)
if (!isFirstChunk) {
startIdleTimer();
}
}
}

const flushed = decoder.decode();
if (flushed) chunks.push(flushed);
const allContent = chunks.join("");
clearIdleTimer();
const allContent = flushAndJoin();

Severity: high (security)

The ProxyResponseHandler implementation for Gemini stream passthrough buffers the entire upstream response body in memory. This occurs in the background statsPromise task to facilitate usage tracking (token counting) and "fake 200" error detection.

Specifically, the code collects all received chunks into a chunks array (Line 970) and then joins them into a single large string allContent (Line 980 and 1045). There is no enforced limit on the size of the response being buffered.

An attacker or a malicious/compromised upstream provider can send an extremely large response (e.g., several gigabytes). This will cause the Node.js process to exhaust its heap memory and crash with an Out-Of-Memory (OOM) error. Since this is a gateway/proxy service, a crash of the process results in a denial of service for all concurrent requests and users.

Furthermore, the buffered allContent is passed to JSON.parse (via finalizeRequestStats -> parseUsageFromResponseText), which can block the Node.js event loop for a significant duration if the input is large, further contributing to DoS. It is also stored in Redis (Line 985) without size validation, which could lead to Redis memory exhaustion.

Recommendation:
Implement a strict limit on the maximum response size that the proxy is willing to buffer for statistics and logging (see the sketch after this list).

  1. Track the total size of chunks received in the while loop.
  2. If the total size exceeds a safe threshold (e.g., 10MB or 50MB), abort the stream, log a warning, and skip the full buffering/parsing logic.
  3. Ensure that the client still receives the data (if possible) or is disconnected gracefully, but the proxy must protect its own memory resources.
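
A minimal sketch of the bounded tail-window buffering this recommendation describes (constant value and class name are illustrative, not the merged implementation):

const MAX_STATS_BUFFER_BYTES = 10 * 1024 * 1024; // 10MB cap, adjust as needed

class TailWindowBuffer {
  private chunks: string[] = [];
  private chunkBytes: number[] = [];
  private totalBytes = 0;
  truncated = false;

  push(text: string, byteLength: number): void {
    this.chunks.push(text);
    this.chunkBytes.push(byteLength);
    this.totalBytes += byteLength;
    // Evict from the head until we are back under the cap, keeping only the tail window.
    while (this.totalBytes > MAX_STATS_BUFFER_BYTES && this.chunks.length > 1) {
      this.chunks.shift();
      this.totalBytes -= this.chunkBytes.shift() ?? 0;
      this.truncated = true;
    }
  }

  // When truncated is true, callers can skip Redis storage and treat stats as best-effort.
  join(): string {
    return this.chunks.join("");
  }
}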

Comment on lines 1080 to 1082
} catch {
// ignore
}

Severity: medium

finally 块中,clearResponseTimeoutOnce()try...catch 语句使用了空的 catch 块。虽然这可以防止清理过程中的崩溃,但它可能会隐藏潜在的问题,如果这些清理操作持续失败,将难以调试。建议在此 catch 块中添加警告日志,以记录清理错误,这对于调试非常有价值。

            } catch (e) {
              logger.warn("[ResponseHandler] Gemini passthrough: Error during clearResponseTimeoutOnce cleanup", {
                taskId,
                error: e instanceof Error ? e.message : String(e),
              });
            }

Comment on lines 1086 to 1088
} catch {
// ignore
}

Severity: medium

finally 块中,reader?.cancel()try...catch 语句使用了空的 catch 块。虽然这可以防止清理过程中的崩溃,但它可能会隐藏潜在的问题,如果这些清理操作持续失败,将难以调试。建议在此 catch 块中添加警告日志,以记录清理错误,这对于调试非常有价值。

            } catch (e) {
              logger.warn("[ResponseHandler] Gemini passthrough: Error during reader.cancel cleanup", {
                taskId,
                error: e instanceof Error ? e.message : String(e),
              });
            }

Comment on lines 1091 to 1093
} catch {
// ignore
}

Severity: medium

finally 块中,reader?.releaseLock()try...catch 语句使用了空的 catch 块。虽然这可以防止清理过程中的崩溃,但它可能会隐藏潜在的问题,如果这些清理操作持续失败,将难以调试。建议在此 catch 块中添加警告日志,以记录清理错误,这对于调试非常有价值。

            } catch (e) {
              logger.warn("[ResponseHandler] Gemini passthrough: Error during reader.releaseLock cleanup", {
                taskId,
                error: e instanceof Error ? e.message : String(e),
              });
            }

@github-actions github-actions bot added the size/M Medium PR (< 500 lines) label Feb 10, 2026
@github-actions github-actions bot left a comment

Code Review Summary

No significant issues identified in this PR. The changes correctly fix a real hang/stuck bug in the Gemini stream passthrough path by deferring the first-byte timeout clearing from the response-headers stage to the first actual body chunk, and by adding an idle timeout watchdog for the passthrough path (consistent with the non-passthrough path). The error handling in the catch block is substantially improved with proper finalization and fallback to persistRequestFailure, preventing orphan "requesting" records. The test file provides good behavioral coverage of the three core scenarios.

PR Size: M

  • Lines changed: 631 (601 additions, 30 deletions)
  • Files changed: 2

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean
  • Error handling - Clean
  • Type safety - Clean
  • Documentation accuracy - Clean
  • Test coverage - Adequate (3 behavioral tests covering first-byte timeout, timeout clearing after first chunk, and idle timeout)
  • Code clarity - Good

Automated review by Claude AI

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 4 comments


Comment on lines 955 to 958
if (wasResponseControllerAborted || clientAborted) {
streamEndedNormally = false;
abortReason = abortReason ?? "STREAM_RESPONSE_TIMEOUT";
} else {

Client abort misclassified

In the passthrough stats reader, when done === true and clientAborted is true, the code sets abortReason = abortReason ?? "STREAM_RESPONSE_TIMEOUT" (see the if (wasResponseControllerAborted || clientAborted) branch). This will record client-driven disconnects as a timeout, which can skew stats/circuit attribution and contradicts the later catch block that maps client aborts to CLIENT_ABORTED. Consider setting abortReason based on which signal aborted (e.g. client -> CLIENT_ABORTED, controller -> STREAM_RESPONSE_TIMEOUT).

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 955-958
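
A small sketch of the attribution the comment suggests (names mirror the snippet above and are illustrative, not the merged code):

function resolveAbortReason(
  clientAborted: boolean,
  responseControllerAborted: boolean,
  current?: string
): string | undefined {
  if (!clientAborted && !responseControllerAborted) return current;
  // Attribute the abort to whichever signal actually fired.
  return current ?? (clientAborted ? "CLIENT_ABORTED" : "STREAM_RESPONSE_TIMEOUT");
}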

Comment on lines 966 to 975
isFirstChunk = false;
session.recordTtfb();
clearResponseTimeoutOnce(value.length);
}
chunks.push(decoder.decode(value, { stream: true }));

// Only start the idle timer after the first data chunk arrives (to avoid overlapping with the first-byte timeout)
if (!isFirstChunk) {
startIdleTimer();
}

Idle timer never resets

startIdleTimer() is only invoked once (on the first received chunk) because it’s guarded by if (!isFirstChunk) inside the if (value) block, but isFirstChunk is set to false immediately before. That means subsequent chunks won’t restart the watchdog, so streamingIdleTimeoutMs won’t enforce “no data for N ms” during an ongoing stream. The idle timer should be restarted on every chunk after the first (or at least on every chunk) to correctly detect mid-stream silence.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 966-975

Comment on lines 1012 to 1017
} catch (error) {
if (!isClientAbortError(error as Error)) {
logger.error("[ResponseHandler] Gemini passthrough stats task failed:", error);
const err = error as Error;
const clientAborted = session.clientAbortSignal?.aborted ?? false;
const isResponseControllerAborted =
sessionWithController.responseController?.signal.aborted ?? false;
const isIdleTimeout = err.message?.includes("streaming_idle");

Catch may throw ReferenceError

In the passthrough statsPromise catch block, isIdleTimeout is computed via err.message?.includes("streaming_idle"). If err.message is undefined (allowed by the Error type), this expression evaluates to undefined, and later isIdleTimeout ? ... will throw (undefined is not a boolean). Use a boolean coercion (e.g. !!err.message?.includes(...)) or default to false to keep the catch block from crashing while handling an error.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1012-1017
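
A one-line sketch of the suggested coercion, wrapped in a hypothetical helper:

function isIdleTimeoutError(err: Error): boolean {
  // Always produce a boolean, even when err.message is undefined.
  return err.message?.includes("streaming_idle") ?? false;
}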

Comment on lines 240 to 288
test("不应在仅收到 headers 时清除首字节超时:无首块数据时应在窗口内中断避免悬挂", async () => {
asyncTasks.length = 0;
const { baseUrl, close } = await startSseServer((_req, res) => {
res.writeHead(200, {
"content-type": "text/event-stream",
"cache-control": "no-cache",
connection: "keep-alive",
});
res.flushHeaders();
// Send no body at all and keep the connection open
});

const clientAbortController = new AbortController();
try {
const provider = createProvider({
url: baseUrl,
firstByteTimeoutStreamingMs: 200,
});
const session = createSession({
clientAbortSignal: clientAbortController.signal,
messageId: 1,
userId: 1,
});
session.setProvider(provider);

const doForward = (
ProxyForwarder as unknown as {
doForward: (this: typeof ProxyForwarder, ...args: unknown[]) => unknown;
}
).doForward;

const upstreamResponse = (await doForward.call(
ProxyForwarder,
session,
provider,
baseUrl
)) as Response;

const clientResponse = await ProxyResponseHandler.dispatch(session, upstreamResponse);
const reader = clientResponse.body?.getReader();
expect(reader).toBeTruthy();
if (!reader) throw new Error("Missing body reader");

const firstRead = await readWithTimeout(reader, 1500);
if (!firstRead.ok) {
clientAbortController.abort(new Error("test_timeout"));
throw new Error("首字节超时未生效:读首块数据在 1.5s 内仍未返回(可能仍会卡死)");
}
} finally {

Test doesn’t assert timeout

In the “no first chunk” case, the test only checks that reader.read() returns within 1.5s, but it doesn’t assert why it returned (e.g., done === true due to abort vs a chunk arriving). As written, the test could pass even if the proxy returns an immediate empty body (or any early completion) unrelated to firstByteTimeoutStreamingMs. To make this a regression test for “headers-only then hang”, assert that the read completes with done === true (or that the session/controller was aborted) and ideally that it happens after ~firstByteTimeoutStreamingMs rather than just “< 1.5s”.

Path: tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts, lines 240-288
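
A sketch of the stricter assertions the comment asks for, assuming the test's reader variable and the 200ms firstByteTimeoutStreamingMs configured above (fragment only, not the merged test):

import { expect } from "vitest";

// Fragment: `reader` is the client response body reader created earlier in the test.
const startedAt = Date.now();
const result = await reader.read();
const elapsedMs = Date.now() - startedAt;

// The read should complete because the proxy aborted the hung upstream,
// not because a chunk arrived or the body was empty from the start.
expect(result.done).toBe(true);
// And it should fire roughly after firstByteTimeoutStreamingMs, not immediately.
expect(elapsedMs).toBeGreaterThanOrEqual(150);
expect(elapsedMs).toBeLessThan(1500);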

- Prefer destroy during AgentPool eviction so close waiting on in-flight requests cannot stall getAgent/cleanup
- Add a memory cap to the Gemini SSE passthrough stats reader and improve abort-reason attribution and cleanup logging
- Add/strengthen regression tests
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 1149-1152: The call to reader?.cancel() currently uses void
reader?.cancel() which discards the returned Promise and can cause unhandled
rejection warnings; change this so the Promise rejection is handled—either await
reader?.cancel() inside an async try/catch or append .catch(...) to the Promise
(e.g., reader?.cancel().catch(err => { /* handle or log via responseHandler
logger */ })); update the code paths around reader?.cancel() in
response-handler.ts to ensure any error is caught and logged/ignored
appropriately.
- Around line 856-860: Remove all emoji characters from the modified comment
blocks in response-handler.ts: delete the "⚠️" and "⭐" emoji found in the
comment that begins "注意:不要在“仅收到响应头”时清除首字节超时。" and the other nearby explanatory
comments so they contain plain text only; ensure the revised comments still
convey the same guidance and comply with the rule forbidding emoji in code,
comments, or string literals.

In `@src/lib/proxy-agent/agent-pool.ts`:
- Around line 358-370: The comment inside closeAgent contains an emoji (⚠️)
which violates the no-emoji rule; remove the emoji and rephrase the warning as
plain text (e.g., "Warning:" or "Note:") while preserving the existing
explanation about preferring destroy() over close() and the rationale about
in-flight requests; update the comment near the checks for agent.destroy and
agent.close accordingly and run the linter to ensure the file no longer contains
emoji in comments or string literals.
🧹 Nitpick comments (2)
tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts (2)

52-71: Some dependencies are not mocked; DB calls in the background stats task may throw.

Neither @/repository/system-config (getSystemSettings) nor @/repository/model-price (findLatestPriceByModel) is mocked, and both are called on the finalizeRequestStats and updateRequestCostFromUsage paths. These exceptions are caught by the try-catch inside the response handler, so they will not fail the tests (which focus on timeout behavior rather than stats accuracy), but they may produce spurious error logs in the test output.

To keep the test output cleaner, consider adding these mocks:

vi.mock("@/repository/system-config", () => ({
  getSystemSettings: vi.fn(async () => ({ billingModelSource: "redirected" })),
}));

vi.mock("@/repository/model-price", () => ({
  findLatestPriceByModel: vi.fn(async () => null),
}));

Not blocking the merge; this can be added later.


270-281: ProxyForwarder.doForward is accessed via an unsafe cast; consider verifying that the method signature stays stable.

doForward is accessed through as unknown as { doForward: ... }, bypassing access control. If doForward's signature changes in a later refactor, TypeScript will not catch the mismatch here at compile time.

This is a common practice in tests and does not block the merge, but please update this test in step with any signature change.

@tesgth032 tesgth032 changed the title from "fix(proxy): fix stuck Gemini streaming passthrough requests" to "fix(proxy): fix stuck requests (AgentPool eviction blocking)" Feb 10, 2026
@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment


Comment on lines +246 to +249
let timeoutId: ReturnType<typeof setTimeout> | null = null;
try {
return await Promise.race([
promise,

Fake timers left disabled

This test calls vi.useRealTimers() but never restores fake timers afterwards, so later tests in this file that rely on vi.advanceTimersByTime(...) will stop working (or behave nondeterministically). Wrap the real-timer section with a try/finally and call vi.useFakeTimers() in the finally (or explicitly restore the previous timer mode) before the test returns.

Path: tests/unit/lib/proxy-agent/agent-pool.test.ts, lines 246-249
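
One way to implement the suggestion, as a sketch (helper name is illustrative):

import { vi } from "vitest";

// Keep the real-timer section scoped so later tests that rely on
// vi.advanceTimersByTime(...) still run under fake timers.
async function withRealTimers<T>(fn: () => Promise<T>): Promise<T> {
  vi.useRealTimers();
  try {
    return await fn();
  } finally {
    vi.useFakeTimers(); // restore the mode the rest of the suite expects
  }
}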

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 4 comments


Comment on lines 1024 to 1026
// Only start the idle timer after the first data chunk arrives (to avoid overlapping with the first-byte timeout)
if (!isFirstChunk) {
startIdleTimer();

Idle timeout never triggers

startIdleTimer() is only called when value is truthy. If the stream yields { done: false, value: undefined } (allowed by the Streams API), the watchdog won’t be (re)armed and streamingIdleTimeoutMs can fail to abort a stuck connection. Consider starting/resetting the idle timer on every successful reader.read() iteration where done === false (or at least when done===false regardless of value) so “mid-stream silence” can’t bypass the timer.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1024-1026

Comment on lines 266 to 273
const result1 = await realPool.getAgent(params);
const agent1 = result1.agent as unknown as {
close?: { mockImplementation: (fn: () => Promise<void>) => void };
destroy?: unknown;
};

// Simulate: close may never return while waiting for in-flight requests to finish
agent1.close?.mockImplementation(() => new Promise<void>(() => {}));

Mocking close likely ineffective

These tests assume result1.agent.close is a Vitest mock (mockImplementation), but AgentPoolImpl.createAgent() constructs a real undici Agent, whose .close() won’t have mockImplementation. As a result, the “close() never resolves” simulation may be a no-op and the test can pass without actually exercising the regression. To make this deterministic, inject a custom Dispatcher into the pool (or mock the undici Agent constructor) so you can provide a close()/destroy() stub that behaves as intended.

Path: tests/unit/lib/proxy-agent/agent-pool.test.ts, lines 266-273
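
A sketch of the suggested setup (illustrative, not the test file's actual mocks):

import { vi } from "vitest";

// Mock the undici Agent constructor so close()/destroy() are controllable vi.fn()s.
const agentMocks = vi.hoisted(() => ({
  close: vi.fn(() => new Promise<void>(() => {})), // simulates a close() that never resolves
  destroy: vi.fn(async () => {}),
}));

vi.mock("undici", async (importOriginal) => {
  const actual = await importOriginal<typeof import("undici")>();
  return {
    ...actual,
    Agent: vi.fn(() => ({ close: agentMocks.close, destroy: agentMocks.destroy })),
  };
});

// In the test body, assert that agentMocks.destroy was called and agentMocks.close was not.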

Comment on lines +17 to +23
vi.mock("@/lib/config", async (importOriginal) => {
const actual = await importOriginal<typeof import("@/lib/config")>();
return {
...actual,
isHttp2Enabled: mocks.isHttp2Enabled,
};
});

Mock isHttp2Enabled not reset

mocks.isHttp2Enabled is hoisted and shared across tests, but none of the tests reset its call history/implementation. If other tests in this file (or later suites) change its behavior, assertions can become order-dependent. Add a beforeEach(() => mocks.isHttp2Enabled.mockReset()) (or vi.clearAllMocks()) to keep tests isolated.

Path: tests/unit/proxy/response-handler-gemini-stream-passthrough-timeouts.test.ts, lines 17-23
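
A sketch of the suggested reset (fragment; `mocks` is the test file's hoisted mock object, and the default value shown is illustrative):

import { beforeEach } from "vitest";

beforeEach(() => {
  mocks.isHttp2Enabled.mockReset();
  mocks.isHttp2Enabled.mockReturnValue(false); // re-apply whatever default the suite expects
});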


greptile-apps bot commented Feb 10, 2026

Additional Comments (1)

src/lib/proxy-agent/agent-pool.ts
Shutdown can still hang

shutdown() awaits Promise.all(closePromises) (now calling destroy() first), but Dispatcher.destroy() isn’t guaranteed to resolve quickly in all implementations. Since the PR goal is “eviction/cleanup/shutdown never blocks new requests”, consider making shutdown/eviction non-blocking (fire-and-forget close/destroy, or add a bounded timeout) so a misbehaving dispatcher can’t stall shutdown/cleanup the same way close() did.

Path: src/lib/proxy-agent/agent-pool.ts, lines 332-338
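
A sketch of the bounded-timeout variant the comment proposes (timeout value and function name are illustrative):

import type { Dispatcher } from "undici";

// Never let a misbehaving dispatcher stall shutdown/cleanup for longer than a fixed budget.
async function destroyWithTimeout(agent: Dispatcher, timeoutMs = 5_000): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const budget = new Promise<void>((resolve) => {
    timer = setTimeout(resolve, timeoutMs); // give up waiting, but let the destroy keep running
  });
  try {
    await Promise.race([agent.destroy(), budget]);
  } finally {
    if (timer) clearTimeout(timer);
  }
}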

- closeAgent no longer awaits destroy/close after triggering them, so the eviction path cannot get stuck
- Gemini passthrough stats: only clear the first-byte timeout after a non-empty chunk is received
- Regression test: cover the case where close never returns and destroy is unavailable
- Passthrough stats: also reset the idle timer when done=false (so an unusual value cannot disable the watchdog)
- Regression test: reset the hoisted isHttp2Enabled mock to avoid cross-test pollution
@greptile-apps greptile-apps bot left a comment

4 files reviewed, 3 comments


Comment on lines 505 to 522
it("should prefer destroy over close to avoid hanging on in-flight streaming requests", async () => {
const result = await pool.getAgent({
endpointUrl: "https://api.anthropic.com/v1/messages",
proxyUrl: null,
enableHttp2: true,
});

const agent = result.agent as unknown as {
close?: () => Promise<void>;
destroy?: () => Promise<void>;
};

// Simulate: close may never return while waiting for in-flight requests to finish
if (typeof agent.close === "function") {
(
agent.close as unknown as { mockImplementation: (fn: () => Promise<void>) => void }
).mockImplementation(() => new Promise<void>(() => {}));
}

Mocking close is ineffective

In should prefer destroy over close..., agent.close comes from the mocked undici.Agent (a vi.fn()), but it’s being cast to { mockImplementation } and invoked that way. This will throw at runtime (mockImplementation is not a function) or silently fail depending on how the cast is compiled, making the test non-deterministic and not actually simulating a never-resolving close(). Use vi.mocked(agent.close).mockImplementation(...) (or agent.close = vi.fn(() => new Promise(() => {}))) and likewise stub destroy to assert it was called.

Path: tests/unit/lib/proxy-agent/agent-pool.test.ts, lines 505-522

Comment on lines 1016 to 1031
if (value) {
if (isFirstChunk) {
isFirstChunk = false;
session.recordTtfb();
const chunkSize = value.byteLength;
if (chunkSize > 0) {
if (isFirstChunk) {
isFirstChunk = false;
session.recordTtfb();
clearResponseTimeoutOnce(chunkSize);
}
pushChunk(decoder.decode(value, { stream: true }), chunkSize);
}

// Only start the idle timer after the first data chunk arrives (to avoid overlapping with the first-byte timeout)
if (!isFirstChunk) {
startIdleTimer();
}
chunks.push(decoder.decode(value, { stream: true }));
}

Idle watchdog can be bypassed

In the passthrough stats reader loop, startIdleTimer() is only called inside if (value) { ... }. If the stream produces { done: false, value: undefined } (allowed by the Streams API) or yields zero-length chunks, the watchdog won’t be (re)armed and streamingIdleTimeoutMs won’t abort a stuck connection mid-stream. To ensure the “no data for N ms” guarantee, arm/reset the idle timer on every successful reader.read() when done === false (regardless of value), and consider treating byteLength === 0 as activity.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1016-1031


greptile-apps bot commented Feb 11, 2026

Additional Comments (1)

src/lib/proxy-agent/agent-pool.ts
Shutdown still waits on evictions

shutdown() builds closePromises and await Promise.all(closePromises), but closeAgent() no longer awaits destroy()/close() (it just attaches .catch). As a result, shutdown() resolves immediately while agents may still be tearing down in the background. If other code assumes shutdown() means “all sockets closed” (e.g., tests or reset logic), this breaks that guarantee and can cause flaky behavior/leaked connections. Either make shutdown() fire-and-forget explicitly (and rename/adjust callers), or return/await the actual destroy()/close() promise(s) in closeAgent() for shutdown-only paths.

Path: src/lib/proxy-agent/agent-pool.ts, lines 326-343

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment



greptile-apps bot commented Feb 11, 2026

Additional Comments (1)

tests/unit/lib/proxy-agent/agent-pool.test.ts
Broken mockImplementation cast

agent.close is a plain function in this test (from the undici mock), but the code casts it to { mockImplementation(...) } and calls that method. At runtime this will throw (mockImplementation is not a function) or silently not do what the test intends, so the test can pass/fail nondeterministically and won’t actually simulate a never-resolving close(). Prefer stubbing the function directly (e.g. agent.close = vi.fn(() => new Promise(() => {}))) or vi.mocked(agent.close).mockImplementation(...) so the hang simulation is real.

Path: tests/unit/lib/proxy-agent/agent-pool.test.ts, lines 497-509

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 1 comment



greptile-apps bot commented Feb 11, 2026

Additional Comments (1)

src/lib/proxy-agent/agent-pool.ts
Shutdown resolves before close

closeAgent() no longer awaits destroy()/close(), but shutdown() still does await Promise.all(closePromises) expecting those promises to represent completion. As a result, shutdown() (and resetGlobalAgentPool()) can resolve while agents are still in the process of closing, which breaks the implied contract for callers/tests and can leave sockets/timers alive past shutdown. Consider awaiting destroy() in shutdown() (or adding a closeAgentFast() for eviction while keeping shutdown() blocking).

Path: src/lib/proxy-agent/agent-pool.ts, lines 392-414

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 4 comments


Comment on lines +375 to +382
if (operation === "destroy") {
agent.destroy().catch((error) => {
logger.warn("AgentPool: Error closing agent", {
key,
operation,
error: error instanceof Error ? error.message : String(error),
});
});

Fire-and-forget destroy() can leak resources if it rejects.

The unhandled rejection is logged but the agent may remain partially open. Consider tracking pending cleanup promises and awaiting them in shutdown() to ensure graceful teardown, or document that eviction is best-effort and resources may leak on error.

Path: src/lib/proxy-agent/agent-pool.ts, lines 375-382
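
A sketch of the tracking approach the comment suggests (class and method names are illustrative):

import type { Dispatcher } from "undici";

// Eviction stays fire-and-forget, but pending teardowns are tracked so shutdown() can await them.
class CleanupTracker {
  private readonly pending = new Set<Promise<void>>();

  // Called from the eviction path; never awaited there.
  track(key: string, agent: Dispatcher): void {
    const teardown = agent.destroy().catch((error) => {
      console.warn("AgentPool: best-effort destroy failed", { key, error });
    });
    this.pending.add(teardown);
    void teardown.finally(() => this.pending.delete(teardown));
  }

  // Called from shutdown(): wait for teardowns already in flight without
  // reintroducing blocking on the eviction path.
  async drain(): Promise<void> {
    await Promise.all([...this.pending]);
  }
}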

Comment on lines +1016 to +1024
const chunkSize = value?.byteLength ?? 0;
if (value && chunkSize > 0) {
if (isFirstChunk) {
isFirstChunk = false;
session.recordTtfb();
clearResponseTimeoutOnce(chunkSize);
}
chunks.push(decoder.decode(value, { stream: true }));
pushChunk(decoder.decode(value, { stream: true }), chunkSize);
}

Zero-byte chunks won't arm idle watchdog.

When value exists but chunkSize === 0, the code skips pushChunk() and won't enter the if (value && chunkSize > 0) block. Subsequent if (!isFirstChunk) will be false (since isFirstChunk is still true), so startIdleTimer() never runs. A stream yielding only zero-length chunks will bypass idle timeout entirely.

Consider moving startIdleTimer() outside the chunkSize > 0 check, or treating zero-byte reads as activity.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1016-1024

Comment on lines +927 to +936
let abortReason: string | undefined;

// Idle watchdog: passthrough must also handle mid-stream stalls (no new data pushed)
const idleTimeoutMs =
provider.streamingIdleTimeoutMs > 0 ? provider.streamingIdleTimeoutMs : Infinity;
let idleTimeoutId: NodeJS.Timeout | null = null;
const clearIdleTimer = () => {
if (idleTimeoutId) {
clearTimeout(idleTimeoutId);
idleTimeoutId = null;

Tail-window chunking can lose data for usage parsing.

When response exceeds MAX_STATS_BUFFER_BYTES, only the trailing 10MB is kept. If usage/cost metadata appears early in the stream (e.g., first SSE event), it will be discarded and stats finalization may fail to extract billing info.

Check if upstream Gemini responses front-load metadata, or document that stats are best-effort for >10MB responses.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 927-936
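
A sketch of one way to keep front-loaded metadata as well as the tail (window sizes and names are illustrative; whether Gemini front-loads usage metadata would still need to be verified):

const HEAD_WINDOW_BYTES = 1 * 1024 * 1024; // always retained (early usage/cost metadata)
const TAIL_WINDOW_BYTES = 9 * 1024 * 1024; // rolling window of the most recent data

class HeadTailBuffer {
  private head = "";
  private headBytes = 0;
  private tail: Array<{ text: string; bytes: number }> = [];
  private tailBytes = 0;

  push(text: string, bytes: number): void {
    if (this.headBytes < HEAD_WINDOW_BYTES) {
      this.head += text;
      this.headBytes += bytes;
      return;
    }
    this.tail.push({ text, bytes });
    this.tailBytes += bytes;
    // Drop the oldest tail chunks once the rolling window exceeds its budget.
    while (this.tailBytes > TAIL_WINDOW_BYTES && this.tail.length > 1) {
      this.tailBytes -= this.tail.shift()?.bytes ?? 0;
    }
  }

  join(): string {
    return this.head + this.tail.map((c) => c.text).join("");
  }
}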

Comment on lines +505 to +531
it("should prefer destroy over close to avoid hanging on in-flight streaming requests", async () => {
const result = await pool.getAgent({
endpointUrl: "https://api.anthropic.com/v1/messages",
proxyUrl: null,
enableHttp2: true,
});

const agent = result.agent as unknown as {
close?: () => Promise<void>;
destroy?: () => Promise<void>;
};

// Simulate: close may never return while waiting for in-flight requests to finish
if (typeof agent.close === "function") {
vi.mocked(agent.close).mockImplementation(() => new Promise<void>(() => {}));
}

await pool.shutdown();

// destroy should be called preferentially (so a hanging close cannot stall shutdown/evict)
if (typeof agent.destroy === "function") {
expect(agent.destroy).toHaveBeenCalled();
}
if (typeof agent.close === "function") {
expect(agent.close).not.toHaveBeenCalled();
}
});

Test likely passes without exercising the fix.

AgentPoolImpl.createAgent() constructs a real undici Agent, so result.agent.destroy is a native method, not a Vitest mock. vi.mocked(agent.destroy) will fail or be a no-op, meaning the test can't verify destroy() was called over close().

Mock the undici Agent constructor (e.g., via vi.mock("undici")) or inject a custom dispatcher to make this regression test deterministic.

Path: tests/unit/lib/proxy-agent/agent-pool.test.ts, lines 505-531

@ding113 ding113 merged commit f6bb0f0 into ding113:dev Feb 11, 2026
9 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Feb 11, 2026

Labels

area:Google Gemini bug Something isn't working size/M Medium PR (< 500 lines)

Projects

Status: Done

Development


2 participants