
fix(proxy): Detect fake 200 errors after SSE ends #735

Merged
ding113 merged 13 commits into ding113:dev from
tesgth032:fix/pr2-20260208
Feb 9, 2026

Conversation


@tesgth032 tesgth032 commented Feb 8, 2026

Background

Some upstreams return HTTP 200 for auth/quota/rate-limit failures while the response body is actually an error JSON (a "fake 200"). Once SSE pass-through has started, CCH can no longer change the HTTP status code of that response back to 4xx/5xx, which means:

  • Internal statistics and circuit breaking do not count the request as a failure, so failover and provider circuit breaking struggle to take effect
  • Smart session binding may still stick the session to a provider that is effectively unavailable
  • The client's next automatic retry is more likely to reuse the same bad provider (retries become ineffective)

Approach

Introduce "deferred settlement" for SSE:

  • On receiving the response headers: start passing the SSE through to the client as soon as possible (to keep first-byte latency low), but do not immediately settle the request as success or update session binding
  • After the SSE ends naturally: run a lightweight structural check over the complete response body, based only on the error / message fields (model output such as content/choices is not scanned, to reduce false positives)
    • Empty response body => treated as an error
    • Non-empty JSON error field => treated as an error
    • JSON text < 1000 chars and message matches /error/i => treated as an error
  • On a hit: record the failure internally with effectiveStatusCode=502 against provider/endpoint failure and request statistics, and skip the session binding update (to avoid sticking to a bad provider)
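
The checks above can be sketched as a small TypeScript function. This is an illustrative reconstruction of the heuristic described in this PR, not the actual detectUpstreamErrorFromSseOrJsonText implementation; the function name and thresholds follow the description.

```typescript
// Illustrative reconstruction of the fake-200 heuristic described above.
// Not the actual detectUpstreamErrorFromSseOrJsonText implementation.
interface DetectionResult {
  isError: boolean;
  reason?: string;
}

const MAX_JSON_CHARS_FOR_MESSAGE_CHECK = 1000; // threshold from the PR description
const MESSAGE_KEYWORD = /error/i;

function detectFake200(body: string): DetectionResult {
  const trimmed = body.trim();

  // Rule 1: an empty body behind a 200 is treated as an error.
  if (trimmed.length === 0) {
    return { isError: true, reason: "empty body" };
  }

  if (!trimmed.startsWith("{")) {
    return { isError: false }; // non-JSON text is left to other layers
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return { isError: false }; // unparseable text is not treated as an error
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { isError: false };
  }
  const obj = parsed as Record<string, unknown>;

  // Rule 2: a non-empty `error` field marks the 200 as fake.
  if (obj.error !== undefined && obj.error !== null && obj.error !== "") {
    return { isError: true, reason: "non-empty error field" };
  }

  // Rule 3: only for short payloads, a `message` matching /error/i is suspicious.
  if (trimmed.length < MAX_JSON_CHARS_FOR_MESSAGE_CHECK && typeof obj.message === "string") {
    if (MESSAGE_KEYWORD.test(obj.message)) {
      return { isError: true, reason: "message matched keyword" };
    }
  }

  return { isError: false };
}
```

Model output fields such as content or choices are never inspected, which is what keeps the false-positive rate of the heuristic low.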

Changes

  • src/app/v1/_lib/proxy/forwarder.ts: when SSE is detected, write one-shot DeferredStreamingFinalization metadata
  • src/app/v1/_lib/proxy/response-handler.ts: after the stream ends, complete the success/failure settlement (circuit breaker/endpoint/binding/statistics)
  • src/lib/utils/upstream-error-detection.ts: new fake-200 detection utility
  • New environment variable: STORE_SESSION_RESPONSE_BODY (default true)
    • When set to false, the session response body is not written to Redis (no persistence)
    • The response body is still kept in memory for the deferred-settlement error check and statistics

Tests

  • npm test
  • npm run typecheck

Limitations

  • This change cannot prevent the error content from being passed through to the client (there is no going back once SSE has started); the main goal is to let CCH itself recognize the "fake 200" as a failure, so that subsequent retries/failover can switch to other providers more easily.

Greptile Overview

Greptile Summary

This PR implements deferred finalization for SSE responses to detect "fake 200" errors (HTTP 200 with error JSON body). The approach is architecturally sound:

Core Changes:

  • Forwarder sets deferred finalization metadata when SSE is detected, allowing immediate client streaming while deferring success/failure recording
  • ResponseHandler performs post-stream error detection using heuristics (error field, message keyword matching), then updates circuit breakers, session binding, and provider chains accordingly
  • Error Detection uses conservative heuristics to avoid false positives (only checks structured fields, not model-generated content)
  • Security includes comprehensive redaction for tokens, keys, JWT, emails, and sensitive paths

Implementation Quality:

  • Well-documented with clear comments explaining the problem and solution
  • Good test coverage (13 tests covering edge cases and security)
  • Proper separation of concerns using WeakMap for metadata
  • Handles timeout/abort edge cases correctly

Remaining Concerns:

  • Quick filter in upstream-error-detection.ts:207 may have false positives when "error" or "message" appear in quoted string content
  • JSON.stringify performance optimization could skip unnecessary work when error is already detected (line 233)
  • Provider attribution logic has fallback handling but is complex and could benefit from simplification

Confidence Score: 4/5

  • This PR introduces important fake-200 detection logic with generally solid implementation, but has some edge cases and potential performance concerns worth addressing
  • The deferred streaming finalization design is architecturally sound and well-tested. Most issues mentioned in previous threads have been addressed. However, the implementation still has some edge cases (quick filter false positives, stringify performance) and complex provider attribution logic that could cause subtle issues. The core functionality works correctly for typical scenarios.
  • Pay close attention to src/lib/utils/upstream-error-detection.ts for the quick filter optimization and src/app/v1/_lib/proxy/response-handler.ts for provider attribution consistency

Important Files Changed

Filename Overview
src/app/v1/_lib/proxy/forwarder.ts Adds deferred streaming finalization metadata when SSE is detected; implementation is clean and well-documented
src/app/v1/_lib/proxy/response-handler.ts Core finalization logic for detecting fake 200 errors after SSE completion; handles circuit breaker updates and session binding correctly, but has complex provider attribution logic
src/app/v1/_lib/proxy/stream-finalization.ts New file implementing WeakMap-based metadata storage for deferred finalization; clean design with single-consumption guarantee
src/lib/utils/upstream-error-detection.ts New utility for detecting fake 200 errors in SSE/JSON responses; includes security redaction and conservative heuristics, but quick filter has edge cases
src/lib/utils/upstream-error-detection.test.ts Comprehensive test coverage (13 tests) for error detection, including security redaction, edge cases, and SSE/JSON scenarios

Sequence Diagram

sequenceDiagram
    participant Client
    participant Forwarder
    participant Upstream
    participant ResponseHandler
    participant ErrorDetection
    participant CircuitBreaker
    participant SessionManager

    Client->>Forwarder: POST /v1/chat/completions
    Forwarder->>Upstream: Forward request
    Upstream-->>Forwarder: HTTP 200 + SSE headers
    
    Note over Forwarder: Detect Content-Type: text/event-stream
    
    Forwarder->>Forwarder: setDeferredStreamingFinalization()<br/>(store providerId, endpointId, etc)
    
    Note over Forwarder: Skip immediate success recording<br/>Return response to start streaming
    
    Forwarder-->>Client: HTTP 200 + Start SSE stream
    Forwarder->>ResponseHandler: Pass response for handling
    
    ResponseHandler->>ResponseHandler: Create tee() streams<br/>(clientStream + internalStream)
    
    ResponseHandler-->>Client: Stream data chunks<br/>(via clientStream)
    
    par Background processing
        ResponseHandler->>ResponseHandler: Read internalStream<br/>Accumulate all content
        Upstream-->>ResponseHandler: data: {"error":"no credentials"}
        Upstream-->>ResponseHandler: [stream ends]
        
        ResponseHandler->>ResponseHandler: consumeDeferredStreamingFinalization()<br/>(get metadata, clear WeakMap)
        
        ResponseHandler->>ErrorDetection: detectUpstreamErrorFromSseOrJsonText(allContent)
        
        Note over ErrorDetection: Check empty body<br/>Check "error" field<br/>Check "message" keyword (if <1000 chars)
        
        ErrorDetection-->>ResponseHandler: {isError: true, code: "FAKE_200_..."}
        
        alt Fake 200 detected
            ResponseHandler->>ResponseHandler: effectiveStatusCode = 502<br/>(internal use only)
            ResponseHandler->>CircuitBreaker: recordFailure(meta.providerId)
            ResponseHandler->>CircuitBreaker: recordEndpointFailure(meta.endpointId)
            ResponseHandler->>ResponseHandler: addProviderToChain(...)<br/>(reason: "retry_failed")
            
            Note over ResponseHandler,SessionManager: Skip session binding update<br/>(avoid sticking to bad provider)
            
        else Stream ended normally, no error
            ResponseHandler->>CircuitBreaker: recordSuccess(meta.providerId)
            ResponseHandler->>CircuitBreaker: recordEndpointSuccess(meta.endpointId)
            ResponseHandler->>SessionManager: updateSessionBindingSmart()
            ResponseHandler->>ResponseHandler: addProviderToChain(...)<br/>(reason: "request_success")
        end
        
        ResponseHandler->>SessionManager: storeSessionResponse(allContent)<br/>(if STORE_SESSION_RESPONSE_BODY=true)
        ResponseHandler->>ResponseHandler: updateMessageRequestDetails()<br/>(with effectiveStatusCode)
    end
    
    Note over Client: Receives complete stream<br/>(may include error content)

@gemini-code-assist

Summary of Changes

Hello @tesgth032, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a key "fake 200" problem in the proxy service: an upstream may return an HTTP 200 status code while the streamed (SSE) response body actually contains an error. By deferring the final success/failure decision for streaming responses until the stream has fully ended, and introducing a new response-body detection mechanism, the system can now accurately identify these hidden errors. This significantly improves the proxy's robustness, ensuring that circuit breakers, failover, and session-binding strategies make decisions based on the real response state, and prevents clients from repeatedly reusing a failed provider because a success was misreported.

Highlights

  • Deferred streaming settlement: the success/failure decision for SSE (Server-Sent Events) responses is moved from header receipt to after the entire stream has ended, to avoid "fake 200" errors.
  • New response-body error detection: after the SSE stream ends, the response body is checked strictly: an empty body is treated as an error; a JSON with a non-empty "error" field is treated as an error; and for JSON under 1000 characters, a "message" (or "error.message") field containing "error" (case-insensitive) is treated as an error.
  • Error handling and circuit breaking: on a hit, the failure is recorded in the circuit breaker, the internal status code is treated as 502, and the error message is recorded, so that failover and retry strategies execute correctly.
  • New utility modules: stream-finalization.ts manages deferred-settlement metadata, and upstream-error-detection.ts centralizes upstream error detection logic.
  • Improved test coverage: comprehensive unit tests were added for the new error detection logic.
Changelog
  • src/app/v1/_lib/proxy/forwarder.ts
    • Adds an import of setDeferredStreamingFinalization.
    • For SSE responses, no longer records success or updates session binding immediately; instead calls setDeferredStreamingFinalization to store the relevant metadata and defers final settlement to the ResponseHandler.
  • src/app/v1/_lib/proxy/response-handler.ts
    • Adds imports of detectUpstreamErrorFromSseOrJsonText and consumeDeferredStreamingFinalization.
    • Introduces the async function finalizeDeferredStreamingFinalizationIfNeeded, responsible for the deferred streaming settlement logic.
    • Within that function, uses detectUpstreamErrorFromSseOrJsonText to detect upstream errors in the SSE or JSON response body.
    • If an error is detected, records a circuit-breaker failure, sets the effective status code to 502, and updates the session's provider chain and error message.
    • If the stream succeeds and no error is detected, records a circuit-breaker success and updates session binding per the smart binding strategy.
    • Modifies ProxyResponseHandler.dispatch to call finalizeDeferredStreamingFinalizationIfNeeded when handling streaming responses.
    • Adds a streamEndedNormally boolean to mark whether the stream ended normally (rather than being aborted by the client).
    • Extends the finalizeRequestStats signature with an optional errorMessage parameter, and uses effectiveStatusCode and errorMessage when updating message request details and session usage.
    • Adjusts the SessionManager.updateSessionUsage and updateMessageRequestDetails calls to support the new error message and effective status code.
  • src/app/v1/_lib/proxy/stream-finalization.ts
    • New file defining the DeferredStreamingFinalization type, used to pass deferred-settlement metadata between the Forwarder and the ResponseHandler.
    • Exports setDeferredStreamingFinalization to set deferred-settlement info on the session.
    • Exports consumeDeferredStreamingFinalization to fetch and clear deferred-settlement info from the session.
  • src/lib/utils/upstream-error-detection.test.ts
    • New file with unit tests for detectUpstreamErrorFromSseOrJsonText.
    • Test cases cover empty bodies, JSON with a non-empty error field, short JSON messages containing the "error" keyword, long JSON messages that must not trigger keyword detection, and SSE data containing error JSON.
  • src/lib/utils/upstream-error-detection.ts
    • New file exporting detectUpstreamErrorFromSseOrJsonText.
    • The function detects upstream errors in a given text (plain JSON or SSE format).
    • Detection rules: empty body; non-empty error field in the JSON; and, for JSON under 1000 characters, a message or error.message field containing the "error" keyword.
    • Includes helpers such as isPlainRecord, hasNonEmptyValue, truncateForReason, and detectFromJsonObject.


coderabbitai bot commented Feb 8, 2026

Walkthrough

Introduces a deferred finalization mechanism for SSE streams: the forwarder saves finalization metadata when SSE starts and returns the raw response early; the response-handler consumes that metadata after the stream ends, detects upstream "fake 200" responses, and decides the effective status/error code, then threads effectiveStatusCode, errorMessage, and providerIdForPersistence through statistics, session, and circuit-breaker updates. Also adds an upstream error detection utility with tests, a session response-body storage switch, and multi-language frontend notices with tests.

Changes

Cohort / File(s) Summary
Streaming finalization infrastructure
src/app/v1/_lib/proxy/stream-finalization.ts, src/app/v1/_lib/proxy/forwarder.ts
Adds DeferredStreamingFinalization, setDeferredStreamingFinalization, and consumeDeferredStreamingFinalization; on the SSE path the forwarder sets deferred finalization (calling setDeferredStreamingFinalization) and returns the raw response early, without deciding success/failure before the stream ends.
Streaming response handling and error propagation
src/app/v1/_lib/proxy/response-handler.ts
Adds the FinalizeDeferredStreamingResult type and the finalizeDeferredStreamingFinalizationIfNeeded function; consumes the deferred finalization and uses detectUpstreamErrorFromSseOrJsonText to judge upstream problems, passing effectiveStatusCode, errorMessage, and providerIdForPersistence through to finalizeRequestStats and the circuit breaker/endpoint/session persistence paths; extends the finalizeRequestStats signature to accept errorMessage and providerIdOverride; introduces the streamEndedNormally and clientAborted flags.
Upstream error detection implementation and tests
src/lib/utils/upstream-error-detection.ts, src/lib/utils/upstream-error-detection.test.ts
Adds detectUpstreamErrorFromSseOrJsonText and the UpstreamErrorDetectionResult type, implementing "fake 200" detection for JSON/SSE with sensitive-data redaction and truncation; adds comprehensive unit tests covering various inputs and masking rules.
Session response-body storage control & config/scripts/tests
src/lib/config/env.schema.ts, .env.example, scripts/deploy.sh, scripts/deploy.ps1, src/lib/session-manager.ts, tests/unit/lib/env-store-session-response-body.test.ts, tests/unit/lib/session-manager-redaction.test.ts
Adds the STORE_SESSION_RESPONSE_BODY environment variable (default true) and registers it in the EnvSchema; SessionManager uses the switch to decide whether to write response bodies to Redis (early-return logic); deployment scripts and unit tests updated to verify the behavior.
Frontend localization and UI notices/tests
messages/*/dashboard.json, src/app/[locale]/dashboard/logs/_components/SummaryTab.tsx, src/app/[locale]/dashboard/logs/_components/provider-chain-popover.tsx, src/app/[locale]/dashboard/logs/_components/.../*.test.tsx
Adds the fake200ForwardedNotice key across locales and conditionally renders a "fake 200 forwarded" notice (InfoIcon) in SummaryTab and ProviderChainPopover; updates translations and several frontend tests to cover the notice in FAKE_200_* cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the core change (detecting fake 200 errors after SSE ends), fully consistent with the main changes in raw_summary and pr_objectives.
Description check ✅ Passed The PR description details the background, approach, changes, tests, and limitations; it is highly relevant to all file changes in raw_summary and covers the core logic of deferred settlement, fake-200 detection, and circuit breaking/session binding.


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/lib/utils/upstream-error-detection.ts (2)

66-68: isPlainRecord duplicates the same-named function in session-messages-guards.ts

src/app/[locale]/dashboard/sessions/[sessionId]/messages/_components/session-messages-guards.ts already has an isPlainRecord that additionally checks Object.prototype.toString.call(value) === "[object Object]", making it stricter than the implementation here (it can exclude Map/Set/Date, etc.).

Because the input here comes from JSON.parse, which only produces plain objects, this is functionally fine. Still, consider extracting a shared utility function to reduce duplication.
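
A stricter shared guard in the spirit the reviewer describes might look like the sketch below; the original session-messages-guards.ts implementation is not reproduced verbatim here:

```typescript
// Stricter plain-record guard: the toString check rules out Map, Set, Date,
// class instances with a custom Symbol.toStringTag, arrays, and null.
// Sketch only; not the exact implementation in session-messages-guards.ts.
function isPlainRecord(value: unknown): value is Record<string, unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    Object.prototype.toString.call(value) === "[object Object]"
  );
}
```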


85-116: the redaction logic is generally sound; one regex could be improved

The generic sensitive key-value regex on line 108 uses [^'"\s]+, which swallows whatever follows an unquoted value (such as a trailing JSON comma). The impact is limited (it is only used for the log detail), but a more precise match is possible:

Optional improvement
-    /\b(password|token|secret|api[_-]?key)\b\s*[:=]\s*['"]?[^'"\s]+['"]?/gi,
+    /\b(password|token|secret|api[_-]?key)\b\s*[:=]\s*(?:"[^"]*"|'[^']*'|[^'"\s,;})\]]+)/gi,
src/app/v1/_lib/proxy/response-handler.ts (3)

101-144: review of the finalizeDeferredStreamingFinalizationIfNeeded core logic

The overall flow is sound:

  • Line 119: fake-200 detection only runs when streamEndedNormally && upstreamStatusCode === 200, avoiding false positives on truncated content.
  • Lines 129-138: the status-code mapping is clear: fake 200 → 502, client abort → 499, other interruptions → 502.
  • Line 143: early return when !meta || !provider, without running circuit-breaker/binding updates; correct design.

The name finalizeDeferredStreamingFinalizationIfNeeded is somewhat verbose (Finalization appears twice); consider simplifying it to finalizeDeferredStreamingIfNeeded.


882-939: the Gemini pass-through path integrates deferred settlement correctly

The new streamEndedNormally flag (lines 888, 895) and clientAborted detection (line 910) settle uniformly through finalizeDeferredStreamingFinalizationIfNeeded and then pass the result to finalizeRequestStats, matching the handling on the main streaming path.

One thing to note: on lines 912-921, storeSessionResponse is called before finalizeDeferredStreamingFinalizationIfNeeded, so even when a fake 200 is detected the response body is still stored in Redis. By design this is reasonable (the body is stored for audit/debugging), but the ordering differs from the main streaming path (lines 1131-1140 call it inside finalizeStream, i.e. after finalizeDeferredStreamingFinalizationIfNeeded).


1213-1232: the payload construction can be simplified

The payload is currently created first and the usageForCost fields are appended conditionally; a single object spread does the same thing:

Optional simplification
-          const payload: SessionUsageUpdate = {
-            status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
-            statusCode: effectiveStatusCode,
-            ...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
-          };
-
-          if (usageForCost) {
-            payload.inputTokens = usageForCost.input_tokens;
-            payload.outputTokens = usageForCost.output_tokens;
-            payload.cacheCreationInputTokens = usageForCost.cache_creation_input_tokens;
-            payload.cacheReadInputTokens = usageForCost.cache_read_input_tokens;
-            payload.costUsd = costUsdStr;
-          }
+          const payload: SessionUsageUpdate = {
+            status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
+            statusCode: effectiveStatusCode,
+            ...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
+            ...(usageForCost
+              ? {
+                  inputTokens: usageForCost.input_tokens,
+                  outputTokens: usageForCost.output_tokens,
+                  cacheCreationInputTokens: usageForCost.cache_creation_input_tokens,
+                  cacheReadInputTokens: usageForCost.cache_read_input_tokens,
+                  costUsd: costUsdStr,
+                }
+              : {}),
+          };


@greptile-apps bot left a comment

3 files reviewed, 1 comment

Comment on lines 122 to 125
// Case 2: SSE text. Quick filter: skip parsing when neither an "error" nor a "message" key appears
// Note: quoted, key-style matching is used to avoid pointless parsing when "error" shows up in assistant output
if (!text.includes("\"error\"") && !text.includes("\"message\"")) {
return { isError: false };
quick filter may miss error JSON if assistant response contains quoted strings with "error" or "message" as content (not as keys), e.g. data: {"content": "explain \"error\" handling"} would incorrectly skip the check

Path: src/lib/utils/upstream-error-detection.ts, lines 122-125

@github-actions github-actions bot added bug Something isn't working area:core area:provider labels Feb 8, 2026
@gemini-code-assist bot left a comment

Code Review

This pull request introduces a deferred streaming finalization mechanism to accurately determine the success or failure of streaming (SSE) responses, particularly addressing "fake 200" scenarios where an upstream service returns an HTTP 200 status but an error within the response body. Key changes include a new stream-finalization utility to store temporary metadata on the session, modifications to ProxyForwarder to defer finalization for SSE responses, and updates to ProxyResponseHandler to consume this deferred metadata. A new upstream-error-detection utility analyzes the complete streamed content for errors, then updates circuit breakers, session bindings, and request statistics.

Review comments identify several issues:

  • the error detection logic is vulnerable to prompt injection due to generic keyword matching on AI-generated content
  • potential inconsistencies in logging and state management if provider IDs mismatch during deferred finalization
  • risks of logging and storing sensitive information from upstream error messages without redaction
  • dynamic imports that could move to the top of their files
  • a recommendation to replace the type assertion for deferred finalization metadata with a WeakMap for better type safety and maintainability
  • the need to extend the error detection utility to handle JSON array responses

Comment on lines 69 to 83
if (rawJsonChars < options.maxJsonCharsForMessageCheck) {
const message =
typeof obj.message === "string"
? obj.message
: isPlainRecord(obj.error) && typeof obj.error.message === "string"
? obj.error.message
: null;

if (message && options.messageKeyword.test(message)) {
return {
isError: true,
reason: `上游返回 200 但 JSON.message 命中关键字: ${truncateForReason(message)}`,
};
}
}

security-high high

The detectFromJsonObject function identifies upstream errors by checking if the message or error.message fields in a JSON response contain the keyword 'error' (case-insensitive). Since these fields can contain content generated by the AI model, an attacker can use prompt injection to force the AI to output a response that triggers this detection (e.g., by asking the AI to output a JSON object containing the word 'error'). This results in the proxy incorrectly identifying a successful response as a provider failure, which trips the circuit breaker and can lead to a Denial of Service for that provider for all users.

Remediation: Avoid using generic keyword matching on fields that may contain AI-generated content. Instead, use more specific patterns that match known error formats from supported providers, or rely on HTTP status codes and specific error types.
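
One way to follow this remediation is to match known, structured provider error shapes instead of a generic keyword. The shapes below are illustrative examples, not a verified list of any provider's actual wire format:

```typescript
// Sketch: recognize structured error envelopes rather than scanning free text.
// The two shapes are examples of common conventions, labeled as assumptions.
type KnownErrorShape = (obj: Record<string, unknown>) => boolean;

const knownErrorShapes: KnownErrorShape[] = [
  // OpenAI-style envelope: { "error": { "type": "...", "message": "..." } }
  (o) => typeof o.error === "object" && o.error !== null && "type" in (o.error as object),
  // Anthropic-style envelope: { "type": "error", "error": { ... } }
  (o) => o.type === "error",
];

function matchesKnownErrorShape(obj: Record<string, unknown>): boolean {
  return knownErrorShapes.some((shape) => shape(obj));
}
```

Because only structured envelope fields are consulted, a model that is merely asked to print the word "error" in its message cannot trip the detector.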

Comment on lines 90 to 96
if (provider.id !== meta.providerId) {
logger.warn("[ResponseHandler] Deferred streaming meta provider mismatch", {
sessionId: session.sessionId ?? null,
metaProviderId: meta.providerId,
currentProviderId: provider.id,
});
}

high

When the provider.id obtained from the session does not match meta.providerId, the current logic only logs a warning and then continues. This can produce inconsistent state: addProviderToChain uses the possibly stale provider object from the session, while the circuit breaker and session-binding logic use the information from meta. The provider chain may therefore record the wrong provider, making troubleshooting harder.

To keep the data consistent, consider stricter handling here. For example, on a mismatch, fetch the correct provider object using the information in meta, or at least pass meta's providerId and providerName to addProviderToChain so the audit log stays accurate.

Comment on lines 98 to 105
if (detected.isError) {
logger.warn("[ResponseHandler] SSE completed but body indicates error (fake 200)", {
providerId: meta.providerId,
providerName: meta.providerName,
upstreamStatusCode: meta.upstreamStatusCode,
effectiveStatusCode,
reason: detected.reason,
});

security-medium medium

The application extracts error reasons from upstream responses and logs them via logger.warn. These error messages can contain sensitive information such as API keys, PII, or snippets of private prompts echoed back by the provider. Logging this data without redaction poses a risk of information exposure in log management systems.

Remediation: Implement redaction logic for error messages before logging them. Ensure that sensitive data like API keys or PII are removed from the detected.reason string.

Comment on lines +1048 to +1052
const payload: SessionUsageUpdate = {
status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
statusCode: effectiveStatusCode,
...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
};

security-medium medium

The streamErrorMessage (derived from the upstream response body) is stored in the errorMessage field of the session usage payload in Redis and the message request details in the database. Storing raw error messages from upstream providers without redaction can lead to the persistent storage of sensitive information such as API keys or PII.

Remediation: Sanitize or redact the streamErrorMessage before storing it in Redis or the database. Use existing redaction utilities to ensure sensitive patterns are removed.
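
A minimal redaction pass along these lines might look like the sketch below; the patterns are illustrative, not the project's actual redaction utility:

```typescript
// Sketch: scrub key-like tokens, JWTs, and emails from an upstream error
// message before persisting it. Patterns are illustrative assumptions.
const REDACTION_PATTERNS: [RegExp, string][] = [
  [/\b(sk|pk)-[A-Za-z0-9_-]{8,}\b/g, "[REDACTED_KEY]"],                          // API-key-like tokens
  [/\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b/g, "[REDACTED_JWT]"],  // JWT triplets
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[REDACTED_EMAIL]"],       // email addresses
];

function redactErrorMessage(message: string): string {
  return REDACTION_PATTERNS.reduce((acc, [re, repl]) => acc.replace(re, repl), message);
}
```

Running this once at the single choke point where streamErrorMessage is produced would cover both the Redis and database sinks mentioned above.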


// 计入熔断器:让后续请求能正确触发故障转移/熔断
try {
const { recordFailure } = await import("@/lib/circuit-breaker");

medium

finalizeDeferredStreamingFinalizationIfNeeded 函数内部,多次使用了动态 import() 来加载 @/lib/circuit-breaker 等模块。如果不存在循环依赖的问题,建议将这些导入语句移至文件顶部。

这样做有以下好处:

  1. 提高可读性:将所有依赖项集中在文件开头,使代码结构更清晰。
  2. 性能:避免在函数调用路径中重复解析和加载模块,减少潜在的性能开销。

Comment on lines 24 to 44
type SessionWithDeferred = ProxySession & {
deferredStreamingFinalization?: DeferredStreamingFinalization;
};

export function setDeferredStreamingFinalization(
session: ProxySession,
meta: DeferredStreamingFinalization
): void {
(session as SessionWithDeferred).deferredStreamingFinalization = meta;
}

export function consumeDeferredStreamingFinalization(
session: ProxySession
): DeferredStreamingFinalization | null {
const s = session as SessionWithDeferred;
const meta = s.deferredStreamingFinalization ?? null;
if (meta) {
s.deferredStreamingFinalization = undefined;
}
return meta;
}

medium

The deferredStreamingFinalization property is currently attached to the ProxySession object via a type assertion ((session as SessionWithDeferred)). This works, but it mutates the original object and is not type safe.

Consider using a WeakMap to associate sessions with their deferred finalization metadata. This isolates the data better, avoids polluting the ProxySession object, and requires no change to its original type definition, improving maintainability and robustness.

const deferredFinalizationMap = new WeakMap<ProxySession, DeferredStreamingFinalization>();

export function setDeferredStreamingFinalization(
  session: ProxySession,
  meta: DeferredStreamingFinalization
): void {
  deferredFinalizationMap.set(session, meta);
}

export function consumeDeferredStreamingFinalization(
  session: ProxySession
): DeferredStreamingFinalization | null {
  const meta = deferredFinalizationMap.get(session) ?? null;
  if (meta) {
    deferredFinalizationMap.delete(session);
  }
  return meta;
}
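
Exercised end-to-end with a minimal stand-in for ProxySession, the suggested WeakMap approach behaves like this (the real ProxySession type has more fields; this sketch only demonstrates the single-consumption behavior):

```typescript
// Minimal stand-ins; the real ProxySession/DeferredStreamingFinalization types
// carry more fields. Sketch demonstrates the one-shot consume semantics only.
type ProxySession = { sessionId: string };
type DeferredStreamingFinalization = { providerId: string; upstreamStatusCode: number };

const deferredFinalizationMap = new WeakMap<ProxySession, DeferredStreamingFinalization>();

function setDeferredStreamingFinalization(
  session: ProxySession,
  meta: DeferredStreamingFinalization
): void {
  deferredFinalizationMap.set(session, meta);
}

function consumeDeferredStreamingFinalization(
  session: ProxySession
): DeferredStreamingFinalization | null {
  const meta = deferredFinalizationMap.get(session) ?? null;
  if (meta) {
    deferredFinalizationMap.delete(session); // one-shot: a second consume returns null
  }
  return meta;
}

const session: ProxySession = { sessionId: "s1" };
setDeferredStreamingFinalization(session, { providerId: "p1", upstreamStatusCode: 200 });
```

A side benefit of the WeakMap: when the session object is garbage-collected, its entry goes with it, so abandoned sessions need no explicit cleanup.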

Comment on lines 110 to 120
if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
try {
const parsed = JSON.parse(trimmed) as unknown;
if (isPlainRecord(parsed)) {
return detectFromJsonObject(parsed, trimmed.length, merged);
}
} catch {
// JSON parse failed: not treated as an error; left to upper-level logic
}
return { isError: false };
}

medium

The current error detection for plain JSON bodies only handles the case where the root node is an object (isPlainRecord). If an upstream returns its error as a JSON array (for example [{"error": "..."}]), the current implementation skips detection entirely and treats the response as normal.

Consider extending this logic to handle JSON arrays as well: iterate over the elements and apply the detectFromJsonObject check to each, covering more possible error formats.
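
The suggested array handling could be sketched as follows; detectFromJsonObject is replaced here by a minimal stand-in, so this illustrates the idea rather than patching the actual file:

```typescript
// Sketch of the reviewer's suggestion: also scan top-level JSON arrays for
// error objects. detectFromJsonObject's real signature is assumed; a minimal
// stand-in keeps the sketch self-contained.
type Result = { isError: boolean; reason?: string };

function detectFromJsonObject(obj: Record<string, unknown>): Result {
  if (obj.error !== undefined && obj.error !== null && obj.error !== "") {
    return { isError: true, reason: "non-empty error field" };
  }
  return { isError: false };
}

function detectFromParsedJson(parsed: unknown): Result {
  if (Array.isArray(parsed)) {
    // Walk the array and report the first element that looks like an error.
    for (const item of parsed) {
      if (typeof item === "object" && item !== null && !Array.isArray(item)) {
        const r = detectFromJsonObject(item as Record<string, unknown>);
        if (r.isError) return r;
      }
    }
    return { isError: false };
  }
  if (typeof parsed === "object" && parsed !== null) {
    return detectFromJsonObject(parsed as Record<string, unknown>);
  }
  return { isError: false };
}
```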

@github-actions github-actions bot added the size/M Medium PR (< 500 lines) label Feb 8, 2026
@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 155-196: Remove emoji characters from the comments introduced
around the session binding logic: replace comments like "// ⭐ 成功后绑定 session
到供应商(智能绑定策略)" and "// ⭐ 统一更新两个数据源(确保监控数据一致)" with plain-text equivalents (e.g.
"// 成功后绑定 session 到供应商(智能绑定策略)" and "// 统一更新两个数据源,确保监控数据一致"). Check and update
the other occurrences noted (lines around 198-204) to remove any emoji as well;
the changes are adjacent to the calls to
SessionManager.updateSessionBindingSmart and
SessionManager.updateSessionProvider and the
logger.info/logger.debug/logger.error blocks so you can find them by those
symbols.

In `@src/lib/utils/upstream-error-detection.test.ts`:
- Around line 4-52: Tests are missing coverage for error-as-object-with-message
and error-as-non-string/non-object branches in
detectUpstreamErrorFromSseOrJsonText; add unit tests that (1) pass a JSON body
'{"error":{"message":"rate limited"}}' and assert isError: true, and (2) pass
JSON bodies like '{"error":true}' and '{"error":42}' and assert isError: true,
also add a test for a JSON array input like '[{"error":"something"}]' asserting
isError: false; place these new tests alongside existing cases in
upstream-error-detection.test.ts referencing the
detectUpstreamErrorFromSseOrJsonText function so the core branches (error object
handling, non-string error handling, and array handling) are exercised to raise
coverage above 80%.

In `@src/lib/utils/upstream-error-detection.ts`:
- Around line 52-66: Replace the hardcoded Chinese reason strings in
src/lib/utils/upstream-error-detection.ts with i18n lookups so user-facing text
is localized (e.g. use the project i18n API such as i18n.t or translate). Update
the three places that build reasons (the branches handling typeof errorValue ===
"string", isPlainRecord(errorValue) && typeof errorValue.message === "string",
and the other message/empty-body cases referenced in the comment) to call the
i18n function with keys and pass the truncated value via interpolation (use
truncateForReason(errorValue) or truncateForReason(errorValue.message) as the
interpolated variable). Keep the same semantics of the messages but replace
literal Chinese text with i18n keys and ensure the keys exist for zh-CN, zh-TW,
en, ja, ru in the translations file.
🧹 Nitpick comments (4)
src/lib/utils/upstream-error-detection.ts (3)

24-26: isPlainRecord duplicates the implementation in session-messages-guards.ts

src/app/[locale]/dashboard/sessions/[sessionId]/messages/_components/session-messages-guards.ts already contains a stricter isPlainRecord implementation (it uses an Object.prototype.toString.call check). Consider extracting it into a shared util so the two copies do not drift apart.

The simplified version here is safe enough for JSON.parse output, but the duplication is a long-term maintenance risk.


69-75: the obj.error.message branch is unreachable in this context

By the time line 69 executes, hasNonEmptyValue(obj.error) must be false (otherwise the if on line 50 would already have returned). And when obj.error contains no non-empty value, isPlainRecord(obj.error) && typeof obj.error.message === "string" is always false: if error were an object with a message property, hasNonEmptyValue would have returned true.

The obj.error.message branch on lines 73-74 is dead code and can be removed to avoid misleading readers.

Suggested simplification
   if (rawJsonChars < options.maxJsonCharsForMessageCheck) {
     const message =
       typeof obj.message === "string"
         ? obj.message
-        : isPlainRecord(obj.error) && typeof obj.error.message === "string"
-          ? obj.error.message
-          : null;
+        : null;
 
     if (message && options.messageKeyword.test(message)) {

128-140: the chars calculation for SSE events wastes work

Every SSE event is re-serialized with JSON.stringify(evt.data) just to obtain its character length, which is not cheap when there are many events. Given that the default maxJsonCharsForMessageCheck threshold is 1000, a rough estimate (such as the length of the raw SSE data line) would serve instead of an exact stringify.

That said, this only runs when the SSE body contains the "error" or "message" keyword (the quick filter on line 124), so it should trigger rarely and the impact is limited.
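
The rough estimate this nitpick suggests could look like the sketch below; the names and the `data: ` prefix handling are illustrative assumptions:

```typescript
// Sketch: approximate the event's JSON length from the raw `data:` line
// instead of re-serializing the parsed event. Threshold from the PR description.
const MAX_JSON_CHARS_FOR_MESSAGE_CHECK = 1000;

function approxEventChars(rawDataLine: string): number {
  // Stripping the "data: " prefix leaves roughly the serialized JSON length;
  // exact enough for a coarse threshold comparison.
  return Math.max(0, rawDataLine.length - "data: ".length);
}

function shouldRunMessageCheck(rawDataLine: string): boolean {
  return approxEventChars(rawDataLine) < MAX_JSON_CHARS_FOR_MESSAGE_CHECK;
}
```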

src/app/v1/_lib/proxy/response-handler.ts (1)

108-116: a recordFailure failure only warns and does not block the rest of the flow

The try/catch on line 111 catches circuit-breaker recording failures, logs a warning, and continues. That is reasonable: a failure to record in the circuit breaker should not block writing the request statistics. But if the circuit breaker stays unavailable, fake 200s still will not trip it, which is not fully aligned with the PR's goal.

Consider adding sessionId to the warning log to ease troubleshooting:

Suggestion
       logger.warn("[ResponseHandler] Failed to record fake-200 error in circuit breaker", {
         providerId: meta.providerId,
+        sessionId: session.sessionId ?? null,
         error: cbError,
       });

Comment on lines 52 to 66
if (typeof errorValue === "string") {
return {
isError: true,
reason: `上游返回 200 但 JSON.error 非空: ${truncateForReason(errorValue)}`,
};
}

if (isPlainRecord(errorValue) && typeof errorValue.message === "string") {
return {
isError: true,
reason: `上游返回 200 但 JSON.error.message 非空: ${truncateForReason(errorValue.message)}`,
};
}

return { isError: true, reason: "上游返回 200 但 JSON.error 非空" };

⚠️ Potential issue | 🟡 Minor

The reason strings hardcode Chinese text instead of using i18n

These reason values propagate through finalizeDeferredStreamingFinalizationIfNeeded into errorMessage, are ultimately written via updateMessageRequestDetails and SessionManager.updateSessionUsage, and are visible to users in the dashboard. Per the coding guidelines, all user-facing text must use i18n (5 supported languages: zh-CN, zh-TW, en, ja, ru).

Hardcoded Chinese strings involved (also at lines 62, 66, 80, 106):

  • "上游返回 200 但 JSON.error 非空: ..." ("upstream returned 200 but JSON.error is non-empty")
  • "上游返回 200 但 JSON.error.message 非空: ..." ("upstream returned 200 but JSON.error.message is non-empty")
  • "上游返回 200 但 JSON.message 命中关键字: ..." ("upstream returned 200 but JSON.message matched the keyword")
  • "上游返回 200 但响应体为空" ("upstream returned 200 but the response body is empty")

As per coding guidelines, **/*.{ts,tsx,js,jsx}: "All user-facing strings must use i18n (5 languages supported: zh-CN, zh-TW, en, ja, ru). Never hardcode display text".

🤖 Prompt for AI Agents
In `@src/lib/utils/upstream-error-detection.ts` around lines 52 - 66, Replace the
hardcoded Chinese reason strings in src/lib/utils/upstream-error-detection.ts
with i18n lookups so user-facing text is localized (e.g. use the project i18n
API such as i18n.t or translate). Update the three places that build reasons
(the branches handling typeof errorValue === "string", isPlainRecord(errorValue)
&& typeof errorValue.message === "string", and the other message/empty-body
cases referenced in the comment) to call the i18n function with keys and pass
the truncated value via interpolation (use truncateForReason(errorValue) or
truncateForReason(errorValue.message) as the interpolated variable). Keep the
same semantics of the messages but replace literal Chinese text with i18n keys
and ensure the keys exist for zh-CN, zh-TW, en, ja, ru in the translations file.
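The suggested replacement could look roughly like the following. This is a minimal sketch assuming a next-intl-style t() function; the key names (upstreamError.*) and the buildErrorReason helper are invented for illustration and are not the project's real API.

```typescript
// Hypothetical i18n-based reason builder; t() shape and key names are assumptions.
type Translate = (key: string, values?: Record<string, string>) => string;

function buildErrorReason(
  t: Translate,
  errorValue: unknown,
  truncateForReason: (v: unknown) => string
): string {
  if (typeof errorValue === "string") {
    // Interpolate the truncated value instead of concatenating literal Chinese text
    return t("upstreamError.jsonErrorNonEmpty", { value: truncateForReason(errorValue) });
  }
  return t("upstreamError.jsonErrorNonEmptyNoDetail");
}
```

The same pattern would apply to the message-keyword and empty-body branches, with one key per reason variant and the keys mirrored across all five locale files.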

@github-actions github-actions bot left a comment

Code Review Summary

This PR implements deferred streaming finalization to detect fake-200 errors from upstream providers. The approach is sound: SSE responses are no longer immediately marked as successful at the HTTP header stage; instead, the full response body is inspected after the stream ends. One logic gap was found in the error path where endpoint-level circuit breaker failures are not recorded, creating an asymmetry with the success path.

PR Size: M

  • Lines changed: 504 (486 additions, 18 deletions)
  • Files changed: 5

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 1 0 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 0 1 0
Simplification 0 0 0 0

High Priority Issues (Should Fix)

  1. [LOGIC-BUG] response-handler.ts:116 - Missing recordEndpointFailure in fake-200 error path. The success path calls recordEndpointSuccess but the error path only calls provider-level recordFailure, skipping endpoint-level failure recording. This causes asymmetric endpoint health tracking.

Medium Priority Issues (Consider Fixing)

  1. [TEST-INCOMPLETE] upstream-error-detection.test.ts - Missing test for error as object format (e.g., {"error": {"message": "...", "type": "..."}}), which is the standard OpenAI error response shape. The code handles this correctly at lines 59-64 of upstream-error-detection.ts, but there is no test verifying this behavior.

Review Coverage

  • Logic and correctness
  • Security (OWASP Top 10)
  • Error handling
  • Type safety
  • Documentation accuracy
  • Test coverage
  • Code clarity

Automated review by Claude AI

providerId: meta.providerId,
error: cbError,
});
}

[High] [LOGIC-BUG] Missing recordEndpointFailure in fake-200 error path

Why this is a problem: When a fake-200 error is detected after SSE completes, the code records a provider-level circuit breaker failure (recordFailure) but does NOT record an endpoint-level failure (recordEndpointFailure). Compare with:

  1. The success path in this same function (line 132-142) which correctly calls recordEndpointSuccess
  2. The forwarder error path (forwarder.ts:581) which calls recordEndpointFailure

This asymmetry means endpoint health tracking will be inaccurate for fake-200 errors: the endpoint will never accumulate failures from this path, potentially preventing endpoint-level circuit breaking from triggering.

Suggested fix:

    // Record provider-level failure
    try {
      const { recordFailure } = await import("@/lib/circuit-breaker");
      await recordFailure(meta.providerId, new Error(detected.reason));
    } catch (cbError) {
      logger.warn("[ResponseHandler] Failed to record fake-200 error in circuit breaker", {
        providerId: meta.providerId,
        error: cbError,
      });
    }

    // Record endpoint-level failure (mirrors recordEndpointSuccess in success path)
    if (meta.endpointId != null) {
      try {
        const { recordEndpointFailure } = await import("@/lib/endpoint-circuit-breaker");
        await recordEndpointFailure(meta.endpointId, new Error(detected.reason));
      } catch (endpointError) {
        logger.warn("[ResponseHandler] Failed to record endpoint failure for fake-200", {
          endpointId: meta.endpointId,
          providerId: meta.providerId,
          error: endpointError,
        });
      }
    }

const res = detectUpstreamErrorFromSseOrJsonText(sse);
expect(res.isError).toBe(false);
});
});

[Medium] [TEST-INCOMPLETE] Missing test for error as object format

Why this is a problem: The standard OpenAI error response format is {"error": {"message": "...", "type": "..."}} where error is an object, not a string. The detection code handles this correctly at upstream-error-detection.ts:59-64, but there is no test verifying this important code path. This is a common real-world error format that should be covered.

Suggested fix - add these test cases:

  test("JSON: error as object with message is detected as error", () => {
    const res = detectUpstreamErrorFromSseOrJsonText(
      JSON.stringify({ error: { message: "rate limit exceeded", type: "rate_limit_error" } })
    );
    expect(res.isError).toBe(true);
    if (res.isError) {
      expect(res.reason).toContain("rate limit exceeded");
    }
  });

  test("JSON: error as non-empty object without message is detected as error", () => {
    const res = detectUpstreamErrorFromSseOrJsonText(
      JSON.stringify({ error: { type: "server_error", code: 500 } })
    );
    expect(res.isError).toBe(true);
  });

  test("JSON: error as null is not detected as error", () => {
    const res = detectUpstreamErrorFromSseOrJsonText('{"error": null}');
    expect(res.isError).toBe(false);
  });

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 4 comments


Comment on lines 41 to 50
export function consumeDeferredStreamingFinalization(
session: ProxySession
): DeferredStreamingFinalization | null {
const s = session as SessionWithDeferred;
const meta = s.deferredStreamingFinalization ?? null;
if (meta) {
// Consume-once only: avoids duplicate settlement (e.g. when multiple background stats tasks run in parallel).
s.deferredStreamingFinalization = undefined;
}
return meta;

potential race condition if called concurrently from multiple tasks

if two background tasks call this simultaneously, both could read the same meta before either clears it, causing duplicate finalization

consider atomic operations or document single-invocation requirement
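One way to make the single-consumption contract harder to violate (and avoid mutating the session object at all) is a WeakMap-backed store with a synchronous read-and-delete. A rough sketch with illustrative names (DeferredMeta, markDeferred, consumeDeferred), not the project's actual API:

```typescript
// Sketch of a consume-once store keyed by the session object.
type DeferredMeta = { providerId: string; endpointId: string | null };

const deferredStore = new WeakMap<object, DeferredMeta>();

function markDeferred(session: object, meta: DeferredMeta): void {
  deferredStore.set(session, meta);
}

// Synchronous get-then-delete: the JS event loop guarantees no other task can
// interleave between the two calls, so two background tasks cannot both
// observe the same meta (as long as no await sits between get and delete).
function consumeDeferred(session: object): DeferredMeta | null {
  const meta = deferredStore.get(session) ?? null;
  if (meta !== null) deferredStore.delete(session);
  return meta;
}
```

Since JavaScript is single-threaded, the real race here would come from an await between read and clear, not from true concurrency; documenting that invariant on the function is likely enough.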

Path: src/app/v1/_lib/proxy/stream-finalization.ts, lines 41-50

Comment on lines 101 to 103
const detected = streamEndedNormally
? detectUpstreamErrorFromSseOrJsonText(allContent)
: ({ isError: false } as const);

abnormal terminations (timeouts, aborts) skip error detection, keeping upstreamStatusCode (200)

fake 200 responses that timeout won't be detected as errors, missing circuit breaker recording

should abnormal terminations always map to error status?

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 101-103

Comment on lines 104 to 109
const message =
typeof obj.message === "string"
? obj.message
: isPlainRecord(obj.error) && typeof obj.error.message === "string"
? obj.error.message
: null;

nested ternary logic is correct but could be clearer

checking isPlainRecord(obj.error) at line 107 seems redundant since line 84-100 already handled the obj.error cases

consider extracting to separate function for readability

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Path: src/lib/utils/upstream-error-detection.ts, lines 104-109

Comment on lines 162 to 167
// Case 2: SSE text. Fast filter: skip parsing when neither an "error" nor a "message" key is present
// Note: quoted key forms are matched here to avoid pointless parsing when "error" shows up in model output.
// Trade-off: if the upstream returns non-standard JSON keys (very rare), this may miss errors; we prefer conservatism and low false positives.
if (!text.includes("\"error\"") && !text.includes("\"message\"")) {
return { isError: false };
}

quick filter text.includes('"error"') may miss errors with non-standard spacing like { "error" : "msg" } or miss legitimate content containing "error" triggering unnecessary parsing

tradeoff between performance and accuracy is acknowledged in comments but worth noting
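For reference, the prefilter discussed here boils down to a quoted-token check; a minimal sketch (the function name is illustrative):

```typescript
// Skip JSON parsing unless "error" or "message" appears as a quoted token.
// Escaped occurrences inside JSON string content (\"error\") contain a
// backslash between the word and the closing quote, so they do NOT match,
// which keeps false triggers from model output low.
function mightContainErrorShape(text: string): boolean {
  return text.includes('"error"') || text.includes('"message"');
}
```

Note that whitespace outside the quotes (e.g. { "error" : "msg" }) does not defeat this check, since the quotes are adjacent to the key name itself.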


Path: src/lib/utils/upstream-error-detection.ts, lines 162-167

@greptile-apps greptile-apps bot left a comment

10 files reviewed, 3 comments


Comment on lines 90 to 93
* @param streamEndedNormally - Must be a "natural end" where the reader observed done=true; abnormal endings such as timeouts/aborts are handled by other logic.
*/
async function finalizeDeferredStreamingFinalizationIfNeeded(
session: ProxySession,

Provider mismatch breaks attribution
finalizeDeferredStreamingFinalizationIfNeeded() computes provider from session.provider (current state) but uses meta.providerId for circuit-breaker and chain updates. If session.provider has been switched (or unset) by later logic before stream end, the code will still call session.addProviderToChain(provider, ...) with the current provider object while recording success/failure against the meta provider id, producing inconsistent providerChain + stats attribution. Consider looking up the provider by meta.providerId (or storing the provider snapshot in meta) and using that consistently for addProviderToChain / session provider updates.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 90-93

Comment on lines +133 to +136

// Record in the circuit breaker so subsequent requests can correctly trigger failover/circuit breaking
try {
// Dynamic import: avoids a potential circular dependency between the proxy module and the circuit-breaker module.

False-success on abort path
When !meta || !provider || !streamEndedNormally, the function returns effectiveStatusCode (possibly still 200) and errorMessage without updating circuit breaker / chain. In the streaming handler, the “client aborted” branch calls finalizeStream(allContent, false), so effectiveStatusCode stays as the upstream statusCode (often 200) and the later updateSessionUsage / updateMessageRequestDetails paths will record a completed 200 even though the stream did not end normally. This seems to reintroduce incorrect success recording for aborted/partial streams; you likely want aborted streams to force an error effectiveStatusCode (e.g. 499/502) for internal stats, or skip updating usage/details when streamEndedNormally=false.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 133-136

Comment on lines 1 to 3
import { describe, expect, test } from "vitest";
import { detectUpstreamErrorFromSseOrJsonText } from "./upstream-error-detection";


Test count mismatch
The PR description says this test file has 7 cases, but the file currently defines 13 tests. If CI/test expectations or documentation rely on the declared count, the description should be updated to avoid confusion during review and future maintenance.


Path: src/lib/utils/upstream-error-detection.test.ts, lines 1-3

@greptile-apps greptile-apps bot left a comment

17 files reviewed, 1 comment


@greptile-apps

greptile-apps bot commented Feb 8, 2026

Additional Comments (1)

src/app/v1/_lib/proxy/response-handler.ts
Deferred finalize uses wrong provider

finalizeDeferredStreamingFinalizationIfNeeded() consumes meta (meta.providerId) but then uses const provider = session.provider for session.addProviderToChain(provider, ...) and logs, which can diverge if session.provider is mutated during retries/failover before stream completion. This will produce inconsistent attribution (providerChain vs circuit-breaker/session binding). Use a provider lookup/snapshot based on meta.providerId for all downstream updates, or store the provider object in the deferred meta and use that consistently.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 261-263

@tesgth032
Contributor Author

Updated and pushed per the review feedback:

  • When fake-200 is judged as an error: also record the endpoint-level recordEndpointFailure (symmetric with the success path)
  • Added sessionId to the warn log for circuit-breaker recording failures
  • Removed the emoji (⭐) from comments added in this PR
  • Completed upstream-error-detection unit tests: error=true/42 and JSON arrays are no longer misclassified; simplified the message extraction logic

Verified: both npm test and npm run typecheck pass.

@greptile-apps greptile-apps bot left a comment

17 files reviewed, 3 comments


@greptile-apps

greptile-apps bot commented Feb 8, 2026

Additional Comments (3)

src/app/v1/_lib/proxy/response-handler.ts
Provider object mismatch
finalizeDeferredStreamingFinalizationIfNeeded() uses meta.providerId for circuit-breaker/session binding, but session.addProviderToChain(provider, ...) uses provider = session.provider (current mutable state). If session.provider has changed by the time the stream ends, the providerChain entry can be attributed to a different provider than the one whose circuit breaker was updated. This will make monitoring and retry decisions inconsistent. Use a provider lookup/snapshot from meta.providerId (or store the provider object in DeferredStreamingFinalization) and use that consistently for addProviderToChain and related fields.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 280-282

src/app/v1/_lib/proxy/response-handler.ts
Aborted streams recorded as 200
When streamEndedNormally=false, this returns effectiveStatusCode based on detected.isError (forced to false) and upstreamStatusCode (often 200), then skips circuit-breaker/chain updates. Callers still persist statusCode: effectiveStatusCode, so client aborts/timeouts can be recorded internally as “completed 200”, reintroducing the false-success problem for partial streams. Consider forcing a non-2xx internal status (e.g. 499/502) or skipping usage/details persistence when the stream did not end normally.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 273-277

src/app/v1/_lib/proxy/response-handler.ts
Fake-200 check skipped on abnormal end
detectUpstreamErrorFromSseOrJsonText(allContent) is only executed when streamEndedNormally is true. If the upstream emits an error JSON early but the connection times out/aborts before natural EOF, this path won’t detect it and will keep the internal effectiveStatusCode as the upstream status (commonly 200), missing failure recording. If the goal is to avoid circuit-breaker misses, consider running detection whenever allContent is non-empty, regardless of how the stream ended, and only gate binding/success on streamEndedNormally.

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 264-266

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 117-123: When provider.id !== meta.providerId, choose a single
canonical provider ID to use for downstream metrics and state updates: switch
recordFailure and recordSuccess (and any calls that currently pass
meta.providerId) to use the actual current provider.id (the same object passed
to session.addProviderToChain) and add a short comment above the mismatch check
explaining that we intentionally prefer the runtime provider.id for
circuit-breaker accuracy; keep the existing logger.warn but include which ID
will be used for recording so future readers know the decision.

In `@src/lib/utils/upstream-error-detection.ts`:
- Around line 144-155: The JSON branch currently treats inputs starting with "["
as JSON but only calls detectFromJsonObject when isPlainRecord(parsed) is true,
causing JSON arrays to be ignored; update the logic in the block that checks
trimmed.startsWith("{") || trimmed.startsWith("[") so that after JSON.parse you
handle arrays as well: if parsed is an array, iterate its elements and call
detectFromJsonObject on any plain-record elements (or aggregate results and
return an error if any element indicates an error); alternatively, if arrays
should be unsupported, remove the trimmed.startsWith("[") check to make the
intent explicit; key symbols: the trimmed.startsWith check, JSON.parse,
isPlainRecord(parsed), detectFromJsonObject(parsed, ...), and the final return {
isError: false } to adjust.
🧹 Nitpick comments (5)
src/lib/utils/upstream-error-detection.ts (3)

50-52: the isPlainRecord implementation differs from the same-named function in session-messages-guards.ts

The version in session-messages-guards.ts additionally checks Object.prototype.toString.call(value) === "[object Object]", which excludes special objects such as Date and RegExp. The version in this file only checks typeof === "object" plus non-array, so edge cases (e.g. upstream JSON that deserializes into special objects) may behave differently.

Consider extracting a shared utility function so the two implementations cannot drift.

Suggested shared extraction
-function isPlainRecord(value: unknown): value is Record<string, unknown> {
-  return !!value && typeof value === "object" && !Array.isArray(value);
-}
+// Reuse the existing isPlainRecord implementation
+import { isPlainRecord } from "@/app/[locale]/dashboard/sessions/[sessionId]/messages/_components/session-messages-guards";

Or migrate the implementation in session-messages-guards.ts to a shared module under @/lib/utils/ and reference it from both places.


164-177: re-serializing evt.data with JSON.stringify to count characters during SSE event detection is avoidable

parseSSEData has already parsed the events out of the text, yet each event is re-serialized just to obtain a character count (Line 170). With many SSE events this adds unnecessary serialization overhead.

Since this is a one-shot check after the stream ends, and parseSSEData returns parsed objects (the raw JSON length is not directly available), the current approach is acceptable but worth noting. If performance becomes a concern, parseSSEData could additionally return the raw data string length.


101-115: the message keyword check only inspects top-level obj.message, never obj.error.message

The comment (Line 82) mentions both "message / error.message keyword matches", but in the errorValue branch at lines 84-101, whenever error.message exists the code returns a "non-empty error" result at lines 93-97 and never reaches the keyword check at lines 103-113.

In other words, when the error field is non-empty it is classified as an error regardless of whether error.message contains the "error" keyword (a reasonable strong-signal rule). But when error is empty or absent, only the top-level message is checked and a nested error.message is never examined: when error is an empty object {} or 0, hasNonEmptyValue returns false and error.message goes unchecked.

In practice this boundary is narrow (error: {} is empty, so hasNonEmptyValue returns false), but the comment and the actual behavior differ subtly; update the comment to reflect the real logic.
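The branch structure under discussion can be condensed into a standalone sketch. The names (hasNonEmptyValue, detectFromJsonObject) follow the comment above, but the bodies here are illustrative reconstructions, not the project's exact code:

```typescript
// Heuristic sketch: a non-empty top-level `error` is a strong signal;
// otherwise, for small JSON bodies, a top-level `message` matching /error/i
// serves as a weak fallback. A nested error.message is only reachable via
// the strong-signal branch.
type Detection = { isError: boolean; reason?: string };

function hasNonEmptyValue(v: unknown): boolean {
  if (v == null) return false;
  if (typeof v === "string") return v.length > 0;
  if (typeof v === "object") return Object.keys(v as object).length > 0;
  return true; // numbers and booleans count as present
}

function detectFromJsonObject(obj: Record<string, unknown>, rawChars: number): Detection {
  if ("error" in obj && hasNonEmptyValue(obj.error)) {
    return { isError: true, reason: "upstream returned 200 but JSON.error is non-empty" };
  }
  if (rawChars < 1000 && typeof obj.message === "string" && /error/i.test(obj.message)) {
    return { isError: true, reason: "upstream returned 200 but JSON.message matched the keyword" };
  }
  return { isError: false };
}
```

This makes the narrow boundary visible: error: {} falls through the strong-signal branch and its nested message is never consulted.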

src/app/v1/_lib/proxy/response-handler.ts (2)

101-103: skipping fake-200 detection on abnormal termination is correct, but there is a subtle edge case

When streamEndedNormally is false, detected is forced to { isError: false }, so an abnormally interrupted stream never triggers the fake-200 judgment. This matches the comment (Lines 90-91).

Consider one edge case, though: if the upstream emits an SSE event containing an error field and then drops the connection immediately (streamEndedNormally = false), allContent already holds the error information, yet it is ignored because the stream ended abnormally. The abnormal-termination logic (persistRequestFailure) then records it as a network error instead of an upstream business error.

Under the current conservative strategy this is reasonable (it avoids false positives); the trade-off is just worth documenting in a comment.


134-145: the heavily repeated dynamic import() pattern could be extracted

The try/catch pattern around await import("@/lib/circuit-breaker") and await import("@/lib/endpoint-circuit-breaker") recurs in both the success and failure paths, and matches the pattern used elsewhere in this file (Lines 614-625, 1213-1224, 1249-1260).

It could be extracted into helpers (e.g. safeRecordFailure(providerId, error) / safeRecordSuccess(providerId)) to reduce duplication and unify error handling.

Also applies to: 148-159, 190-198

@tesgth032
Contributor Author

Further hardening per the review feedback (touching only logic added/changed in this PR):

  • stream-finalization: store the deferred meta in a WeakMap instead of mutating the ProxySession object
  • response-handler: when provider.id !== meta.providerId, meta remains the canonical source for circuit breaking/binding/stats; additionally, session.getProvidersSnapshot() is used to resolve the correct Provider object for the providerChain entry, improving audit consistency
  • upstream-error-detection: the reason is now sanitized and truncated (Bearer/API key/email/JWT, etc.) to reduce leak risk in logs/Redis/DB; in the SSE branch, the JSON character count is only computed when the message check could actually fire, avoiding a JSON.stringify per event
  • upstream-error-detection: only JSON objects ({...}) are parsed; array input is intentionally not parsed for now (its semantics are ambiguous and could cause false positives), with a covering test case added

On prompt-injection/false-positive risk: as required, the fallback rule of "< 1000 chars and top-level message matching /error/i counts as an error" is kept, but it only applies to structured fields (content/choices are not scanned) and only runs during internal settlement after the SSE ends naturally.

Verified: both npm test and npm run typecheck pass.
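The redact-then-truncate step mentioned above can be sketched as follows. The exact patterns in the PR may differ; these regexes, the REDACTIONS table, and sanitizeReason are assumptions for illustration:

```typescript
// Redact common secret shapes before truncating the reason string,
// so circuit-breaker logs / Redis / DB never store raw credentials.
const REDACTIONS: Array<[RegExp, string]> = [
  [/Bearer\s+[A-Za-z0-9._~+/-]+=*/g, "Bearer [REDACTED]"],          // Authorization headers
  [/sk-[A-Za-z0-9]{8,}/g, "[REDACTED_API_KEY]"],                    // OpenAI-style keys
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, "[REDACTED_EMAIL]"],                 // email addresses
  [/eyJ[\w-]+\.[\w-]+\.[\w-]+/g, "[REDACTED_JWT]"],                 // JWT-shaped tokens
];

function sanitizeReason(raw: string, maxLen = 200): string {
  let out = raw;
  for (const [pattern, replacement] of REDACTIONS) {
    out = out.replace(pattern, replacement);
  }
  return out.length > maxLen ? `${out.slice(0, maxLen)}...` : out;
}
```

Redacting before truncating matters: truncating first could split a token in half and let a recognizable prefix survive the pattern match.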

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 3 comments


Comment on lines 64 to 92
type FinalizeDeferredStreamingResult = {
/**
* Status code used for internal settlement.
*
* Note: this does not change the HTTP status code the client actually received (once SSE passthrough has started, it cannot be changed after the fact).
* The sole purpose is to make internal stats/circuit breaking/session binding treat a "fake 200" as a failure.
*/
effectiveStatusCode: number;
/**
* Internally recorded error reason (written to DB/monitoring to help diagnose "fake 200" issues).
*/
errorMessage: string | null;
};

/**
* If this SSE was marked for "deferred settlement", complete the final success/failure judgment after the stream ends.
*
* Trigger conditions
* - When the Forwarder receives a Response and identifies it as SSE, it attaches DeferredStreamingFinalization metadata to the session.
* - After reading the full SSE content in the background, ResponseHandler calls this function:
* - If the content looks like an upstream error JSON (fake 200), then:
* - record a circuit-breaker failure;
* - do not update smart session binding (to avoid pinning the session to a bad provider);
* - change the internal status code to 502 (this only affects stats and later retry selection, not this client response).
* - If the stream ended normally and no error judgment was hit, settle as success and update binding/circuit breaker/endpoint success rate.
*
* @param streamEndedNormally - Must be a "natural end" where the reader observed done=true; abnormal endings such as timeouts/aborts are handled by other logic.
*/
async function finalizeDeferredStreamingFinalizationIfNeeded(

if session.provider has changed between response start (when meta was set) and stream end, providerForChain (line 60) uses the current session.provider while circuit breaker and session updates use meta.providerId. this creates inconsistent attribution: addProviderToChain at lines 127-137 and 207-213 will record the current provider object, but success/failure recording uses meta.providerId

the fallback logic at lines 73-90 tries to resolve this, but providerForChain is set before the mismatch check, so the initial value is wrong

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 64-92

Comment on lines 197 to 205
// Case 2: SSE text. Quick filter: skip parsing when neither an "error" nor a "message" key is present.
// Note: we match the quoted-key form here to avoid pointless parsing when the word error appears in model output.
// Trade-off: if the upstream returns a non-standard JSON key (very rare) this may miss a detection, but we prefer being conservative with low false positives.
//
// Additional note: we deliberately match only `"error"` / `"message"` (including the double quotes),
// so an escaped `\"error\"` inside string content will not match — this avoids false positives.
if (!text.includes("\"error\"") && !text.includes("\"message\"")) {
return { isError: false };
}

quick filter may miss errors if SSE data contains "error" or "message" as quoted string content (not keys), e.g. data: {"content": "the word \"error\" appears here"} would incorrectly trigger parsing even though no error field exists
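One way to tighten the quick filter — a hypothetical sketch, not the PR's actual implementation — is to require the quoted key to be followed by a colon, so only key positions match:

```typescript
// Hypothetical stricter pre-filter: match `"error"` / `"message"` only when they
// appear in JSON key position (followed by a colon). The PR's code uses a plain
// substring check; this variant trades a tiny regex cost for fewer needless
// JSON.parse attempts on model prose that merely mentions these words.
function mayContainErrorOrMessageKey(text: string): boolean {
  return /"(?:error|message)"\s*:/.test(text);
}
```

A key nested inside normal content (e.g. a tool result embedding its own message field) still matches, which is acceptable: the filter only gates whether parsing happens, and the parse step makes the final call.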

(Path: src/lib/utils/upstream-error-detection.ts, lines 197-205)

@greptile-apps

greptile-apps bot commented Feb 8, 2026

Additional Comments (1)

src/app/v1/_lib/proxy/response-handler.ts
when !meta || !provider || !streamEndedNormally, the function returns upstreamStatusCode (often 200) without triggering circuit breaker updates. this means aborted streams (where finalizeStream(allContent, false) is called at line 1339) will record the fake 200 as success in finalizeRequestStats, undermining the entire fake-200 detection

consider forcing effectiveStatusCode = 502 or skipping stats updates when streamEndedNormally=false


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 190-200: Add a new reason value "request_failed" to the
provider-chain reason enum(s) (the type used in session.addProviderToChain / the
message reason enum) and update the error-path call site in response-handler.ts
(where session.addProviderToChain is invoked) to use meta.isFirstAttempt ?
"request_failed" : "retry_failed" instead of always "retry_failed"; then update
consumer UI/formatters (provider-chain-formatter, provider-chain-popover) to
handle and display the new "request_failed" reason.
- Around line 163-174: detected.reason is a user-facing string that flows into
triggerCircuitBreakerAlert → sendCircuitBreakerAlert →
buildCircuitBreakerMessage and must be internationalized; update the block that
calls recordFailure (dynamic import of recordFailure) to pass either an i18n key
plus params or a language-independent error code instead of raw detected.reason,
e.g., map the current detected.reason cases to an i18n key (via your
i18n.translate or similar) or to a stable errorCode property and pass that (and
any non-localized metadata) into recordFailure; keep the dynamic import of
recordFailure but ensure recordFailure receives only i18n keys/error codes (not
raw Chinese strings) so downstream alert builders can localize per user locale.
🧹 Nitpick comments (4)
src/lib/utils/upstream-error-detection.ts (2)

52-54: isPlainRecord duplicates the function of the same name in session-messages-guards.ts

src/app/[locale]/dashboard/sessions/[sessionId]/messages/_components/session-messages-guards.ts already has a stricter isPlainRecord implementation (it additionally checks Object.prototype.toString.call(value) === "[object Object]"). Since this file's input comes from JSON.parse, which never produces Date/Map/Set or other non-plain objects, the simplified version is functionally fine, but consider unifying the two implementations into one shared utility to avoid future maintenance drift.


197-205: the SSE quick filter uses text (untrimmed) while the JSON path uses trimmed — the logic is correct but deserves a comment

Line 203 runs the includes check on the raw text rather than trimmed. This is functionally correct (trim() cannot affect interior content), but it is stylistically inconsistent with lines 177-185, which use trimmed, and may confuse readers at first glance. Consider adding a short comment here explaining why text is used (e.g. after trimming, JSON input has already returned from the JSON branch, so the text here never starts with { and matching on the raw text is safe).

src/lib/utils/upstream-error-detection.test.ts (1)

4-119: test coverage is solid overall; consider adding two small scenarios

The main branches are covered, but two more cases would round it out:

  1. A whitespace-only string (e.g. " ") — verifies the boundary behavior of being treated as empty after trim().
  2. Custom options (e.g. a custom messageKeyword or maxJsonCharsForMessageCheck) — confirms the configurable options take effect.

These do not block the merge; they are suggestions for follow-up.

src/app/v1/_lib/proxy/response-handler.ts (1)

1335-1339: the semantics of finalizeStream(allContent, false) on client abort need confirmation

After a client-initiated abort, finalizeStream(allContent, false) is still called to write stats. Because streamEndedNormally=false, finalizeDeferredStreamingFinalizationIfNeeded skips error detection and returns effectiveStatusCode=200 (the upstreamStatusCode). That means even when the client aborted and the content is incomplete, the session usage status is still recorded as "completed".

If this is intended behavior (a client abort does not count as an upstream failure), consider stating it explicitly in a comment so future maintainers are not misled.

Comment on lines 190 to 200
// Record to the decision chain (for log display and DB persistence).
// Note: we use effectiveStatusCode (502) rather than upstreamStatusCode (200)
// so the internal chain clearly shows this was a failure (otherwise it reads as a success).
session.addProviderToChain(providerForChain, {
endpointId: meta.endpointId,
endpointUrl: meta.endpointUrl,
reason: "retry_failed",
attemptNumber: meta.attemptNumber,
statusCode: effectiveStatusCode,
errorMessage: detected.reason,
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for reason field type definitions and enum values
rg -n --type=ts -C3 '(request_success|retry_success|retry_failed|reason.*:)' src/app/v1/_lib/proxy/

Repository: ding113/claude-code-hub

Length of output: 32527


🏁 Script executed:

# Check the response-handler.ts file to see line 275 and surrounding context
wc -l src/app/v1/_lib/proxy/response-handler.ts

Repository: ding113/claude-code-hub

Length of output: 113


🏁 Script executed:

# Search for addProviderToChain method signature and usage patterns
rg -n --type=ts 'addProviderToChain' -C2

Repository: ding113/claude-code-hub

Length of output: 9099


🏁 Script executed:

# Look for reason field type definition in broader scope
rg -n --type=ts '"request_failed"|"request_success"|"retry_' --max-count=50

Repository: ding113/claude-code-hub

Length of output: 6922


The error path's reason does not distinguish a first attempt from a retry, unlike the success path

The success path (line 275) distinguishes "request_success" from "retry_success" based on meta.isFirstAttempt, but the error path always uses "retry_failed". When isFirstAttempt=true, the provider chain audit record shows "retry_failed", which is misleading (a first attempt labeled as a failed retry).

"request_failed" needs to be added to the reason type enums in both src/app/v1/_lib/proxy/session.ts and src/types/message.ts, then the logic at line 196 updated, along with the dashboard display code (provider-chain-formatter.ts, provider-chain-popover.tsx, etc.) to handle the new reason value.

Suggested changes

1. Update the type definition (src/app/v1/_lib/proxy/session.ts lines 450-460):

     reason?:
       | "session_reuse"
       | "initial_selection"
       | "concurrent_limit_failed"
       | "request_success"
+      | "request_failed"
       | "retry_success"
       | "retry_failed"

2. Update the error path (src/app/v1/_lib/proxy/response-handler.ts line 196):

     session.addProviderToChain(providerForChain, {
       endpointId: meta.endpointId,
       endpointUrl: meta.endpointUrl,
-      reason: "retry_failed",
+      reason: meta.isFirstAttempt ? "request_failed" : "retry_failed",
       attemptNumber: meta.attemptNumber,
       statusCode: effectiveStatusCode,
       errorMessage: detected.reason,
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Record to the decision chain (for log display and DB persistence).
// Note: we use effectiveStatusCode (502) rather than upstreamStatusCode (200)
// so the internal chain clearly shows this was a failure (otherwise it reads as a success).
session.addProviderToChain(providerForChain, {
endpointId: meta.endpointId,
endpointUrl: meta.endpointUrl,
reason: "retry_failed",
attemptNumber: meta.attemptNumber,
statusCode: effectiveStatusCode,
errorMessage: detected.reason,
});
// Record to the decision chain (for log display and DB persistence).
// Note: we use effectiveStatusCode (502) rather than upstreamStatusCode (200)
// so the internal chain clearly shows this was a failure (otherwise it reads as a success).
session.addProviderToChain(providerForChain, {
endpointId: meta.endpointId,
endpointUrl: meta.endpointUrl,
reason: meta.isFirstAttempt ? "request_failed" : "retry_failed",
attemptNumber: meta.attemptNumber,
statusCode: effectiveStatusCode,
errorMessage: detected.reason,
});
🤖 Prompt for AI Agents
In `@src/app/v1/_lib/proxy/response-handler.ts` around lines 190 - 200, Add a new
reason value "request_failed" to the provider-chain reason enum(s) (the type
used in session.addProviderToChain / the message reason enum) and update the
error-path call site in response-handler.ts (where session.addProviderToChain is
invoked) to use meta.isFirstAttempt ? "request_failed" : "retry_failed" instead
of always "retry_failed"; then update consumer UI/formatters
(provider-chain-formatter, provider-chain-popover) to handle and display the new
"request_failed" reason.

@tesgth032
Contributor Author

Further hardening per review feedback (touching only logic from this PR):

  • Fixed "aborted streams mis-recorded as 200 completed": when streamEndedNormally=false, the internal effectiveStatusCode is now forced to a failure (client abort = 499, anything else = 502), and a providerChain entry (reason=system_error) is written, so partial/aborted streams are no longer treated as success in Redis/DB
  • The fake-200 detection result is now a "language-independent error code + optional detail": detectUpstreamErrorFromSseOrJsonText returns {code, detail?}; persistence/circuit breaking/alerts use only the code (e.g. FAKE_200_EMPTY_BODY / FAKE_200_JSON_ERROR_NON_EMPTY), while detail is used only for log diagnosis and is sanitized + truncated, reducing sensitive-data and i18n risk

Verified: npm test / npm run typecheck both pass.
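The {code, detail?} shape described above can be sketched roughly as follows. This is an illustrative approximation using the code values from this comment; the repository's detectUpstreamErrorFromSseOrJsonText has more branches (SSE events, message keyword heuristics) and richer sanitization.

```typescript
// Illustrative sketch of the language-independent detection result.
// Only the two code values mentioned in the PR comment are modeled here.
type UpstreamErrorDetection =
  | { isError: false }
  | { isError: true; code: string; detail?: string };

function detectFake200Sketch(body: string): UpstreamErrorDetection {
  const trimmed = body.trim();
  if (trimmed.length === 0) {
    // An empty body on an HTTP 200 is treated as an upstream error.
    return { isError: true, code: "FAKE_200_EMPTY_BODY" };
  }
  if (trimmed.startsWith("{")) {
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (parsed !== null && typeof parsed === "object") {
        const err = (parsed as Record<string, unknown>).error;
        if (err !== undefined && err !== null && err !== "") {
          return {
            isError: true,
            code: "FAKE_200_JSON_ERROR_NON_EMPTY",
            detail: JSON.stringify(err).slice(0, 200), // truncated for logs
          };
        }
      }
    } catch {
      // Not valid JSON: leave the verdict to other heuristics.
    }
  }
  return { isError: false };
}
```

Keeping the code language-independent means downstream alert builders can localize per user locale, as the CodeRabbit comment requested.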

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment


Comment on lines +229 to +238
if (messageValue.length >= merged.maxJsonCharsForMessageCheck) {
chars = merged.maxJsonCharsForMessageCheck; // at or above the threshold, skip the message keyword check
} else {
try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only small JSON gets message detection" intent)
chars = messageValue.length;
}
}

JSON.stringify failure fallback uses only message length, not full object size

when stringify fails (line 233), code sets chars = messageValue.length which may be much smaller than actual JSON size, incorrectly allowing message keyword detection on large objects

Suggested change
if (messageValue.length >= merged.maxJsonCharsForMessageCheck) {
chars = merged.maxJsonCharsForMessageCheck; // at or above the threshold, skip the message keyword check
} else {
try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only small JSON gets message detection" intent)
chars = messageValue.length;
}
}
} catch {
// treat stringify failure as "over threshold" and skip message detection
chars = merged.maxJsonCharsForMessageCheck;
}
(Path: src/lib/utils/upstream-error-detection.ts, lines 229-238)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/response-handler.ts`:
- Around line 168-178: When handling the "!streamEndedNormally" branch in
response-handler.ts (the block that currently calls session.addProviderToChain
with providerForChain and returns { effectiveStatusCode, errorMessage }), also
call the circuit-breaker failure recorder—invoke session.recordFailure(...) for
providerForChain using the same metadata (meta.endpointId, meta.endpointUrl,
meta.attemptNumber, effectiveStatusCode and errorMessage) before returning so
that unexpected upstream stream interruptions are persisted to the breaker
consistently with the timeout paths; ensure you pass the appropriate failure
reason (e.g., "system_error" or similar) to match the existing
addProviderToChain usage.
🧹 Nitpick comments (3)
src/lib/utils/upstream-error-detection.test.ts (1)

2-2: optional: consider using the @/ path alias

Per the coding guidelines, files under src/ should be referenced via the @/ path alias. This is a relative import of an adjacent file, so the impact is minor; adjust per project preference.

-import { detectUpstreamErrorFromSseOrJsonText } from "./upstream-error-detection";
+import { detectUpstreamErrorFromSseOrJsonText } from "@/lib/utils/upstream-error-detection";

As per coding guidelines, **/*.{ts,tsx,js,jsx}: "Use path alias @/ to reference files in ./src/ directory".

src/lib/utils/upstream-error-detection.ts (2)

61-63: isPlainRecord duplicates the implementation in session-messages-guards.ts

src/app/[locale]/dashboard/sessions/[sessionId]/messages/_components/session-messages-guards.ts has a stricter isPlainRecord (it additionally checks Object.prototype.toString.call). Both behave identically on JSON.parse output, but the code is duplicated. Extracting it into a shared util is an option, though given the low coupling between the two modules (dashboard component vs proxy-internal utility), keeping them independent is also acceptable.


80-114: the redaction regexes have decent coverage, but sanitizeErrorTextForDetail is neither exported nor independently tested

The redaction logic is currently covered only indirectly via integration tests (the Bearer token / API key cases in the test file); the JWT, email, generic key-value, and config-path branches have no direct tests. If these redaction patterns are bypassed in production, sensitive data could leak into logs/DB. Consider adding test cases targeting these patterns.
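A minimal sketch of the kind of redaction under discussion. The real sanitizeErrorTextForDetail is private to upstream-error-detection.ts, and its exact patterns and replacement tokens may differ; everything below is illustrative.

```typescript
// Hypothetical redaction sketch covering the pattern families mentioned above
// (bearer tokens, JWTs, emails), followed by truncation for log safety.
function redactForDetailSketch(text: string, maxLen: number = 300): string {
  return text
    .replace(/Bearer\s+[A-Za-z0-9._~+/-]+=*/g, "Bearer [REDACTED]")
    // JWTs start with base64('{"') === "eyJ"
    .replace(/\beyJ[\w-]+\.[\w-]+\.[\w-]+/g, "[JWT]")
    .replace(/\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g, "[EMAIL]")
    .slice(0, maxLen);
}
```

Direct unit tests on each pattern family (rather than only end-to-end coverage) make it obvious when one regex regresses.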

@greptile-apps greptile-apps bot left a comment

10 files reviewed, 2 comments


Comment on lines 1057 to 1068
const finalizeStream = async (
allContent: string,
streamEndedNormally: boolean,
clientAborted: boolean
): Promise<void> => {
const finalized = await finalizeDeferredStreamingFinalizationIfNeeded(
session,
allContent,
statusCode,
streamEndedNormally,
clientAborted
);

Double-read of stream body

handleStream() uses controllableStream.tee() to create clientStream and internalStream for background processing, but the Gemini “stream passthrough” branch above separately spins up a background task that reads responseForStats.body.getReader() (clone’s body) and also returns the original response. If this branch is reachable inside handleStream(), you end up creating two independent background readers per request (one for the clone, one for the tee’d internal stream), doubling memory/CPU and potentially causing inconsistent finalization. Consider ensuring only one stats/finalization path runs for any given stream.

(Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1057-1068)

@greptile-apps

greptile-apps bot commented Feb 8, 2026

Additional Comments (1)

src/app/v1/_lib/proxy/response-handler.ts
Ignores response-body storage flag

SessionManager.storeSessionResponse() already no-ops when STORE_SESSION_RESPONSE_BODY is false, but handleStream() still unconditionally builds and passes the full allContent string into that call (and also into the stats path) even when storage is disabled. This defeats the stated goal (“no persistence”) for large SSE responses because memory is still accumulated for the entire stream. If STORE_SESSION_RESPONSE_BODY=false is intended to reduce memory footprint (not just Redis writes), you need to gate the chunks accumulation / allContent construction accordingly; otherwise the env var doesn’t actually prevent large in-memory buffering in the streaming path.


@greptile-apps greptile-apps bot left a comment

4 files reviewed, 2 comments


Comment on lines 1151 to 1166
});
}

void SessionManager.updateSessionUsage(session.sessionId, {
inputTokens: usageForCost.input_tokens,
outputTokens: usageForCost.output_tokens,
cacheCreationInputTokens: usageForCost.cache_creation_input_tokens,
cacheReadInputTokens: usageForCost.cache_read_input_tokens,
costUsd: costUsdStr,
status: statusCode >= 200 && statusCode < 300 ? "completed" : "error",
statusCode: statusCode,
}).catch((error: unknown) => {
const payload: SessionUsageUpdate = {
status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
statusCode: effectiveStatusCode,
...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
};

if (usageForCost) {
payload.inputTokens = usageForCost.input_tokens;
payload.outputTokens = usageForCost.output_tokens;
payload.cacheCreationInputTokens = usageForCost.cache_creation_input_tokens;
payload.cacheReadInputTokens = usageForCost.cache_read_input_tokens;
payload.costUsd = costUsdStr;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant conditional nesting

the outer if (session.sessionId) already establishes that session.sessionId is truthy, so the inner check at line 1154 if (usageForCost && session.request.model) doesn't need to recheck session.sessionId

Suggested change
});
}
void SessionManager.updateSessionUsage(session.sessionId, {
inputTokens: usageForCost.input_tokens,
outputTokens: usageForCost.output_tokens,
cacheCreationInputTokens: usageForCost.cache_creation_input_tokens,
cacheReadInputTokens: usageForCost.cache_read_input_tokens,
costUsd: costUsdStr,
status: statusCode >= 200 && statusCode < 300 ? "completed" : "error",
statusCode: statusCode,
}).catch((error: unknown) => {
const payload: SessionUsageUpdate = {
status: effectiveStatusCode >= 200 && effectiveStatusCode < 300 ? "completed" : "error",
statusCode: effectiveStatusCode,
...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
};
if (usageForCost) {
payload.inputTokens = usageForCost.input_tokens;
payload.outputTokens = usageForCost.output_tokens;
payload.cacheCreationInputTokens = usageForCost.cache_creation_input_tokens;
payload.cacheReadInputTokens = usageForCost.cache_read_input_tokens;
payload.costUsd = costUsdStr;
}
if (usageForCost && session.request.model) {
(Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1151-1166)

Comment on lines +226 to +238
const errorValue = evt.data.error;
const messageValue = evt.data.message;
if (!hasNonEmptyValue(errorValue) && typeof messageValue === "string") {
if (messageValue.length >= merged.maxJsonCharsForMessageCheck) {
chars = merged.maxJsonCharsForMessageCheck; // at or above the threshold, skip the message keyword check
} else {
try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only small JSON gets message detection" intent)
chars = messageValue.length;
}
}

Performance optimization could skip stringify in more cases

when errorValue is non-empty (line 226), you know the result will be error regardless of JSON size, so the stringify at line 233 is unnecessary. only compute chars when both conditions are true: !hasNonEmptyValue(errorValue) AND typeof messageValue === "string"

Suggested change
const errorValue = evt.data.error;
const messageValue = evt.data.message;
if (!hasNonEmptyValue(errorValue) && typeof messageValue === "string") {
if (messageValue.length >= merged.maxJsonCharsForMessageCheck) {
chars = merged.maxJsonCharsForMessageCheck; // at or above the threshold, skip the message keyword check
} else {
try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only small JSON gets message detection" intent)
chars = messageValue.length;
}
}
let chars = 0;
const errorValue = evt.data.error;
const messageValue = evt.data.message;
// Only compute JSON size when we need message keyword check
if (!hasNonEmptyValue(errorValue) && typeof messageValue === "string") {
(Path: src/lib/utils/upstream-error-detection.ts, lines 226-238)

- streamEndedNormally=false and not a client abort: record provider/endpoint failure
- add comments clarifying that STORE_SESSION_RESPONSE_BODY only affects persistence
- add redaction/truncation unit tests to upstream-error-detection
@tesgth032
Contributor Author

Follow-up on the AI reviews (touching only logic added/changed in this PR):

  • On abnormal end (streamEndedNormally=false): if it is not a client abort, a provider/endpoint circuit-breaker failure is now also recorded (consistent with the timeout path), so aborted streams count toward the breaker and unstable providers are not re-selected.
  • Added clarifying comments: STORE_SESSION_RESPONSE_BODY only controls whether the body is written to Redis; stream content is still accumulated in memory for usage/cost stats and the post-SSE fake-200 check (in line with this PR's goal).
  • upstream-error-detection gains redaction/truncation unit-test coverage (JWT / email / generic sensitive key-value / config paths, etc.).

Not adopted: adding a provider-chain reason="request_failed" (it would ripple through several types and UI strings; the existing attemptNumber/isFirstAttempt fields already distinguish the semantics — if the maintainer considers it necessary, it can be a separate PR).

Verified locally: npm test / npm run typecheck pass.
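The status mapping accumulated over these iterations (normal end keeps the upstream status unless fake-200 detection fires; abnormal end maps to 499 on client abort, 502 otherwise) can be summarized in a sketch. Names here are illustrative, not the repository's:

```typescript
// Illustrative summary of the internal effective-status decision described in
// the PR comments. Only stats/circuit-breaking see this value; the client has
// already received the upstream HTTP 200 headers by this point.
function resolveEffectiveStatusSketch(
  upstreamStatus: number,
  streamEndedNormally: boolean,
  clientAborted: boolean,
  fake200Detected: boolean
): number {
  if (!streamEndedNormally) {
    return clientAborted ? 499 : 502; // interrupted streams never settle as success
  }
  return fake200Detected ? 502 : upstreamStatus;
}
```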

@greptile-apps greptile-apps bot left a comment

8 files reviewed, 1 comment


Comment on lines 1212 to 1227
@@ -887,13 +1220,15 @@ export class ProxyResponseHandler {
cacheCreation1hInputTokens: usageForCost?.cache_creation_1h_input_tokens,
cacheTtlApplied: usageForCost?.cache_ttl ?? null,
providerChain: session.getProviderChain(),
...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
model: session.getCurrentModel() ?? undefined, // ⭐ update the post-redirect model
providerId: session.provider?.id, // ⭐ update the final provider ID (after retry switching)
context1mApplied: session.getContext1mApplied(),
});

Provider attribution mismatch

In deferred streaming finalization, circuit-breaker + session.addProviderToChain(...) are recorded against meta.providerId, but request details are persisted with providerId: session.provider?.id. For the fake-200 path you intentionally skip session binding updates, so session.provider can remain pointing at a different provider, making providerId inconsistent with the providerChain/stats for the same request. Consider writing providerId from the deferred meta (or the resolved providerForChain.id) here so DB/UI attribution matches the finalization target.

(Path: src/app/v1/_lib/proxy/response-handler.ts, lines 1212-1227)
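The fix suggested above amounts to preferring the deferred meta when persisting; a hypothetical sketch of that preference (field names mirror the review comment, not the repository code):

```typescript
// Persistence should use the provider id captured in the deferred meta (the
// provider that actually served the stream), falling back to the session's
// current provider only when no meta exists.
interface DeferredMetaSketch {
  providerId: number;
}

function providerIdForPersistence(
  meta: DeferredMetaSketch | undefined,
  sessionProviderId: number | undefined
): number | undefined {
  return meta?.providerId ?? sessionProviderId;
}
```

This keeps DB/UI attribution consistent with the providerChain and circuit-breaker records even when session binding was deliberately skipped on the fake-200 path.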

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/v1/_lib/proxy/response-handler.ts (1)

1374-1393: ⚠️ Potential issue | 🟠 Major

The unexpected-upstream-abort path lacks circuit-breaker recording and the deferred-finalization call

In the catch branch, when an unexpected upstream abort is determined (isClientAbortError && !clientAborted && !isResponseTimeout && !isIdleTimeout), only persistRequestFailure is called, but:

  1. finalizeStream (or finalizeDeferredStreamingFinalizationIfNeeded) is never called, so the deferred-finalization metadata is not consumed and the circuit breaker is not updated.
  2. recordFailure is not called directly either, asymmetric with the response-timeout (line 1314) and idle-timeout (line 1350) paths.

By contrast, the client-initiated-abort path (lines 1408-1410) correctly calls finalizeStream(allContent, false, true).

Consider aligning with the client-abort path by adding a finalizeStream call here, or adding a direct recordFailure.

Suggested fix
           } else if (!clientAborted) {
// Upstream aborted unexpectedly mid-stream: treat as a provider/network error
             logger.error("ResponseHandler: Upstream stream aborted unexpectedly", {
               taskId,
               providerId: provider.id,
               providerName: provider.name,
               messageId: messageContext.id,
               chunksCollected: chunks.length,
               errorName: err.name,
               errorMessage: err.message || "(empty message)",
             });
 
+            try {
+              const allContent = flushAndJoin();
+              await finalizeStream(allContent, false, false);
+            } catch (finalizeError) {
+              logger.error("ResponseHandler: Failed to finalize upstream-aborted stream", {
+                taskId,
+                messageId: messageContext.id,
+                finalizeError,
+              });
+            }
+
             await persistRequestFailure({
               session,
               messageContext,
               statusCode: 502,
               error: err,
               taskId,
               phase: "stream",
             });
🧹 Nitpick comments (1)
src/app/v1/_lib/proxy/response-handler.ts (1)

103-105: skipping error detection when !streamEndedNormally and returning {isError: false} — semantically confusing

When the stream does not end normally, detected is set to {isError: false}, yet the subsequent !streamEndedNormally branch (line 172) still treats it as a failure. There is no bug, but isError: false paired with a path that is actually handled as a failure is easy to misread. Consider a one-line comment clarifying that isError: false here means "skip fake-200 detection; the later branch handles it as an interruption".

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/v1/_lib/proxy/response-handler.ts (1)

1288-1395: ⚠️ Potential issue | 🔴 Critical

Three exception paths (response timeout / idle timeout / upstream abort) never consume the deferred meta, losing provider chain records.

In the catch branch, the response timeout (line 1338), idle timeout (line 1375), and unexpected upstream abort (line 1391) paths only call persistRequestFailure() and never call finalizeStream(). Since finalizeDeferredStreamingFinalizationIfNeeded() is invoked only inside finalizeStream() (line 1100), these paths cannot consume the deferred meta set by the forwarder, so the provider decision chain is never recorded via addProviderToChain().

Consider calling finalizeStream() on these three paths (or calling finalizeDeferredStreamingFinalizationIfNeeded() directly, handling allContent appropriately) so abnormal streaming responses also record complete provider chain information.

🧹 Nitpick comments (3)
src/lib/utils/upstream-error-detection.test.ts (2)

2-2: use the @/ path alias to reference the sibling module.

The coding guidelines require the @/ alias for files under ./src/.

Suggested change
-import { detectUpstreamErrorFromSseOrJsonText } from "./upstream-error-detection";
+import { detectUpstreamErrorFromSseOrJsonText } from "@/lib/utils/upstream-error-detection";

As per coding guidelines, **/*.{ts,tsx,js,jsx}: "Use path alias @/ to reference files in ./src/ directory".


149-188: reasonable coverage of the SSE scenarios.

Key paths are all tested: error field, error object, message keyword, message-as-object not misdetected, no error/message key, and [DONE].

Two edge cases could further raise coverage later:

  1. Whitespace-only input (e.g. " ") — verifies the empty-body logic after trim().
  2. Custom options (e.g. a custom messageKeyword or maxJsonCharsForMessageCheck) — verifies configuration overrides take effect.
src/app/v1/_lib/proxy/response-handler.ts (1)

93-99: redundant function name: finalizeDeferredStreamingFinalizationIfNeeded

"Finalize...Finalization" is semantically repetitive; consider simplifying to finalizeDeferredStreamingIfNeeded or handleDeferredStreamingFinalization.

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment


Comment on lines +226 to +234
if (!hasNonEmptyValue(errorValue) && typeof messageValue === "string") {
if (messageValue.length >= merged.maxJsonCharsForMessageCheck) {
chars = merged.maxJsonCharsForMessageCheck; // at or above the threshold, skip the message keyword check
} else {
try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only small JSON gets message detection" intent)
chars = messageValue.length;

could optimize by skipping stringify when error already detected

when errorValue is non-empty (line 84-147), result is error regardless of JSON size, so computing chars via JSON.stringify on line 233 is unnecessary work. only compute when both conditions need checking: !hasNonEmptyValue(errorValue) AND message keyword detection is needed

(Path: src/lib/utils/upstream-error-detection.ts, lines 226-234)

- Timeout/abort/exception paths now also call finalizeStream, completing providerChain records and circuit-breaker attribution
- Persisted providerId now prefers the deferred meta, avoiding misattribution for fake-200/aborted streams
- upstream-error-detection: add test cases for options and whitespace-only input
@tesgth032
Copy link
Contributor Author

Addressed the latest CodeRabbit/Greptile suggestions (scoped to logic added or changed in this PR):

  • Streaming error paths (response timeout / idle timeout / upstream abort / other processing exceptions) now also call finalizeStream(...), ensuring the deferred meta is consumed, providerChain records are completed, and circuit breaking/failover counts the request as a failure.
  • DB persistence of providerId now prefers the deferred meta's providerId (so that fake-200/aborted streams cannot leave providerId inconsistent with providerChain/circuit-breaker attribution); finalizeRequestStats also gained an optional providerId override.
  • upstream-error-detection unit tests now cover whitespace-only input, an options.messageKeyword override, and an options.maxJsonCharsForMessageCheck override, and import via the @/ alias.

Local verification:
npm run typecheck and
npm test both pass.

Note: the latest two GitHub Actions runs on the upstream repo show action_required (0s / no jobs); a maintainer needs to approve the workflow runs in the UI before CI checks are produced.


@greptile-apps greptile-apps bot left a comment


5 files reviewed, 4 comments


try {
chars = JSON.stringify(evt.data).length;
} catch {
// fall back to an approximation when stringify fails (keeping the "only run message detection on small JSON" intent)

unnecessary work when error already detected

JSON.stringify is expensive - if hasNonEmptyValue(errorValue) is true (lines 128-147), the function will return error regardless of JSON size, so computing chars here serves no purpose

skip stringify when error detected:

Suggested change
// fall back to an approximation when stringify fails (keeping the "only run message detection on small JSON" intent)
const messageValue = evt.data.message;
if (hasNonEmptyValue(errorValue)) {
chars = 0; // error already detected, skip message check
} else if (typeof messageValue === "string") {
Path: src/lib/utils/upstream-error-detection.ts, line 233

Comment on lines 207 to 211
// Case 2: SSE text. Quick filter: skip parsing when neither an "error" nor a "message" key is present
// Note: we match the quoted-key form here to avoid needless parsing when "error" merely appears in model output.
// Trade-off: if the upstream returns non-standard JSON keys (very rare) this may miss a detection; we lean conservative with a low false-positive rate.
//
// Additional note: we deliberately match only `"error"` / `"message"` (including the double quotes),

quick filter can cause false positives

the check text.includes('"error"') will match legitimate assistant responses like:

data: {"content": "to handle \"error\" cases, use try-catch"}

this triggers unnecessary SSE parsing for most model responses that discuss error handling

tradeoff noted in comments, but consider tracking false positive rate in production
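A small illustration of that trade-off, using invented SSE events (only the filter condition mirrors the comment above; the event payloads are made up for the demo):

```typescript
// Quick filter as described above: only parse when a quoted key-like token appears.
function quickFilterHits(text: string): boolean {
  return text.includes('"error"') || text.includes('"message"');
}

// Genuine fake-200 body: the filter fires, and the full parse then confirms the error.
const fake200 = 'data: {"error":{"type":"authentication_error","message":"invalid key"}}';

// Benign chunk that still carries a literal "error" key (e.g. error: null in a
// normal streaming chunk): the filter also fires and an unnecessary parse runs,
// though the parse should still classify it as non-error.
const benign = 'data: {"choices":[{"delta":{"content":"hi"}}],"error":null}';

console.log(quickFilterHits(fake200)); // true
console.log(quickFilterHits(benign));  // true, a false positive for the quick filter
```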

Path: src/lib/utils/upstream-error-detection.ts, lines 207-211

Comment on lines +148 to +178
let providerForChain = provider;
if (provider.id !== meta.providerId) {
logger.warn("[ResponseHandler] Deferred streaming meta provider mismatch", {
sessionId: session.sessionId ?? null,
metaProviderId: meta.providerId,
currentProviderId: provider.id,
canonicalProviderId: meta.providerId,
});

// Try to recover the correct Provider object via meta.providerId so the providerChain audit data stays consistent
try {
const providers = await session.getProvidersSnapshot();
const resolved = providers.find((p) => p.id === meta.providerId);
if (resolved) {
providerForChain = resolved;
} else {
logger.warn("[ResponseHandler] Deferred streaming meta provider not found in snapshot", {
sessionId: session.sessionId ?? null,
metaProviderId: meta.providerId,
currentProviderId: provider.id,
});
}
} catch (resolveError) {
logger.warn("[ResponseHandler] Failed to resolve meta provider from snapshot", {
sessionId: session.sessionId ?? null,
metaProviderId: meta.providerId,
currentProviderId: provider.id,
error: resolveError,
});
}
}

verify provider resolution doesn't fail silently

when session.provider.id !== meta.providerId, the code attempts to resolve the correct provider from snapshot. if resolution fails (provider not in snapshot or error thrown), providerForChain remains as the mismatched provider, causing inconsistent attribution

verify circuit breaker / session updates use meta.providerId directly (not providerForChain.id) to ensure stats accuracy
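One way to keep the attribution id stable regardless of object resolution, sketched with simplified types (the Provider shape, the snapshot lookup, and all names here are illustrative, not the real code):

```typescript
interface Provider { id: string; name: string }

// Resolve the provider object for audit data, but always attribute stats to
// metaProviderId so circuit-breaker / session updates cannot drift even when
// object resolution fails.
function resolveForFinalization(
  current: Provider,
  metaProviderId: string,
  snapshot: Provider[],
): { providerForChain: Provider; providerIdForStats: string } {
  let providerForChain = current;
  if (current.id !== metaProviderId) {
    const resolved = snapshot.find((p) => p.id === metaProviderId);
    if (resolved) providerForChain = resolved;
    // else: keep `current` for chain display only; stats still use metaProviderId
  }
  return { providerForChain, providerIdForStats: metaProviderId };
}

const a = { id: "prov-a", name: "A" };
const b = { id: "prov-b", name: "B" };

// Mismatch with successful resolution: chain and stats both point at prov-b.
console.log(resolveForFinalization(a, "prov-b", [a, b]));

// Mismatch with failed resolution: the chain falls back to prov-a,
// but stats attribution still uses prov-b.
console.log(resolveForFinalization(a, "prov-b", [a]));
```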

Path: src/app/v1/_lib/proxy/response-handler.ts, lines 148-178

providerId: session.provider?.id, // ⭐ update the final provider ID (after retry switching)
...(streamErrorMessage ? { errorMessage: streamErrorMessage } : {}),
model: session.getCurrentModel() ?? undefined, // update the model after redirection
providerId: providerIdForPersistence ?? session.provider?.id, // update the final provider ID (after retry switching)

DB attribution should match circuit breaker

for fake-200 path, circuit breaker uses meta.providerId (line 240), but here providerId falls back to session.provider?.id which could differ

use providerIdForPersistence from finalized result:

Suggested change
providerId: providerIdForPersistence ?? session.provider?.id, // update the final provider ID (after retry switching)
providerId: providerIdForPersistence ?? session.provider?.id, // update the final provider ID (after retry switching)
Path: src/app/v1/_lib/proxy/response-handler.ts, line 1244

@greptile-apps

greptile-apps bot commented Feb 8, 2026

Greptile encountered an error while reviewing this PR. Please reach out to support@greptile.com for assistance.

@tesgth032
Contributor Author

An update: I have verified locally that
npm run typecheck and
npm test both pass.

The GitHub Actions runs "PR Build Check" (run 21794179136) and "🧪 Test Suite" (run 21794179129) currently show action_required with no jobs; fork PR workflows need manual approval in the UI before they actually execute. Could a maintainer approve the runs on the Checks/Actions page?


Labels

area:core area:provider bug Something isn't working size/M Medium PR (< 500 lines)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants