feat(circuit): unify provider-endpoint circuit visibility and notifications#755
feat(circuit): unify provider-endpoint circuit visibility and notifications#755
Conversation
Add IncidentSource type and resolveEndpointDisplayStatus() with priority: circuit-open > circuit-half-open > enabled > disabled. 25 tests covering all status combinations.
Add batchGetEndpointCircuitInfo() action for bulk endpoint circuit state reads, avoiding N+1 queries in table views.
- ProviderEndpointsTable: display circuit state badge + reset button - ProviderManager: endpoint circuits included in filter/counters - ProviderRichListItem: layered badges (key vs endpoint) - Notification: unified provider+endpoint alerts under one toggle - i18n: 5 languages (en, zh-CN, zh-TW, ja, ru) Tests: 28 passed
- Add strictBlockCause and endpointFilterStats to chain metadata - New getEndpointFilterStats function for endpoint pool statistics - Record no_endpoint_candidates with full stats when strict policy blocks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthrough新增端点级熔断可观测性:扩展国际化文案、在 UI 中展示与重置端点熔断状态、批量查询端点熔断信息、在代理决策链记录端点池耗尽与统计,并新增告警/模板与多处类型与测试覆盖。 Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @ding113, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the circuit breaker functionality by unifying the visibility, management, and notification systems for both provider-level and individual endpoint-level circuit breakers. It introduces new UI elements for monitoring and resetting endpoint circuits, improves filtering capabilities, and provides more granular detail in the audit trail when endpoint pools are exhausted. The changes aim to provide administrators with a clearer understanding and better control over the health and availability of their configured providers and endpoints. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🧪 测试结果
总体结果: ✅ 所有测试通过 |
There was a problem hiding this comment.
Code Review
This pull request is a significant and well-executed feature enhancement that unifies circuit breaker visibility and notifications for both providers and endpoints. The changes are comprehensive, covering backend logic, UI components, internationalization, and extensive testing. The introduction of IncidentSource, batch endpoint queries, and detailed provider chain logging for endpoint exhaustion are excellent improvements for maintainability and observability. My review focuses on minor improvements related to internationalization, notification enrichment, and UI responsiveness.
| return { | ||
| ok: false, | ||
| error: "无权限执行此操作", | ||
| errorCode: ERROR_CODES.PERMISSION_DENIED, | ||
| }; |
There was a problem hiding this comment.
Hardcoded Chinese error messages should be avoided for better internationalization and maintainability. Please use English messages or error codes that can be translated on the client side.
| return { | |
| ok: false, | |
| error: "无权限执行此操作", | |
| errorCode: ERROR_CODES.PERMISSION_DENIED, | |
| }; | |
| return { | |
| ok: false, | |
| error: "Permission denied", | |
| errorCode: ERROR_CODES.PERMISSION_DENIED, | |
| }; |
| return { ok: true, data: results }; | ||
| } catch (error) { | ||
| logger.error("batchGetEndpointCircuitInfo:error", error); | ||
| const message = error instanceof Error ? error.message : "批量获取端点熔断状态失败"; |
There was a problem hiding this comment.
This hardcoded Chinese error message should be in English for consistency and maintainability, allowing for proper internationalization on the client side.
| const message = error instanceof Error ? error.message : "批量获取端点熔断状态失败"; | |
| const message = error instanceof Error ? error.message : "Failed to batch get endpoint circuit info"; |
| return map; | ||
| }, | ||
| enabled: endpointIds.length > 0, | ||
| staleTime: 15_000, |
There was a problem hiding this comment.
src/lib/endpoint-circuit-breaker.ts
Outdated
| // Try to enrich with endpoint URL from database | ||
| let endpointUrl: string | undefined; | ||
| try { | ||
| const { findProviderEndpointById } = await import("@/repository"); | ||
| const endpoint = await findProviderEndpointById(endpointId); | ||
| if (endpoint) { | ||
| endpointUrl = endpoint.url; | ||
| } | ||
| } catch { | ||
| // DB lookup failure should not block alert | ||
| } | ||
|
|
||
| await sendCircuitBreakerAlert({ | ||
| providerId: endpointId, // Use endpointId as providerId for dedup purposes | ||
| providerName: "", | ||
| failureCount, | ||
| retryAt, | ||
| lastError, | ||
| incidentSource: "endpoint", | ||
| endpointId, | ||
| endpointUrl, | ||
| }); |
There was a problem hiding this comment.
The providerName is currently hardcoded to an empty string when triggering an endpoint circuit breaker alert. This results in a less informative notification message (e.g., '供应商 的端点...'). To improve this, you can fetch the vendor's display name using the vendorId from the endpoint details. This will provide more context in the alert.
// Try to enrich with endpoint URL and vendor name from database
let endpointUrl: string | undefined;
let providerName = "";
try {
const { findProviderEndpointById, findProviderVendorById } = await import("@/repository");
const endpoint = await findProviderEndpointById(endpointId);
if (endpoint) {
endpointUrl = endpoint.url;
const vendor = await findProviderVendorById(endpoint.vendorId);
if (vendor) {
providerName = vendor.displayName || vendor.websiteDomain || `Vendor #${vendor.id}`;
}
}
} catch (dbError) {
// DB lookup failure should not block alert
logger.warn("[triggerEndpointCircuitBreakerAlert] DB lookup failed", {
endpointId,
error: dbError instanceof Error ? dbError.message : String(dbError),
});
}
await sendCircuitBreakerAlert({
providerId: endpointId, // Use endpointId as providerId for dedup purposes
providerName,
failureCount,
retryAt,
lastError,
incidentSource: "endpoint",
endpointId,
endpointUrl,
});There was a problem hiding this comment.
Actionable comments posted: 13
🤖 Fix all issues with AI agents
In `@messages/en/settings/providers/filter.json`:
- Around line 3-4: The English translations for the keys "keyCircuitBroken" and
"endpointCircuitBroken" are missing the "Broken" suffix; update their string
values to match other locales (e.g., change "Key Circuit" to "Key Circuit
Broken" and "Endpoint Circuit" to "Endpoint Circuit Broken" or use the preferred
phrasing like "Key Circuit Open" / "Endpoint Circuit Open") so the English
entries align semantically with other languages.
In `@messages/ru/provider-chain.json`:
- Line 41: The new translation uses the transliterated term "эндпоинт" for keys
like "endpointPoolExhausted" (and the other new keys around lines 67-68 and
193-200), but the file already uses the formal Russian term "Конечная точка" in
the existing key "details.endpoint"; update the new keys to use the same formal
term ("Конечная точка") so terminology is consistent across the file, and run a
quick grep/scan for any other occurrences of "эндпоинт" to replace them with
"Конечная точка".
In `@messages/ru/settings/providers/filter.json`:
- Around line 3-4: The Russian translation currently leaves "endpoint"
untranslated for the "endpointCircuitBroken" key; update the value for
"endpointCircuitBroken" to a proper Russian term (either the transliteration
"эндпоинт" or the translation "конечная точка") to match how other languages
translate "endpoint" and ensure consistency with the "keyCircuitBroken" entry
and other locales.
In `@src/actions/provider-endpoints.ts`:
- Around line 126-128: The BatchGetEndpointCircuitInfoSchema currently allows an
unbounded endpointIds array (z.array(EndpointIdSchema).min(0)), which can lead
to huge concurrent calls to getEndpointHealthInfo via Promise.all; update
BatchGetEndpointCircuitInfoSchema to add a sensible upper bound (e.g., .max(200)
or .max(500)) on endpointIds to cap batch size and prevent mass concurrency, and
ensure any callers or docs reflect the new limit so callers split requests when
needed.
In `@src/app/`[locale]/settings/providers/_components/provider-vendor-view.tsx:
- Around line 37-38: The prop endpointCircuitInfo declared on
ProviderVendorViewProps is unused inside the ProviderVendorView component;
either remove it from the interface or thread it through to the components that
need it (e.g., pass endpointCircuitInfo from ProviderVendorView into VendorCard
and/or into VendorKeysCompactList and ProviderEndpointsSection), and update
those child component prop types to accept EndpointCircuitInfoMap where required
so the value is consumed rather than left unused.
In `@src/app/v1/_lib/proxy/forwarder.ts`:
- Around line 498-528: The code unsafely casts exhaustionContext to
ProviderChainItem["decisionContext"] before calling session.addProviderToChain;
remove the type assertion and either (A) omit decisionContext entirely and pass
strictBlockCause and selectorError via their existing fields (keep
endpointFilterStats as-is), or (B) construct a complete decisionContext object
with the required properties (e.g., totalProviders, enabledProviders,
targetType, etc.) before passing it to session.addProviderToChain so no required
fields are undefined; update the session.addProviderToChain call to use the safe
option and drop the "as ProviderChainItem['decisionContext']" cast on
exhaustionContext.
In `@src/lib/endpoint-circuit-breaker.ts`:
- Around line 250-259: The call to sendCircuitBreakerAlert is populating
providerId with endpointId and leaving providerName empty, which is misleading
to webhook consumers; update the payload in the sendCircuitBreakerAlert
invocation so providerId is an explicit placeholder (e.g., -1 or 0) instead of
endpointId, set providerName to a clear readable placeholder (e.g., "Endpoint"
or `endpoint:${endpointId}`), and ensure the payload includes endpointId and
endpointUrl prominently so templates can display those instead of provider
fields (change in the invocation around sendCircuitBreakerAlert, adjust
providerId and providerName there).
In `@src/lib/notification/notifier.ts`:
- Around line 27-28: The deduplication suffix construction uses
`data.endpointId` when `source === "endpoint"`, which can produce
"endpoint:undefined" because `endpointId` is optional in
`CircuitBreakerAlertData`; update the logic in notifier.ts where `source` and
`dedupSuffix` are computed to defensively handle a missing endpointId: if
`source === "endpoint"` and `data.endpointId` is falsy/undefined, either
return/emit a warning or fall back to a stable alternative suffix (e.g.,
"endpoint:unknown" or include only "endpoint") so alerts for different endpoints
are not collapsed; change the expression that sets `dedupSuffix` (and any
related downstream key generation) to check `data.endpointId` explicitly before
concatenation.
In `@src/lib/webhook/templates/circuit-breaker.ts`:
- Around line 29-32: The description template can emit "ID: undefined" because
endpointId is optional; update the logic that builds description (the const
description using isEndpoint and data.endpointId) to only include the "(ID:
...)" segment when data.endpointId is defined (e.g., use a conditional/ternary
to append ` (ID: ${data.endpointId})` only if data.endpointId !== undefined,
otherwise omit or use a clear placeholder), ensuring you keep the existing
isEndpoint check and the providerName/providerId text intact.
In `@tests/unit/lib/endpoint-circuit-breaker.test.ts`:
- Around line 158-199: The second test for triggerEndpointCircuitBreakerAlert is
missing a vi.resetModules() call before setting up vi.doMock, which can cause
the dynamic import of "@/lib/endpoint-circuit-breaker" to return a cached module
and ignore the new findProviderEndpointById mock; add vi.resetModules()
immediately before vi.doMock(...) in that test so the subsequent await
import("@/lib/endpoint-circuit-breaker") loads a fresh module that uses the
mocked findProviderEndpointById and sendCircuitBreakerAlert.
In `@tests/unit/settings/providers/endpoint-status.test.ts`:
- Around line 201-210: Rename the test case to accurately reflect the actual
expected behavior: update the test title for the spec that calls
createEndpoint(...) and resolveEndpointDisplayStatus(endpoint, "closed") so it
states that when isEnabled is null the endpoint is treated as enabled (e.g.,
"should return enabled when circuit is closed and isEnabled is null"); no
changes to resolveEndpointDisplayStatus or createEndpoint are required—only the
test description string should be corrected to match the asserted expectation of
status: "enabled", source: "endpoint", priority: 2.
In `@tests/unit/settings/providers/provider-manager.test.tsx`:
- Around line 328-340: The test silently skips assertions when toggle is
missing; change it to first assert the toggle exists (e.g.,
expect(toggle).not.toBeNull() / toBeTruthy()) immediately after querying
"#circuit-broken-filter" before calling act, then proceed to dispatch the click
on the toggle and assert the provider list
(querySelectorAll("[data-testid^='provider-']") and providerNames expectations).
Ensure you reference the same variable names (toggle, container) and keep the
act wrapper around the click.
In `@tests/unit/settings/providers/provider-vendor-view-circuit-ui.test.tsx`:
- Around line 520-650: Tests for resetEndpointCircuit currently call
providerEndpointsActionMocks.resetEndpointCircuit directly instead of exercising
the UI; update each of the three tests that reference
providerEndpointsActionMocks.resetEndpointCircuit so they locate the reset
button rendered by ProviderVendorView (e.g., find the endpoint row for
endpointId 1 and querySelector by the reset button's aria-label or CSS
selector), simulate a user click on that button, await flushTicks to let the
component call the action, then assert
providerEndpointsActionMocks.resetEndpointCircuit was called with
expect.objectContaining({ endpointId: 1 }) and verify the resulting UI feedback
(success or error) is displayed; use the existing ProviderVendorView render and
flushTicks utilities and the providerEndpointsActionMocks.resetEndpointCircuit
mock to drive expected responses.
🧹 Nitpick comments (12)
src/types/message.ts (1)
58-65: 将endpointFilterStats改用已定义的EndpointFilterStats类型。
EndpointFilterStats接口已在src/lib/provider-endpoints/endpoint-selector.ts中定义并导出(lines 56-61),结构与此处的内联类型完全相同。建议导入该类型以避免重复定义。建议修改
import type { EndpointFilterStats } from "@/lib/provider-endpoints/endpoint-selector"; // 在 message.ts 中,将 endpointFilterStats?: { total: number; enabled: number; circuitOpen: number; available: number; }; // 改为 endpointFilterStats?: EndpointFilterStats;tests/unit/webhook/notifier-circuit-breaker.test.ts (1)
50-180: 测试覆盖全面,去重 key 的各场景验证清晰。建议补充一个
incidentSource: "endpoint"但endpointId为undefined的用例,以验证边界行为(与notifier.ts中可能产生"endpoint:undefined"的问题对应)。src/lib/endpoint-circuit-breaker.ts (1)
213-223:EndpointCircuitAlertData接口与实际发送的数据不一致此接口定义的字段(
endpointId、failureCount、retryAt、lastError、endpointUrl)无法准确反映triggerEndpointCircuitBreakerAlert实际发送的数据结构。函数实际发送的是CircuitBreakerAlertData类型,包含额外的providerId、providerName和incidentSource字段。此外,EndpointCircuitAlertData被导出但在整个代码库中未被使用。建议移除此未使用的导出接口,或如果有外部消费需求,应更新其定义以匹配实际发送的数据结构。
src/lib/webhook/templates/circuit-breaker.ts (1)
19-27: 新增的硬编码中文字符串与已有代码风格一致,但不符合 i18n 规范。"端点ID"、"端点地址" 等新增字符串与文件中已有的 "失败次数"、"预计恢复" 等保持了一致的风格。不过按照项目编码规范,用户可见的文本应使用 i18n。如果当前 webhook 模板有计划后续迁移 i18n,可暂时保持现状;否则建议在适当时机统一处理。
src/lib/webhook/templates/placeholders.ts (1)
23-61: 所有模板占位符的label和description均为硬编码中文字符串。这是现有的模式,非本 PR 引入。但按照编码规范要求所有用户可见字符串使用 i18n(支持 5 种语言),如果这些占位符元数据会在 UI 中展示给用户,建议后续迭代中将其国际化。
src/app/[locale]/settings/providers/_components/endpoint-status.ts (1)
22-36: 新增类型定义语义清晰。
IncidentSource和EndpointDisplayStatus类型定义合理。不过EndpointDisplayStatus.status声明为string,而实际返回值只有"circuit-open" | "circuit-half-open" | "enabled" | "disabled"四种。建议使用联合类型以获得更好的类型安全性。建议收窄 status 类型
+export type EndpointDisplayStatusToken = + | "circuit-open" + | "circuit-half-open" + | "enabled" + | "disabled"; + export interface EndpointDisplayStatus { - status: string; + status: EndpointDisplayStatusToken; source: IncidentSource; priority: number; }tests/unit/settings/providers/provider-vendor-view-circuit-ui.test.tsx (1)
685-688: 通过 Tailwind CSS 类名断言 UI 状态较为脆弱。
[class*="text-rose-500"]和[class*="text-amber-500"]依赖具体的样式类名,一旦颜色调整或类名变更测试就会失败。考虑使用data-testid属性或检查文本内容作为更稳定的断言方式。src/app/[locale]/settings/providers/_components/provider-rich-list-item.tsx (1)
511-527: 移动端和桌面端的端点熔断徽章渲染逻辑重复。Lines 518-527 和 Lines 717-726 的 JSX 结构几乎完全相同(Badge 组件、className、图标和文本)。虽然这是响应式布局中移动/桌面分离 JSX 的常见模式,可以考虑提取为一个小组件以减少重复。
tests/unit/settings/providers/provider-manager.test.tsx (2)
389-440: 测试名称 "passes endpointCircuitInfo to ProviderList" 具有误导性当前 mock 的
ProviderList(第 28-38 行)没有使用或检查endpointCircuitInfo属性。此测试实际验证的是circuitBrokenCount显示为(3),而非 prop 是否传递到了ProviderList。建议修改测试名称使其更准确,或在 mock 中捕获endpointCircuitInfoprop 并进行断言。
162-186: 断言container.textContent包含"(1)"较为脆弱
"(1)"可能在其他 UI 元素中意外匹配。不过考虑到组件已被大量 mock,当前上下文中出现误匹配的风险较低,可以接受。tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts (1)
622-678: 建议将两个场景拆分为独立测试用例当前测试在一个
test中包含两个完整的 session 场景(selector_error和no_endpoint_candidates)。如果第一个场景失败,第二个场景将被跳过,降低了诊断效率。不过功能上并无问题。src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx (1)
133-149: 查询键结构可优化:避免端点顺序变化导致的缓存碎片化当前
queryKey: ["endpoint-circuit-info", ...endpointIds]将所有端点 ID 展开至数组中。当端点顺序变化时(即使 ID 集合相同),会创建不同的缓存条目。建议使用排序后的 ID 数组作为单一元素,确保相同端点集合始终使用同一缓存:建议优化
- const endpointIds = useMemo(() => endpoints.map((ep) => ep.id), [endpoints]); + const endpointIds = useMemo( + () => endpoints.map((ep) => ep.id).sort((a, b) => a - b), + [endpoints], + ); const { data: circuitInfoMap = {} } = useQuery({ - queryKey: ["endpoint-circuit-info", ...endpointIds], + queryKey: ["endpoint-circuit-info", endpointIds],注:第 290 行的
invalidateQueries({ queryKey: ["endpoint-circuit-info"] })使用前缀匹配,无论采用哪种方式都能正确失效该查询。
messages/ru/provider-chain.json
Outdated
| "http2Fallback": "Откат HTTP/2", | ||
| "clientError": "Ошибка клиента" | ||
| "clientError": "Ошибка клиента", | ||
| "endpointPoolExhausted": "Пул эндпоинтов исчерпан" |
There was a problem hiding this comment.
术语不一致:「эндпоинт」vs「Конечная точка」
新增的所有翻译键均使用音译词 «эндпоинт»(如 Line 41、67-68、193-200),但同一文件中 Line 73 已有的 details.endpoint 键使用的是正式俄语翻译 «Конечная точка»。建议在整个文件中统一术语,避免用户在同一界面看到两种不同的表述。
Also applies to: 67-68, 193-200
🤖 Prompt for AI Agents
In `@messages/ru/provider-chain.json` at line 41, The new translation uses the
transliterated term "эндпоинт" for keys like "endpointPoolExhausted" (and the
other new keys around lines 67-68 and 193-200), but the file already uses the
formal Russian term "Конечная точка" in the existing key "details.endpoint";
update the new keys to use the same formal term ("Конечная точка") so
terminology is consistent across the file, and run a quick grep/scan for any
other occurrences of "эндпоинт" to replace them with "Конечная точка".
src/app/[locale]/settings/providers/_components/provider-vendor-view.tsx
Outdated
Show resolved
Hide resolved
| test("reset endpoint circuit action is available when circuit is open", async () => { | ||
| // Mock circuit state as open | ||
| providerEndpointsActionMocks.batchGetEndpointCircuitInfo.mockResolvedValue({ | ||
| ok: true, | ||
| data: [ | ||
| { | ||
| endpointId: 1, | ||
| circuitState: "open", | ||
| failureCount: 5, | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| }, | ||
| ], | ||
| }); | ||
| providerEndpointsActionMocks.resetEndpointCircuit.mockResolvedValue({ ok: true }); | ||
|
|
||
| const { unmount } = renderWithProviders( | ||
| <ProviderVendorView | ||
| providers={[makeProviderDisplay()]} | ||
| currentUser={ADMIN_USER} | ||
| enableMultiProviderTypes={true} | ||
| healthStatus={{}} | ||
| statistics={{}} | ||
| statisticsLoading={false} | ||
| currencyCode="USD" | ||
| /> | ||
| ); | ||
|
|
||
| await flushTicks(8); | ||
|
|
||
| // Verify the circuit state query was called with correct endpoint | ||
| expect(providerEndpointsActionMocks.batchGetEndpointCircuitInfo).toHaveBeenCalled(); | ||
|
|
||
| // Verify circuit badge is shown | ||
| expect(document.body.textContent || "").toContain("Circuit Open"); | ||
|
|
||
| // Call the reset action directly and verify it succeeds | ||
| const resetResult = await providerEndpointsActionMocks.resetEndpointCircuit({ endpointId: 1 }); | ||
| expect(resetResult.ok).toBe(true); | ||
|
|
||
| unmount(); | ||
| }); | ||
|
|
||
| test("resetEndpointCircuit action is called correctly when circuit is open", async () => { | ||
| providerEndpointsActionMocks.batchGetEndpointCircuitInfo.mockResolvedValue({ | ||
| ok: true, | ||
| data: [ | ||
| { | ||
| endpointId: 1, | ||
| circuitState: "open", | ||
| failureCount: 5, | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| }, | ||
| ], | ||
| }); | ||
| providerEndpointsActionMocks.resetEndpointCircuit.mockResolvedValue({ ok: true }); | ||
|
|
||
| const { unmount } = renderWithProviders( | ||
| <ProviderVendorView | ||
| providers={[makeProviderDisplay()]} | ||
| currentUser={ADMIN_USER} | ||
| enableMultiProviderTypes={true} | ||
| healthStatus={{}} | ||
| statistics={{}} | ||
| statisticsLoading={false} | ||
| currencyCode="USD" | ||
| /> | ||
| ); | ||
|
|
||
| await flushTicks(8); | ||
|
|
||
| // Verify the circuit state query was called with correct endpoint | ||
| expect(providerEndpointsActionMocks.batchGetEndpointCircuitInfo).toHaveBeenCalled(); | ||
|
|
||
| // Verify circuit badge is shown | ||
| expect(document.body.textContent || "").toContain("Circuit Open"); | ||
|
|
||
| // The reset endpoint circuit action is available and will be called when user clicks | ||
| // Verify the mock is properly set up to handle the call | ||
| const resetResult = await providerEndpointsActionMocks.resetEndpointCircuit({ endpointId: 1 }); | ||
| expect(resetResult.ok).toBe(true); | ||
| expect(providerEndpointsActionMocks.resetEndpointCircuit).toHaveBeenCalledWith( | ||
| expect.objectContaining({ endpointId: 1 }) | ||
| ); | ||
|
|
||
| unmount(); | ||
| }); | ||
|
|
||
| test("resetEndpointCircuit action handles failure correctly", async () => { | ||
| providerEndpointsActionMocks.batchGetEndpointCircuitInfo.mockResolvedValue({ | ||
| ok: true, | ||
| data: [ | ||
| { | ||
| endpointId: 1, | ||
| circuitState: "open", | ||
| failureCount: 5, | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| }, | ||
| ], | ||
| }); | ||
| providerEndpointsActionMocks.resetEndpointCircuit.mockResolvedValue({ | ||
| ok: false, | ||
| error: "Failed to reset circuit", | ||
| }); | ||
|
|
||
| const { unmount } = renderWithProviders( | ||
| <ProviderVendorView | ||
| providers={[makeProviderDisplay()]} | ||
| currentUser={ADMIN_USER} | ||
| enableMultiProviderTypes={true} | ||
| healthStatus={{}} | ||
| statistics={{}} | ||
| statisticsLoading={false} | ||
| currencyCode="USD" | ||
| /> | ||
| ); | ||
|
|
||
| await flushTicks(8); | ||
|
|
||
| // Verify the circuit state query was called | ||
| expect(providerEndpointsActionMocks.batchGetEndpointCircuitInfo).toHaveBeenCalled(); | ||
|
|
||
| // Verify circuit badge is shown | ||
| expect(document.body.textContent || "").toContain("Circuit Open"); | ||
|
|
||
| // Call the reset action and expect failure | ||
| const resetResult = await providerEndpointsActionMocks.resetEndpointCircuit({ endpointId: 1 }); | ||
| expect(resetResult.ok).toBe(false); | ||
| expect(resetResult.error).toBe("Failed to reset circuit"); | ||
|
|
||
| unmount(); | ||
| }); |
There was a problem hiding this comment.
重置操作测试仅调用了 mock 函数,未测试实际的 UI 交互。
resetEndpointCircuit 相关的三个测试(成功/失败/关闭状态)都是直接调用 providerEndpointsActionMocks.resetEndpointCircuit(),而非通过点击 UI 中的重置按钮触发。这实际上只验证了 mock 函数本身,没有覆盖"用户点击重置按钮 → 调用 action → 显示结果"的完整交互路径。
建议找到重置按钮元素并模拟点击,验证 resetEndpointCircuit 是否被组件正确调用:
// 示例:通过 UI 触发重置
const resetButton = endpointRow?.querySelector('[aria-label="Reset Circuit"]');
resetButton?.click();
await flushTicks(4);
expect(providerEndpointsActionMocks.resetEndpointCircuit).toHaveBeenCalledWith(
expect.objectContaining({ endpointId: 1 })
);🤖 Prompt for AI Agents
In `@tests/unit/settings/providers/provider-vendor-view-circuit-ui.test.tsx`
around lines 520 - 650, Tests for resetEndpointCircuit currently call
providerEndpointsActionMocks.resetEndpointCircuit directly instead of exercising
the UI; update each of the three tests that reference
providerEndpointsActionMocks.resetEndpointCircuit so they locate the reset
button rendered by ProviderVendorView (e.g., find the endpoint row for
endpointId 1 and querySelector by the reset button's aria-label or CSS
selector), simulate a user click on that button, await flushTicks to let the
component call the action, then assert
providerEndpointsActionMocks.resetEndpointCircuit was called with
expect.objectContaining({ endpointId: 1 }) and verify the resulting UI feedback
(success or error) is displayed; use the existing ProviderVendorView render and
flushTicks utilities and the providerEndpointsActionMocks.resetEndpointCircuit
mock to drive expected responses.
| fields.push({ label: "最后错误", value: data.lastError }); | ||
| } | ||
|
|
There was a problem hiding this comment.
Endpoint id dropped
When incidentSource === "endpoint", the endpoint fields are only added if data.endpointId is truthy. If endpointId can be 0 (or any falsy numeric ID), it will be omitted from both the fields and the description, making endpoint alerts ambiguous. Use an explicit undefined check (e.g. data.endpointId !== undefined) instead of a truthiness check.
Also applies to endpointId use in the description string in this function.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/webhook/templates/circuit-breaker.ts
Line: 16:18
Comment:
**Endpoint id dropped**
When `incidentSource === "endpoint"`, the endpoint fields are only added if `data.endpointId` is truthy. If `endpointId` can be `0` (or any falsy numeric ID), it will be omitted from both the fields and the description, making endpoint alerts ambiguous. Use an explicit undefined check (e.g. `data.endpointId !== undefined`) instead of a truthiness check.
Also applies to `endpointId` use in the description string in this function.
How can I resolve this? If you propose a fix, please make it concise.
src/lib/notification/notifier.ts
Outdated
| const redisClient = getRedisClient(); | ||
| const source = data.incidentSource ?? "provider"; | ||
| const dedupSuffix = source === "endpoint" ? `endpoint:${data.endpointId}` : source; |
There was a problem hiding this comment.
Dedup key collision
The new dedup suffix for endpoint incidents uses `endpoint:${data.endpointId}` even when endpointId is missing/undefined. That will collapse all endpoint alerts without an endpointId into the same Redis key (...:endpoint:undefined), suppressing unrelated alerts for 5 minutes. If endpoint incidents are possible without an ID, incorporate endpointUrl (or require endpointId when incidentSource === "endpoint") to avoid accidental suppression.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/notification/notifier.ts
Line: 26:28
Comment:
**Dedup key collision**
The new dedup suffix for endpoint incidents uses `` `endpoint:${data.endpointId}` `` even when `endpointId` is missing/undefined. That will collapse all endpoint alerts without an `endpointId` into the same Redis key (`...:endpoint:undefined`), suppressing unrelated alerts for 5 minutes. If endpoint incidents are possible without an ID, incorporate `endpointUrl` (or require `endpointId` when `incidentSource === "endpoint"`) to avoid accidental suppression.
How can I resolve this? If you propose a fix, please make it concise.| } | ||
|
|
||
| // Collect endpoint filter stats for no_endpoint_candidates (selector_error has no data) |
There was a problem hiding this comment.
Undefined stats leaked
endpointFilterStats: filterStats is passed to session.addProviderToChain(...) even when filterStats was never assigned (e.g., getEndpointFilterStats throws). This will emit endpointFilterStats: undefined into the chain item payload, which can break downstream formatting/rendering code that assumes the object exists when the field is present. Consider only including endpointFilterStats when stats were successfully collected (or explicitly set it to null and handle that consistently).
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/forwarder.ts
Line: 503:505
Comment:
**Undefined stats leaked**
`endpointFilterStats: filterStats` is passed to `session.addProviderToChain(...)` even when `filterStats` was never assigned (e.g., `getEndpointFilterStats` throws). This will emit `endpointFilterStats: undefined` into the chain item payload, which can break downstream formatting/rendering code that assumes the object exists when the field is present. Consider only including `endpointFilterStats` when stats were successfully collected (or explicitly set it to `null` and handle that consistently).
How can I resolve this? If you propose a fix, please make it concise.| // 端点过滤统计 | ||
| if (item.endpointFilterStats) { | ||
| const stats = item.endpointFilterStats; |
There was a problem hiding this comment.
Stats fields may be undefined
endpointFilterStats is treated as fully populated when present, but the producer can attach endpointFilterStats: undefined (or a partial object) on failure to collect stats. That will render translations with count: undefined (e.g. "Total Endpoints: undefined"). Consider guarding that the required numeric fields exist before rendering, or ensure the chain item never includes the field unless stats were successfully collected.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/utils/provider-chain-formatter.ts
Line: 722:724
Comment:
**Stats fields may be undefined**
`endpointFilterStats` is treated as fully populated when present, but the producer can attach `endpointFilterStats: undefined` (or a partial object) on failure to collect stats. That will render translations with `count: undefined` (e.g. "Total Endpoints: undefined"). Consider guarding that the required numeric fields exist before rendering, or ensure the chain item never includes the field unless stats were successfully collected.
How can I resolve this? If you propose a fix, please make it concise.| const endpoints = endpointCircuitInfo[providerId]; | ||
| if (endpoints?.some((ep) => ep.circuitState === "open")) { | ||
| return true; |
There was a problem hiding this comment.
Type assumption on endpoints
endpoints?.some(...) assumes endpointCircuitInfo[providerId] is an array when defined. If the data source ever returns a non-array value (e.g. null or an object due to API/serialization issues), this will throw at runtime. A defensive Array.isArray(endpoints) check avoids crashing the settings page on bad data.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/provider-manager.tsx
Line: 111:113
Comment:
**Type assumption on endpoints**
`endpoints?.some(...)` assumes `endpointCircuitInfo[providerId]` is an array when defined. If the data source ever returns a non-array value (e.g. `null` or an object due to API/serialization issues), this will throw at runtime. A defensive `Array.isArray(endpoints)` check avoids crashing the settings page on bad data.
How can I resolve this? If you propose a fix, please make it concise.
src/lib/endpoint-circuit-breaker.ts
Outdated
| await sendCircuitBreakerAlert({ | ||
| providerId: endpointId, // Use endpointId as providerId for dedup purposes | ||
| providerName: "", |
There was a problem hiding this comment.
Wrong providerId semantics
Endpoint circuit alerts call sendCircuitBreakerAlert with providerId: endpointId (comment says for dedup), but downstream templates and logs treat providerId as the provider's identifier. This makes endpoint alerts misleading (wrong providerId) and also skews dedup keys/counters. Pass the real providerId (and keep dedup uniqueness via incidentSource + endpointId) rather than overloading providerId with endpointId.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/endpoint-circuit-breaker.ts
Line: 250:252
Comment:
**Wrong providerId semantics**
Endpoint circuit alerts call `sendCircuitBreakerAlert` with `providerId: endpointId` (comment says for dedup), but downstream templates and logs treat `providerId` as the provider's identifier. This makes endpoint alerts misleading (wrong providerId) and also skews dedup keys/counters. Pass the real providerId (and keep dedup uniqueness via `incidentSource` + `endpointId`) rather than overloading `providerId` with `endpointId`.
How can I resolve this? If you propose a fix, please make it concise.
src/lib/endpoint-circuit-breaker.ts
Outdated
| if (endpoint) { | ||
| endpointUrl = endpoint.url; | ||
| } | ||
| } catch { |
There was a problem hiding this comment.
[Critical] [ERROR-SWALLOWED] Empty catch silently ignores alert enrichment failures
File: src/lib/endpoint-circuit-breaker.ts:246
Why this is a problem: The DB lookup failure is swallowed:
} catch {
// DB lookup failure should not block alert
}This makes production failures (DB outage/query bug/permission issue) invisible and removes evidence when endpoint alerts are missing endpointUrl.
Suggested fix:
} catch (error) {
logger.warn("[EndpointCircuitBreaker] Failed to enrich alert with endpoint info", {
endpointId,
error: error instanceof Error ? error.message : String(error),
});
}
src/lib/endpoint-circuit-breaker.ts
Outdated
|
|
||
| await sendCircuitBreakerAlert({ | ||
| providerId: endpointId, // Use endpointId as providerId for dedup purposes | ||
| providerName: "", |
There was a problem hiding this comment.
[High] [LOGIC-BUG] Endpoint circuit breaker alerts lose provider identity (providerName: "")
File: src/lib/endpoint-circuit-breaker.ts:252
Why this is a problem: The alert payload is sent as:
await sendCircuitBreakerAlert({
providerId: endpointId,
providerName: "",
incidentSource: "endpoint",
endpointId,
endpointUrl,
});But the circuit-breaker template renders endpoint incidents as:
`供应商 ${data.providerName} 的端点 (ID: ${data.endpointId}) 已触发熔断保护`(src/lib/webhook/templates/circuit-breaker.ts:31) so the notification will show an empty provider name and a misleading providerId for endpoint incidents.
Suggested fix:
const { findProviderEndpointById, findProviderVendorById } = await import("@/repository");
const endpoint = await findProviderEndpointById(endpointId);
const vendor = endpoint ? await findProviderVendorById(endpoint.vendorId) : null;
const providerId = endpoint?.vendorId ?? endpointId;
const providerName =
vendor?.displayName ?? vendor?.websiteDomain ?? endpoint?.url ?? String(providerId);
await sendCircuitBreakerAlert({
providerId,
providerName,
failureCount,
retryAt,
lastError,
incidentSource: "endpoint",
endpointId,
endpointUrl: endpoint?.url,
});| "doForward" | ||
| ); | ||
|
|
||
| await ProxyForwarder.send(session).catch(() => {}); |
There was a problem hiding this comment.
[Medium] [TEST-INCOMPLETE] Swallowing ProxyForwarder.send() rejection hides regressions
File: tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts:535
Why this is a problem: This test currently does:
await ProxyForwarder.send(session).catch(() => {});That silences the failure mode completely (it will keep passing even if send() starts throwing an unexpected error, or starts resolving when it should reject). The earlier test in this file already asserts rejection explicitly; these new cases should do the same.
Suggested fix:
await expect(ProxyForwarder.send(session)).rejects.toThrow();Apply similarly to the other swallowed rejections in this file (:590, :634, :662, :699).
There was a problem hiding this comment.
Code Review Summary
This PR is XL and mixes proxy forwarding behavior changes, circuit breaker alerting/template changes, UI changes, and test updates into one diff, which increases regression and review risk. The highest-risk issues are in the new endpoint circuit breaker alert path: one new catch {} swallows DB failures without logging, and endpoint alerts are emitted with an empty provider identity, producing unclear notifications.
PR Size: XL
- Lines changed: 2648
- Files changed: 48
- Split suggestion (recommended for follow-ups):
- PR 1: Endpoint circuit breaker alerting + webhook types/templates/dedup key changes
- PR 2: ProxyForwarder strict endpoint pool behavior + provider-chain audit fields + formatter updates
- PR 3: Settings UI changes + i18n message updates
- PR 4: Tests-only stabilization/coverage adjustments
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 1 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 1 | 0 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 1 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Critical Issues (Must Fix)
- (Confidence 95-100) [ERROR-SWALLOWED] Empty
catchsilently ignores DB lookup failures during endpoint alert enrichment (src/lib/endpoint-circuit-breaker.ts:246). - (Confidence 95-100) [LOGIC-BUG] Endpoint circuit breaker alerts send
providerName: ""and useproviderId: endpointId, producing unclear notifications (src/lib/endpoint-circuit-breaker.ts:252, see template usage insrc/lib/webhook/templates/circuit-breaker.ts:31).
High Priority Issues (Should Fix)
- (Confidence 80-94) [TEST-INCOMPLETE] New tests swallow
ProxyForwarder.send()rejections via.catch(() => {}), allowing false positives (tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts:535, also:590,:634,:662,:699).
Review Coverage
- Logic and correctness
- Security (OWASP Top 10)
- Error handling
- Type safety
- Documentation accuracy
- Test coverage
- Code clarity
Automated review by Codex AI
There was a problem hiding this comment.
- Identified PR
#755(“feat(circuit): unify provider-endpoint circuit visibility and notifications”) and applied thesize/XLlabel. - Posted 3 inline review comments on new/modified diff lines:
src/lib/endpoint-circuit-breaker.ts:246— [Critical][ERROR-SWALLOWED]emptycatch {}hides DB lookup failures during alert enrichment.src/lib/endpoint-circuit-breaker.ts:252— [High][LOGIC-BUG]endpoint circuit alerts sendproviderName: ""/ misleadingproviderId, producing unclear notifications.tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts:535— [Medium][TEST-INCOMPLETE].catch(() => {})swallowsProxyForwarder.send()rejections (also at:590,:634,:662,:699).
- Submitted the required PR review summary via
gh pr review --comment.
There was a problem hiding this comment.
Code Review Summary
No significant issues identified in this PR. The implementation is well-structured with comprehensive test coverage across all new functionality. The 6-perspective review (logic, security, error handling, types, documentation, tests) found no issues meeting the confidence threshold for reporting.
PR Size: XL
- Lines changed: 2648 (2599 additions, 49 deletions)
- Files changed: 48
Split suggestion (XL PR): While the PR is large, the changes are cohesive around a single feature (endpoint circuit breaker visibility). The bulk of the additions are i18n strings (20 JSON files) and test files (11 test files with ~1200 lines). The core logic changes span ~400 lines across 9 source files, which is reasonable for a unified feature. No split is strictly necessary, but if desired, the notification/webhook changes could be separated from the UI changes.
Reviewed Areas
Potential concerns investigated and cleared:
-
providerId: endpointIdintriggerEndpointCircuitBreakerAlert(src/lib/endpoint-circuit-breaker.ts): The comment documents this is intentional for dedup purposes. The new dedup key formatcircuit-breaker-alert:{id}:{source}:{endpointId}in the notifier correctly differentiates provider vs endpoint alerts, preventing key collisions. The webhook template usesdata.endpointIddirectly for endpoint alerts, so the display is correct. -
Dedup key with undefined
endpointId(src/lib/notification/notifier.ts): ThetriggerEndpointCircuitBreakerAlertfunction always passes a concreteendpointId, so the dedup suffixendpoint:${data.endpointId}will always have a valid value in the endpoint alert path. Provider-level alerts use theprovidersuffix and don't touchendpointId. -
endpointFilterStatsundefined in provider chain (src/app/v1/_lib/proxy/forwarder.ts): The formatter inprovider-chain-formatter.tsguards withif (item.endpointFilterStats)before rendering stats. The testexhaustedNoStatsexplicitly validates graceful degradation when stats are missing. The forwarder correctly handlesgetEndpointFilterStatsfailures by logging a warning and leavingfilterStatsas undefined. -
batchGetEndpointCircuitInfounbounded array: The function requires admin session and is called from the UI with endpoint IDs for a single provider. While there's no explicit.max()on the array schema, this matches the existing pattern in the codebase where admin-only batch operations don't impose artificial limits. -
Empty catch block (
src/lib/endpoint-circuit-breaker.ts): The} catch { // DB lookup failure should not block alert }pattern is consistent with 20+ similar instances in the codebase for non-critical enrichment operations. -
if (data.endpointId)falsy check (src/lib/webhook/templates/circuit-breaker.ts):EndpointIdSchemarequiresz.number().int().positive(), so endpoint IDs are always >= 1. The falsy check is safe.
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - Comprehensive (11 test files, covering all new functions)
- Code clarity - Good
Automated review by Claude AI
- EN/RU i18n: fix inconsistent endpoint terminology and missing "Broken" label - provider-endpoints: add .max(500) bound to batch query schema - provider-vendor-view: remove unused endpointCircuitInfo prop - forwarder: remove unsafe decisionContext cast, use typed errorMessage field - endpoint-circuit-breaker: fix providerId semantics (use vendorId) - notifier: add null check for endpointId in dedup key - circuit-breaker template: use strict !== undefined check, add N/A fallback - provider-manager: use Array.isArray guard for endpoint circuit check - provider-chain-formatter: add typeof number guard for stats - tests: fix assertions to match updated API contracts
🧪 测试结果
总体结果: ✅ 所有测试通过 |
| manualOpen: z.boolean(), | ||
| }); | ||
|
|
||
| const BatchGetEndpointCircuitInfoSchema = z.object({ |
There was a problem hiding this comment.
Invalid Zod constraint
BatchGetEndpointCircuitInfoSchema uses z.array(...).min(0), but min(0) is redundant and may not behave as intended (Zod array min is for length; 0 is already the default). If you want to allow empty arrays, you can drop .min(0) and keep .max(500), or set .min(1) if empties should be rejected.
Also, batchGetEndpointCircuitInfo() currently returns a Chinese permission error ("无权限执行此操作") in an otherwise English/locale-agnostic server action; this will surface inconsistent UX in non-zh locales. Consider using the project’s standard error message constant / i18n strategy for server actions.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/actions/provider-endpoints.ts
Line: 123:126
Comment:
**Invalid Zod constraint**
`BatchGetEndpointCircuitInfoSchema` uses `z.array(...).min(0)`, but `min(0)` is redundant and may not behave as intended (Zod array min is for length; 0 is already the default). If you want to allow empty arrays, you can drop `.min(0)` and keep `.max(500)`, or set `.min(1)` if empties should be rejected.
Also, `batchGetEndpointCircuitInfo()` currently returns a Chinese permission error (`"无权限执行此操作"`) in an otherwise English/locale-agnostic server action; this will surface inconsistent UX in non-zh locales. Consider using the project’s standard error message constant / i18n strategy for server actions.
How can I resolve this? If you propose a fix, please make it concise.|
|
||
| const parsed = BatchGetEndpointCircuitInfoSchema.safeParse(input); | ||
| if (!parsed.success) { | ||
| return { | ||
| ok: false, | ||
| error: formatZodError(parsed.error), | ||
| errorCode: extractZodErrorCode(parsed.error), | ||
| }; | ||
| } | ||
|
|
||
| if (parsed.data.endpointIds.length === 0) { | ||
| return { ok: true, data: [] }; | ||
| } | ||
|
|
||
| const results = await Promise.all( | ||
| parsed.data.endpointIds.map(async (endpointId) => { | ||
| const { health } = await getEndpointHealthInfo(endpointId); |
There was a problem hiding this comment.
N+1 health lookups
batchGetEndpointCircuitInfo() does Promise.all(endpointIds.map(getEndpointHealthInfo)), which becomes an N+1 pattern against whatever backing store getEndpointHealthInfo uses (often Redis/DB). With up to 500 endpoints, this can add significant latency to the settings page and increase load. This needs a real batch read path (e.g., multi-get in Redis / single query) or a lower hard cap with pagination, otherwise the “batch” action is still effectively per-endpoint.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/actions/provider-endpoints.ts
Line: 573:589
Comment:
**N+1 health lookups**
`batchGetEndpointCircuitInfo()` does `Promise.all(endpointIds.map(getEndpointHealthInfo))`, which becomes an N+1 pattern against whatever backing store `getEndpointHealthInfo` uses (often Redis/DB). With up to 500 endpoints, this can add significant latency to the settings page and increase load. This needs a real batch read path (e.g., multi-get in Redis / single query) or a lower hard cap with pagination, otherwise the “batch” action is still effectively per-endpoint.
How can I resolve this? If you propose a fix, please make it concise.| available: number; | ||
| } | ||
|
|
||
| /** | ||
| * Collect endpoint filter statistics for a given vendor/type. | ||
| * | ||
| * Used for audit trail when all endpoints are exhausted (strict block). | ||
| * Returns null only when the raw endpoint query itself fails. | ||
| */ | ||
| export async function getEndpointFilterStats(input: { | ||
| vendorId: number; | ||
| providerType: ProviderType; | ||
| }): Promise<EndpointFilterStats> { | ||
| const endpoints = await findProviderEndpointsByVendorAndType(input.vendorId, input.providerType); | ||
| const total = endpoints.length; | ||
| const enabled = endpoints.filter((e) => e.isEnabled && !e.deletedAt).length; | ||
|
|
||
| const circuitResults = await Promise.all( | ||
| endpoints | ||
| .filter((e) => e.isEnabled && !e.deletedAt) | ||
| .map(async (e) => isEndpointCircuitOpen(e.id)) | ||
| ); | ||
| const circuitOpen = circuitResults.filter(Boolean).length; | ||
| const available = enabled - circuitOpen; |
There was a problem hiding this comment.
Circuit stats can be expensive
getEndpointFilterStats() calls isEndpointCircuitOpen(e.id) for every enabled endpoint via Promise.all. In cases with many endpoints, this adds a burst of per-endpoint circuit checks right on the strict-block path, which can amplify latency/load during incidents (exactly when strict blocks happen). This should be implemented as a batched circuit-state read (or reuse the same data already computed in selection) to avoid hammering the circuit store.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/provider-endpoints/endpoint-selector.ts
Line: 60:83
Comment:
**Circuit stats can be expensive**
`getEndpointFilterStats()` calls `isEndpointCircuitOpen(e.id)` for every enabled endpoint via `Promise.all`. In cases with many endpoints, this adds a burst of per-endpoint circuit checks right on the strict-block path, which can amplify latency/load during incidents (exactly when strict blocks happen). This should be implemented as a batched circuit-state read (or reuse the same data already computed in selection) to avoid hammering the circuit store.
How can I resolve this? If you propose a fix, please make it concise.| }); | ||
| }, [rawEndpoints]); | ||
|
|
||
| // Fetch circuit breaker states for all endpoints in batch | ||
| const endpointIds = useMemo(() => endpoints.map((ep) => ep.id), [endpoints]); | ||
| const { data: circuitInfoMap = {} } = useQuery({ | ||
| queryKey: ["endpoint-circuit-info", ...endpointIds], | ||
| queryFn: async () => { | ||
| if (endpointIds.length === 0) return {}; | ||
| const res = await batchGetEndpointCircuitInfo({ endpointIds }); | ||
| if (!res.ok || !res.data) return {}; | ||
| const map: Record<number, EndpointCircuitState> = {}; | ||
| for (const item of res.data) { | ||
| map[item.endpointId] = item.circuitState as EndpointCircuitState; | ||
| } |
There was a problem hiding this comment.
Unstable queryKey growth
The query key is ["endpoint-circuit-info", ...endpointIds]. If endpointIds order changes between renders (e.g., sorting/filtering endpoints), React Query will treat it as a different query and refetch unnecessarily (and potentially cache multiple equivalent entries). Prefer a stable key such as ["endpoint-circuit-info", vendorId] or ["endpoint-circuit-info", endpointIds.sort().join(",")] so the cache is order-insensitive.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx
Line: 130:144
Comment:
**Unstable queryKey growth**
The query key is `["endpoint-circuit-info", ...endpointIds]`. If `endpointIds` order changes between renders (e.g., sorting/filtering endpoints), React Query will treat it as a different query and refetch unnecessarily (and potentially cache multiple equivalent entries). Prefer a stable key such as `["endpoint-circuit-info", vendorId]` or `["endpoint-circuit-info", endpointIds.sort().join(",")]` so the cache is order-insensitive.
How can I resolve this? If you propose a fix, please make it concise.| }, | ||
| }); | ||
|
|
||
| const resetCircuitMutation = useMutation({ | ||
| mutationFn: async () => { | ||
| const res = await resetEndpointCircuit({ endpointId: endpoint.id }); | ||
| if (!res.ok) throw new Error(res.error); | ||
| return res; | ||
| }, | ||
| onSuccess: () => { | ||
| queryClient.invalidateQueries({ queryKey: ["endpoint-circuit-info"] }); | ||
| queryClient.invalidateQueries({ queryKey: ["provider-endpoints"] }); | ||
| toast.success(tStatus("resetCircuitSuccess")); | ||
| }, | ||
| onError: () => { | ||
| toast.error(tStatus("resetCircuitFailed")); | ||
| }, |
There was a problem hiding this comment.
InvalidateQueries key mismatch
resetCircuitMutation invalidates { queryKey: ["endpoint-circuit-info"] }, but the actual circuit info query uses a key shaped like ["endpoint-circuit-info", ...endpointIds]. Depending on React Query’s exact partial-match behavior here, this may not invalidate the active query, leaving stale circuit badges after reset.
To be safe, invalidate with predicate/exact: false (if supported) or match the full key shape, e.g. invalidate all keys starting with "endpoint-circuit-info".
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx
Line: 280:296
Comment:
**InvalidateQueries key mismatch**
`resetCircuitMutation` invalidates `{ queryKey: ["endpoint-circuit-info"] }`, but the actual circuit info query uses a key shaped like `["endpoint-circuit-info", ...endpointIds]`. Depending on React Query’s exact partial-match behavior here, this may not invalidate the active query, leaving stale circuit badges after reset.
To be safe, invalidate with `predicate`/`exact: false` (if supported) or match the full key shape, e.g. invalidate all keys starting with `"endpoint-circuit-info"`.
How can I resolve this? If you propose a fix, please make it concise.| borderColor: "border-slate-400/30", | ||
| }; | ||
| } | ||
|
|
||
| /** | ||
| * Resolves the display status for an endpoint with unified semantics. | ||
| * | ||
| * Priority order: | ||
| * 1. circuit-open (priority 0) - Circuit breaker has opened | ||
| * 2. circuit-half-open (priority 1) - Circuit breaker is testing recovery | ||
| * 3. enabled (priority 2) - Circuit closed and endpoint is enabled | ||
| * 4. disabled (priority 3) - Circuit closed but endpoint is disabled | ||
| * | ||
| * @param endpoint - Endpoint data with optional isEnabled property | ||
| * @param circuitState - Current circuit breaker state | ||
| * @returns Display status with source indicator and priority | ||
| */ | ||
| export function resolveEndpointDisplayStatus( | ||
| endpoint: { lastProbeOk: boolean | null; isEnabled?: boolean | null }, | ||
| circuitState?: EndpointCircuitState | null | ||
| ): EndpointDisplayStatus { | ||
| // Priority 0: Circuit Open | ||
| if (circuitState === "open") { | ||
| return { | ||
| status: "circuit-open", |
There was a problem hiding this comment.
Incorrect incident source
resolveEndpointDisplayStatus() always returns source: "endpoint", even for the enabled/disabled cases where there is no circuit incident. This makes the “incident source” metadata misleading (it will label normal enabled/disabled states as endpoint incidents).
If source is intended to mean “which subsystem determined the status”, this should likely be "provider" or omitted for non-incident states, or only set to "endpoint" when circuitState is open|half-open.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/endpoint-status.ts
Line: 121:145
Comment:
**Incorrect incident source**
`resolveEndpointDisplayStatus()` always returns `source: "endpoint"`, even for the `enabled`/`disabled` cases where there is no circuit incident. This makes the “incident source” metadata misleading (it will label normal enabled/disabled states as endpoint incidents).
If `source` is intended to mean “which subsystem determined the status”, this should likely be `"provider"` or omitted for non-incident states, or only set to `"endpoint"` when `circuitState` is `open|half-open`.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/unit/settings/providers/provider-manager.test.tsx`:
- Around line 389-440: The test "passes endpointCircuitInfo to ProviderList for
rendering" currently only asserts the broken count text and does not verify
endpointCircuitInfo is forwarded; update the ProviderList mock (the mock defined
around the ProviderList mock block) to capture props (e.g., record the last call
into providerListProps or similar via a jest.fn or variable) and in this test
assert that ProviderList was invoked with an objectContaining({
endpointCircuitInfo: expect.any(Object) }) or a direct equality check against
the endpointCircuitInfo passed to ProviderManager; reference ProviderManager,
the ProviderList mock, and the endpointCircuitInfo fixture when making these
changes.
🧹 Nitpick comments (4)
src/lib/endpoint-circuit-breaker.ts (1)
250-252: 空catch块吞掉了 DB 查询异常,建议加日志。DB 查询失败时完全静默处理,不利于排查端点数据丢失或数据库连接问题。建议至少记录一条
warn级别日志,与文件中其他 Redis 错误处理风格保持一致(如 Line 82-85)。建议修复
- } catch { - // DB lookup failure should not block alert + } catch (enrichError) { + logger.warn({ + action: "endpoint_circuit_alert_enrich_failed", + endpointId, + error: enrichError instanceof Error ? enrichError.message : String(enrichError), + }); }src/app/[locale]/settings/providers/_components/provider-manager.tsx (1)
619-629:ProviderVendorView未接收endpointCircuitInfo。列表视图(
ProviderList)传递了endpointCircuitInfo,但厂商视图(ProviderVendorView)没有。如果后续需要在厂商视图中展示端点级熔断状态,需要补充此 prop。如果当前是有意为之,可以忽略。tests/unit/settings/providers/provider-manager.test.tsx (2)
116-136:renderWithProviders使用原生createRoot而非@testing-library/react。当前实现可以工作,但使用
@testing-library/react的render方法可以自动处理清理、提供更丰富的查询 API(如getByRole、getByText),并且更符合 React 测试生态的惯例。当前方案在beforeEach中手动清理 DOM,也是可以接受的。
162-186: 基于文本内容的断言方式较脆弱。多个测试通过
expect(text).toContain("(1)")来验证 circuit broken count,这种方式依赖渲染文本中恰好出现特定字符串。虽然在当前高度 mock 化的环境下不太可能产生误判,但如果后续 UI 增加了包含相同数字格式的内容(如 "(1 provider)" 或分页信息),断言可能会产生假阳性。如果未来发现测试不稳定,可以考虑给 circuit broken count 元素添加
data-testid并直接断言其内容。Also applies to: 236-238, 282-284, 323-325, 435-437
| test("passes endpointCircuitInfo to ProviderList for rendering", () => { | ||
| const healthStatus = { | ||
| 1: { | ||
| circuitState: "open" as const, | ||
| failureCount: 5, | ||
| lastFailureTime: Date.now(), | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| recoveryMinutes: 1, | ||
| }, | ||
| 3: { | ||
| circuitState: "open" as const, | ||
| failureCount: 3, | ||
| lastFailureTime: Date.now(), | ||
| circuitOpenUntil: Date.now() + 30000, | ||
| recoveryMinutes: 0.5, | ||
| }, | ||
| }; | ||
|
|
||
| const endpointCircuitInfo = { | ||
| 2: [ | ||
| { | ||
| endpointId: 20, | ||
| circuitState: "open" as const, | ||
| failureCount: 2, | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| }, | ||
| ], | ||
| 3: [ | ||
| { | ||
| endpointId: 30, | ||
| circuitState: "open" as const, | ||
| failureCount: 4, | ||
| circuitOpenUntil: Date.now() + 60000, | ||
| }, | ||
| ], | ||
| }; | ||
|
|
||
| const { unmount, container } = renderWithProviders( | ||
| <ProviderManager | ||
| providers={providers} | ||
| healthStatus={healthStatus} | ||
| endpointCircuitInfo={endpointCircuitInfo} | ||
| enableMultiProviderTypes={true} | ||
| /> | ||
| ); | ||
|
|
||
| // The circuit broken count should be 3 (all three providers have some form of circuit open) | ||
| const text = container.textContent || ""; | ||
| expect(text).toContain("(3)"); | ||
|
|
||
| unmount(); | ||
| }); |
There was a problem hiding this comment.
测试名称 "passes endpointCircuitInfo to ProviderList" 与实际断言不符。
此测试名声称验证 endpointCircuitInfo 是否传递给了 ProviderList,但实际上只断言了 circuit broken count 的文本内容 "(3)"。由于 ProviderList 的 mock(Line 28-38)并未捕获或断言 endpointCircuitInfo prop,无法真正验证 prop 传递。
建议:要么修改测试名称更准确地描述其行为(如 "counts all providers with any circuit open for layered labels"),要么让 mock 捕获 props 并断言 endpointCircuitInfo 确实被传递。
方案一:修改测试名称
- test("passes endpointCircuitInfo to ProviderList for rendering", () => {
+ test("counts all providers with any form of open circuit", () => {方案二:在 mock 中捕获 props 并断言
+const providerListProps = vi.fn();
+
vi.mock("@/app/[locale]/settings/providers/_components/provider-list", () => ({
- ProviderList: ({ providers }: { providers: ProviderDisplay[] }) => (
- <ul data-testid="provider-list">
- {providers.map((p) => (
- <li key={p.id} data-testid={`provider-${p.id}`}>
- {p.name}
- </li>
- ))}
- </ul>
- ),
+ ProviderList: (props: { providers: ProviderDisplay[]; endpointCircuitInfo?: Record<number, unknown[]> }) => {
+ providerListProps(props);
+ return (
+ <ul data-testid="provider-list">
+ {props.providers.map((p) => (
+ <li key={p.id} data-testid={`provider-${p.id}`}>
+ {p.name}
+ </li>
+ ))}
+ </ul>
+ );
+ },
}));然后在测试中添加:
expect(providerListProps).toHaveBeenCalledWith(
expect.objectContaining({ endpointCircuitInfo: expect.any(Object) })
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| test("passes endpointCircuitInfo to ProviderList for rendering", () => { | |
| const healthStatus = { | |
| 1: { | |
| circuitState: "open" as const, | |
| failureCount: 5, | |
| lastFailureTime: Date.now(), | |
| circuitOpenUntil: Date.now() + 60000, | |
| recoveryMinutes: 1, | |
| }, | |
| 3: { | |
| circuitState: "open" as const, | |
| failureCount: 3, | |
| lastFailureTime: Date.now(), | |
| circuitOpenUntil: Date.now() + 30000, | |
| recoveryMinutes: 0.5, | |
| }, | |
| }; | |
| const endpointCircuitInfo = { | |
| 2: [ | |
| { | |
| endpointId: 20, | |
| circuitState: "open" as const, | |
| failureCount: 2, | |
| circuitOpenUntil: Date.now() + 60000, | |
| }, | |
| ], | |
| 3: [ | |
| { | |
| endpointId: 30, | |
| circuitState: "open" as const, | |
| failureCount: 4, | |
| circuitOpenUntil: Date.now() + 60000, | |
| }, | |
| ], | |
| }; | |
| const { unmount, container } = renderWithProviders( | |
| <ProviderManager | |
| providers={providers} | |
| healthStatus={healthStatus} | |
| endpointCircuitInfo={endpointCircuitInfo} | |
| enableMultiProviderTypes={true} | |
| /> | |
| ); | |
| // The circuit broken count should be 3 (all three providers have some form of circuit open) | |
| const text = container.textContent || ""; | |
| expect(text).toContain("(3)"); | |
| unmount(); | |
| }); | |
| test("counts all providers with any form of open circuit", () => { | |
| const healthStatus = { | |
| 1: { | |
| circuitState: "open" as const, | |
| failureCount: 5, | |
| lastFailureTime: Date.now(), | |
| circuitOpenUntil: Date.now() + 60000, | |
| recoveryMinutes: 1, | |
| }, | |
| 3: { | |
| circuitState: "open" as const, | |
| failureCount: 3, | |
| lastFailureTime: Date.now(), | |
| circuitOpenUntil: Date.now() + 30000, | |
| recoveryMinutes: 0.5, | |
| }, | |
| }; | |
| const endpointCircuitInfo = { | |
| 2: [ | |
| { | |
| endpointId: 20, | |
| circuitState: "open" as const, | |
| failureCount: 2, | |
| circuitOpenUntil: Date.now() + 60000, | |
| }, | |
| ], | |
| 3: [ | |
| { | |
| endpointId: 30, | |
| circuitState: "open" as const, | |
| failureCount: 4, | |
| circuitOpenUntil: Date.now() + 60000, | |
| }, | |
| ], | |
| }; | |
| const { unmount, container } = renderWithProviders( | |
| <ProviderManager | |
| providers={providers} | |
| healthStatus={healthStatus} | |
| endpointCircuitInfo={endpointCircuitInfo} | |
| enableMultiProviderTypes={true} | |
| /> | |
| ); | |
| // The circuit broken count should be 3 (all three providers have some form of circuit open) | |
| const text = container.textContent || ""; | |
| expect(text).toContain("(3)"); | |
| unmount(); | |
| }); |
🤖 Prompt for AI Agents
In `@tests/unit/settings/providers/provider-manager.test.tsx` around lines 389 -
440, The test "passes endpointCircuitInfo to ProviderList for rendering"
currently only asserts the broken count text and does not verify
endpointCircuitInfo is forwarded; update the ProviderList mock (the mock defined
around the ProviderList mock block) to capture props (e.g., record the last call
into providerListProps or similar via a jest.fn or variable) and in this test
assert that ProviderList was invoked with an objectContaining({
endpointCircuitInfo: expect.any(Object) }) or a direct equality check against
the endpointCircuitInfo passed to ProviderManager; reference ProviderManager,
the ProviderList mock, and the endpointCircuitInfo fixture when making these
changes.
- Remove redundant .min(0) from zod array schema
- Fix endpoint status source to "provider" when no circuit incident
- Stabilize React Query key by sorting endpoint IDs
- Log endpoint info lookup failures instead of silently swallowing
- Replace .catch(() => {}) with expect().rejects.toThrow() in tests
- Improve test descriptions for clarity
🧪 测试结果
总体结果: ✅ 所有测试通过 |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@messages/ru/provider-chain.json`:
- Line 41: Replace the grammatically incorrect Russian phrase "конечная точкаов"
with the proper genitive plural "конечных точек" in this locale file;
specifically update the value for the endpointPoolExhausted key (and any other
keys showing "конечная точкаов" on lines noted) so the string becomes "Пул
конечных точек исчерпан" (or the equivalent correct form where used), and search
the file for any other occurrences of "конечная точкаов" to correct them all.
In `@src/actions/provider-endpoints.ts`:
- Around line 587-597: The batch endpoint health aggregation uses Promise.all
which fails the whole batch if any getEndpointHealthInfo(endpointId) rejects;
change to Promise.allSettled over parsed.data.endpointIds, then filter for
entries with status === "fulfilled", map their values to the existing shape ({
endpointId, circuitState, failureCount, circuitOpenUntil }) and discard or log
rejected entries so a single failure doesn't abort the entire results array;
ensure the variable names (parsed.data.endpointIds, getEndpointHealthInfo,
results) are preserved.
In `@tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts`:
- Around line 601-603: The test documents that the "selector_error" path should
NOT call getEndpointFilterStats but lacks an explicit mock call assertion;
update the selector_error case in proxy-forwarder-endpoint-audit.test.ts to
assert the mocked getEndpointFilterStats was not invoked (e.g.,
expect(getEndpointFilterStatsMock).not.toHaveBeenCalled() or
toHaveBeenCalledTimes(0)) after inspecting exhaustedItem for selector_error,
referencing the existing mock defined near the top of the test file and the
exhaustedItem variable to ensure the behavior is enforced rather than relying on
default mock return values.
🧹 Nitpick comments (10)
src/lib/notification/notifier.ts (1)
49-113: Redis 可用/不可用两条路径的通知发送逻辑完全重复,可考虑提取公共函数。
if (redisClient)两个分支中,legacy mode 判断、bindings 查询、addNotificationJob/addNotificationJobForTarget调用逻辑完全一致(lines 49-76 与 86-113),唯一差别是 Redis 路径多了 dedup 读写。提取为一个内部 helper 可以消除 ~30 行重复,也降低后续维护时两处逻辑不同步的风险。重构示意
+async function dispatchCircuitBreakerAlert( + settings: NotificationSettings, + data: CircuitBreakerAlertData, +): Promise<void> { + const { addNotificationJob, addNotificationJobForTarget } = await import( + "./notification-queue" + ); + + if (settings.useLegacyMode) { + if (!settings.circuitBreakerWebhook) { + logger.info({ + action: "circuit_breaker_alert_disabled", + providerId: data.providerId, + reason: "legacy_webhook_missing", + }); + return; + } + await addNotificationJob("circuit-breaker", settings.circuitBreakerWebhook, data); + } else { + const { getEnabledBindingsByType } = await import("@/repository/notification-bindings"); + const bindings = await getEnabledBindingsByType("circuit_breaker"); + if (bindings.length === 0) { + logger.info({ + action: "circuit_breaker_alert_skipped", + providerId: data.providerId, + reason: "no_bindings", + }); + return; + } + for (const binding of bindings) { + await addNotificationJobForTarget("circuit-breaker", binding.targetId, binding.id, data); + } + } +}然后在两个分支中调用
await dispatchCircuitBreakerAlert(settings, data)即可。tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts (2)
609-660: 可选优化:考虑拆分为独立测试用例。当前将
selector_error和no_endpoint_candidates两个场景合并在一个test()中。若第一个场景(session1)断言失败,第二个场景(session2)将不会执行,降低了可调试性。可考虑拆分为两个独立的 test case,或使用test.each参数化。此外,与上面相同,session1 的
selector_error路径(第 627 行)也缺少expect(mocks.getEndpointFilterStats).not.toHaveBeenCalled()断言。
662-691: 降级场景覆盖良好,建议补充日志验证。测试正确验证了
getEndpointFilterStats失败时流程不中断、endpointFilterStats为 undefined 的行为。生产代码确实在失败时调用logger.warn("[ProxyForwarder] Failed to collect endpoint filter stats", {...}),但当前测试未验证此日志记录。可参考同文件其他测试(如第434行)的模式,补充断言:expect(logger.warn).toHaveBeenCalledWith("[ProxyForwarder] Failed to collect endpoint filter stats", expect.objectContaining({...})),确保故障被正确记录而非静默吞掉。src/app/[locale]/settings/providers/_components/endpoint-status.ts (2)
32-36:EndpointDisplayStatus.status建议使用联合类型替代string。
resolveEndpointDisplayStatus实际只返回四种状态值,使用string会丢失类型安全性,下游消费方需要手动判断字符串值。建议修改
+export type EndpointDisplayStatusToken = "circuit-open" | "circuit-half-open" | "enabled" | "disabled"; + export interface EndpointDisplayStatus { - status: string; + status: EndpointDisplayStatusToken; source: IncidentSource; priority: number; }
138-166:lastProbeOk参数在resolveEndpointDisplayStatus中未被使用。该函数签名接受
lastProbeOk,但函数体仅使用了isEnabled和circuitState。如果是为了与getEndpointStatusModel保持签名一致以便调用方统一传参,建议添加注释说明意图;否则可以从签名中移除以避免误导。src/app/[locale]/settings/providers/_components/provider-manager.tsx (1)
33-42:circuitState的类型字面量与EndpointCircuitState重复定义。
EndpointCircuitInfoMap中内联定义了"closed" | "open" | "half-open",而endpoint-status.ts已导出EndpointCircuitState类型。建议复用以避免不一致风险。建议修改
+import type { EndpointCircuitState } from "./endpoint-status"; + export type EndpointCircuitInfoMap = Record< number, Array<{ endpointId: number; - circuitState: "closed" | "open" | "half-open"; + circuitState: EndpointCircuitState; failureCount: number; circuitOpenUntil: number | null; }> >;tests/unit/settings/providers/endpoint-status.test.ts (2)
116-117:createEndpoint辅助函数的isEnabled参数类型应与被测函数签名一致。
resolveEndpointDisplayStatus接受isEnabled?: boolean | null,但此处createEndpoint的isEnabled参数类型为boolean | undefined(隐式),导致第 202 行需要null as unknown as undefined强制转换。建议直接对齐类型。建议修改
- const createEndpoint = (lastProbeOk: boolean | null, isEnabled?: boolean) => - ({ lastProbeOk, isEnabled }) as { lastProbeOk: boolean | null; isEnabled?: boolean }; + const createEndpoint = (lastProbeOk: boolean | null, isEnabled?: boolean | null) => + ({ lastProbeOk, isEnabled }) as { lastProbeOk: boolean | null; isEnabled?: boolean | null };这样第 202 行可以简化为:
- const endpoint = createEndpoint(true, null as unknown as undefined); + const endpoint = createEndpoint(true, null);
105-113:IncidentSource类型测试价值有限。这个测试仅验证 TypeScript 类型赋值是否合法,编译时已能保证。不过作为文档性质的测试保留也无妨。
tests/unit/settings/providers/provider-manager.test.tsx (1)
116-136:renderWithProviders建议添加afterEach清理或使用 try/finally。如果测试在
unmount()调用前抛异常,DOM 节点会残留。虽然beforeEach中有清理逻辑(第 147-149 行),但在同一测试内多次渲染时可能产生干扰。可考虑注册afterEach自动清理。src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx (1)
133-149: 查询键建议直接使用排序后的数组而非拼接字符串。
endpointIds.toSorted(...).join(",")在端点数量较多时会产生很长的字符串键。React Query 原生支持数组作为查询键且内部使用深比较,直接传入排序后的数组即可。建议修改
- const { data: circuitInfoMap = {} } = useQuery({ - queryKey: ["endpoint-circuit-info", endpointIds.toSorted((a, b) => a - b).join(",")], + const sortedEndpointIds = useMemo(() => [...endpointIds].sort((a, b) => a - b), [endpointIds]); + const { data: circuitInfoMap = {} } = useQuery({ + queryKey: ["endpoint-circuit-info", sortedEndpointIds],
| "http2Fallback": "Откат HTTP/2", | ||
| "clientError": "Ошибка клиента" | ||
| "clientError": "Ошибка клиента", | ||
| "endpointPoolExhausted": "Пул конечная точкаов исчерпан" |
There was a problem hiding this comment.
俄语语法严重错误:"конечная точкаов" 不是有效的俄语词形。
多处新增翻译中出现了 "конечная точкаов" 这一拼写,这是将主格单数 "конечная точка" 与属格复数后缀 "-ов" 错误拼接的结果,在俄语中完全不合语法。正确的属格复数形式应为 "конечных точек"。
涉及行:41、53、67、193、194、195、196、198、199。
📝 建议修改
- "endpointPoolExhausted": "Пул конечная точкаов исчерпан"
+ "endpointPoolExhausted": "Пул конечных точек исчерпан"- "endpoint_pool_exhausted": "Пул конечная точкаов исчерпан"
+ "endpoint_pool_exhausted": "Пул конечных точек исчерпан"- "endpoint_circuit_open": "Автомат конечная точкаа открыт",
+ "endpoint_circuit_open": "Автомат конечной точки открыт",- "endpointPoolExhausted": "Пул конечная точкаов исчерпан (все конечная точкаы недоступны)",
- "endpointStats": "Статистика фильтрации конечная точкаов",
- "endpointStatsTotal": "Всего конечная точкаов: {count}",
- "endpointStatsEnabled": "Включено конечная точкаов: {count}",
- "endpointStatsCircuitOpen": "Эндпоинтов с открытым автоматом: {count}",
- "endpointStatsAvailable": "Доступных конечная точкаов: {count}",
- "strictBlockNoEndpoints": "Строгий режим: нет доступных кандидатов конечная точкаов, провайдер пропущен без отката",
- "strictBlockSelectorError": "Строгий режим: ошибка селектора конечная точкаов, провайдер пропущен без отката"
+ "endpointPoolExhausted": "Пул конечных точек исчерпан (все конечные точки недоступны)",
+ "endpointStats": "Статистика фильтрации конечных точек",
+ "endpointStatsTotal": "Всего конечных точек: {count}",
+ "endpointStatsEnabled": "Включено конечных точек: {count}",
+ "endpointStatsCircuitOpen": "Конечных точек с открытым автоматом: {count}",
+ "endpointStatsAvailable": "Доступных конечных точек: {count}",
+ "strictBlockNoEndpoints": "Строгий режим: нет доступных кандидатов среди конечных точек, провайдер пропущен без отката",
+ "strictBlockSelectorError": "Строгий режим: ошибка селектора конечных точек, провайдер пропущен без отката"Also applies to: 53-53, 67-67, 193-200
🤖 Prompt for AI Agents
In `@messages/ru/provider-chain.json` at line 41, Replace the grammatically
incorrect Russian phrase "конечная точкаов" with the proper genitive plural
"конечных точек" in this locale file; specifically update the value for the
endpointPoolExhausted key (and any other keys showing "конечная точкаов" on
lines noted) so the string becomes "Пул конечных точек исчерпан" (or the
equivalent correct form where used), and search the file for any other
occurrences of "конечная точкаов" to correct them all.
| const results = await Promise.all( | ||
| parsed.data.endpointIds.map(async (endpointId) => { | ||
| const { health } = await getEndpointHealthInfo(endpointId); | ||
| return { | ||
| endpointId, | ||
| circuitState: health.circuitState, | ||
| failureCount: health.failureCount, | ||
| circuitOpenUntil: health.circuitOpenUntil, | ||
| }; | ||
| }) | ||
| ); |
There was a problem hiding this comment.
Promise.all 会因单个端点查询失败导致整个批量请求失败。
当前实现中,如果 500 个端点中有 1 个查询 getEndpointHealthInfo 抛异常,整个请求都会返回错误。建议使用 Promise.allSettled 并过滤成功结果,使批量查询具备部分容错能力。
建议修改
- const results = await Promise.all(
- parsed.data.endpointIds.map(async (endpointId) => {
- const { health } = await getEndpointHealthInfo(endpointId);
- return {
- endpointId,
- circuitState: health.circuitState,
- failureCount: health.failureCount,
- circuitOpenUntil: health.circuitOpenUntil,
- };
- })
- );
+ const settled = await Promise.allSettled(
+ parsed.data.endpointIds.map(async (endpointId) => {
+ const { health } = await getEndpointHealthInfo(endpointId);
+ return {
+ endpointId,
+ circuitState: health.circuitState,
+ failureCount: health.failureCount,
+ circuitOpenUntil: health.circuitOpenUntil,
+ };
+ })
+ );
+
+ const results = settled
+ .filter((r): r is PromiseFulfilledResult<typeof r extends PromiseFulfilledResult<infer T> ? T : never> => r.status === "fulfilled")
+ .map((r) => r.value);🤖 Prompt for AI Agents
In `@src/actions/provider-endpoints.ts` around lines 587 - 597, The batch endpoint
health aggregation uses Promise.all which fails the whole batch if any
getEndpointHealthInfo(endpointId) rejects; change to Promise.allSettled over
parsed.data.endpointIds, then filter for entries with status === "fulfilled",
map their values to the existing shape ({ endpointId, circuitState,
failureCount, circuitOpenUntil }) and discard or log rejected entries so a
single failure doesn't abort the entire results array; ensure the variable names
(parsed.data.endpointIds, getEndpointHealthInfo, results) are preserved.
| // selector_error should NOT call getEndpointFilterStats (exception path, no data available) | ||
| // endpointFilterStats should be undefined for selector_error | ||
| expect(exhaustedItem!.endpointFilterStats).toBeUndefined(); |
There was a problem hiding this comment.
注释声明了意图但缺少对应断言。
第 601 行注释说明 selector_error 路径不应调用 getEndpointFilterStats,但没有对应的 mock 调用断言来验证。由于默认 mock 返回 null(第 6 行),若生产代码意外调用了该函数,endpointFilterStats 仍可能为 undefined(取决于 null 的处理方式),测试会静默通过而掩盖行为偏差。
建议添加显式断言
// selector_error should NOT call getEndpointFilterStats (exception path, no data available)
+ expect(mocks.getEndpointFilterStats).not.toHaveBeenCalled();
// endpointFilterStats should be undefined for selector_error
expect(exhaustedItem!.endpointFilterStats).toBeUndefined();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // selector_error should NOT call getEndpointFilterStats (exception path, no data available) | |
| // endpointFilterStats should be undefined for selector_error | |
| expect(exhaustedItem!.endpointFilterStats).toBeUndefined(); | |
| // selector_error should NOT call getEndpointFilterStats (exception path, no data available) | |
| expect(mocks.getEndpointFilterStats).not.toHaveBeenCalled(); | |
| // endpointFilterStats should be undefined for selector_error | |
| expect(exhaustedItem!.endpointFilterStats).toBeUndefined(); |
🤖 Prompt for AI Agents
In `@tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts` around lines 601 -
603, The test documents that the "selector_error" path should NOT call
getEndpointFilterStats but lacks an explicit mock call assertion; update the
selector_error case in proxy-forwarder-endpoint-audit.test.ts to assert the
mocked getEndpointFilterStats was not invoked (e.g.,
expect(getEndpointFilterStatsMock).not.toHaveBeenCalled() or
toHaveBeenCalledTimes(0)) after inspecting exhaustedItem for selector_error,
referencing the existing mock defined near the top of the test file and the
exhaustedItem variable to ensure the behavior is enforced rather than relying on
default mock return values.
| if (!res.ok) throw new Error(res.error); | ||
| return res; | ||
| }, | ||
| onSuccess: () => { | ||
| queryClient.invalidateQueries({ queryKey: ["endpoint-circuit-info"] }); | ||
| queryClient.invalidateQueries({ queryKey: ["provider-endpoints"] }); | ||
| toast.success(tStatus("resetCircuitSuccess")); |
There was a problem hiding this comment.
Query invalidation may miss
The circuit state query is keyed as ["endpoint-circuit-info", endpointIds.toSorted(...).join(",")], but the reset mutation only invalidates { queryKey: ["endpoint-circuit-info"] }. If your React Query version/config doesn’t treat this as a partial prefix match, the active query won’t be invalidated and badges can stay stale after reset. Consider invalidating with a predicate (keys starting with "endpoint-circuit-info") or invalidate the exact composed key you used for the query.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx
Line: 286:292
Comment:
**Query invalidation may miss**
The circuit state query is keyed as `["endpoint-circuit-info", endpointIds.toSorted(...).join(",")]`, but the reset mutation only invalidates `{ queryKey: ["endpoint-circuit-info"] }`. If your React Query version/config doesn’t treat this as a partial prefix match, the active query won’t be invalidated and badges can stay stale after reset. Consider invalidating with a predicate (keys starting with `"endpoint-circuit-info"`) or invalidate the exact composed key you used for the query.
How can I resolve this? If you propose a fix, please make it concise.| } | ||
| } | ||
|
|
||
| export async function batchGetEndpointCircuitInfo(input: unknown): Promise< | ||
| ActionResult< | ||
| Array<{ |
There was a problem hiding this comment.
Not actually batched read
batchGetEndpointCircuitInfo() still does Promise.all(endpointIds.map(getEndpointHealthInfo)), which is effectively an N+1 against the underlying store (often Redis/DB). With .max(500) this can create a large burst of reads and slow the settings page. This likely needs a true batch/multi-get path in getEndpointHealthInfo (or a dedicated batch helper) rather than per-endpoint calls.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/actions/provider-endpoints.ts
Line: 551:556
Comment:
**Not actually batched read**
`batchGetEndpointCircuitInfo()` still does `Promise.all(endpointIds.map(getEndpointHealthInfo))`, which is effectively an N+1 against the underlying store (often Redis/DB). With `.max(500)` this can create a large burst of reads and slow the settings page. This likely needs a true batch/multi-get path in `getEndpointHealthInfo` (or a dedicated batch helper) rather than per-endpoint calls.
How can I resolve this? If you propose a fix, please make it concise.| failureCount: number; | ||
| circuitOpenUntil: number | null; | ||
| }> | ||
| > | ||
| > { | ||
| try { | ||
| const session = await getAdminSession(); | ||
| if (!session) { |
There was a problem hiding this comment.
Hardcoded Chinese errors
This action returns Chinese-only error strings (e.g. "无权限执行此操作", "批量获取端点熔断状态失败"). If these surface in non-zh locales, UX becomes inconsistent versus the rest of the admin actions. Use the project’s standard error message strategy (shared constants / i18n / error codes only) so callers can present locale-appropriate text.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/actions/provider-endpoints.ts
Line: 559:566
Comment:
**Hardcoded Chinese errors**
This action returns Chinese-only error strings (e.g. `"无权限执行此操作"`, `"批量获取端点熔断状态失败"`). If these surface in non-zh locales, UX becomes inconsistent versus the rest of the admin actions. Use the project’s standard error message strategy (shared constants / i18n / error codes only) so callers can present locale-appropriate text.
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.| * Used for audit trail when all endpoints are exhausted (strict block). | ||
| * Returns null only when the raw endpoint query itself fails. | ||
| */ | ||
| export async function getEndpointFilterStats(input: { | ||
| vendorId: number; | ||
| providerType: ProviderType; | ||
| }): Promise<EndpointFilterStats> { | ||
| const endpoints = await findProviderEndpointsByVendorAndType(input.vendorId, input.providerType); | ||
| const total = endpoints.length; |
There was a problem hiding this comment.
Per-endpoint circuit checks
getEndpointFilterStats() calls isEndpointCircuitOpen(e.id) for every enabled endpoint via Promise.all. On vendors with many endpoints this will generate a burst of circuit-store reads right when strict blocking is happening (incident conditions), increasing latency/load. Prefer a batched circuit-state lookup (or reuse circuit state already computed during endpoint selection) to avoid amplifying incidents.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/provider-endpoints/endpoint-selector.ts
Line: 66:74
Comment:
**Per-endpoint circuit checks**
`getEndpointFilterStats()` calls `isEndpointCircuitOpen(e.id)` for every enabled endpoint via `Promise.all`. On vendors with many endpoints this will generate a burst of circuit-store reads right when strict blocking is happening (incident conditions), increasing latency/load. Prefer a batched circuit-state lookup (or reuse circuit state already computed during endpoint selection) to avoid amplifying incidents.
How can I resolve this? If you propose a fix, please make it concise.
Summary
Unify provider/endpoint circuit breaker visibility and notification semantics across the settings UI, proxy decision chain, and webhook alerts.
Problem
The system has three layers of circuit breakers (provider / endpoint / vendor-type), but the UI and decision chain only had full visibility into provider (key) level circuits. Endpoint-level circuit breakers had blind spots:
isEnabledfrom DB, not runtime circuit state -- users saw "enabled" for actually-unavailable endpointsendpoint-selectorsilently filtered circuit-broken endpoints without recording toProviderChainItem, making it impossible to diagnose why requests hungRelated Issues:
Solution
Unified Status Semantics (Foundation)
IncidentSourceenum (provider|endpoint) andresolveEndpointDisplayStatus()with priority: circuit-open > circuit-half-open > enabled/disabledbatchGetEndpointCircuitInfo()server action for bulk endpoint circuit state queriesSettings UI (Feature Surfaces)
Proxy Explainability (Decision Chain)
endpoint_pool_exhaustedreason toProviderChainItemwithstrictBlockCauseandendpointFilterStatsmetadataNotifications
CircuitBreakerAlertDatawithincidentSource,endpointId,endpointUrlcircuit-breaker-alert:{id}:endpoint:{endpointId}){{incident_source}},{{endpoint_id}},{{endpoint_url}}template placeholders for custom webhook templatesChanges
Core Changes
src/lib/endpoint-circuit-breaker.ts- Trigger endpoint circuit alerts on OPEN, addtriggerEndpointCircuitBreakerAlert()src/lib/provider-endpoints/endpoint-selector.ts- AddgetEndpointFilterStats()for audit trailsrc/app/v1/_lib/proxy/forwarder.ts- Recordendpoint_pool_exhaustedin provider chain with filter statssrc/app/v1/_lib/proxy/session.ts- AcceptstrictBlockCauseandendpointFilterStatsin chain metadatasrc/types/message.ts- ExtendProviderChainItemwithendpoint_pool_exhaustedreason and stats fieldssrc/lib/webhook/types.ts- ExtendCircuitBreakerAlertDatawith incident source fieldssrc/lib/notification/notifier.ts- Source-aware dedup keys and loggingsrc/lib/webhook/templates/circuit-breaker.ts- Endpoint-specific alert title/description/fieldssrc/lib/webhook/templates/placeholders.ts- New template variables for endpoint circuit eventsUI Changes
src/app/[locale]/settings/providers/_components/endpoint-status.ts- AddIncidentSource,EndpointDisplayStatus,resolveEndpointDisplayStatus()src/app/[locale]/settings/providers/_components/provider-endpoints-table.tsx- Batch-fetch circuit states, show badges, add reset actionsrc/app/[locale]/settings/providers/_components/provider-manager.tsx- UnifiedhasAnyCircuitOpen()for filter and countersrc/app/[locale]/settings/providers/_components/provider-rich-list-item.tsx- Layered "Key Circuit" / "Endpoint Circuit" badgessrc/app/[locale]/settings/providers/_components/provider-list.tsx- PassendpointCircuitInfopropsrc/app/[locale]/settings/providers/_components/provider-vendor-view.tsx- AcceptendpointCircuitInfopropsrc/actions/provider-endpoints.ts- AddbatchGetEndpointCircuitInfo()actionsrc/lib/utils/provider-chain-formatter.ts- Renderendpoint_pool_exhaustedin summary, description, and timelinei18n
Testing
Automated Tests
batchGetEndpointCircuitInfoaction (tests/unit/actions/provider-endpoints.test.ts)triggerEndpointCircuitBreakerAlert(tests/unit/lib/endpoint-circuit-breaker.test.ts)getEndpointFilterStats(tests/unit/lib/provider-endpoints/endpoint-selector.test.ts)tests/unit/proxy/proxy-forwarder-endpoint-audit.test.ts)resolveEndpointDisplayStatus(tests/unit/settings/providers/endpoint-status.test.ts)tests/unit/settings/providers/provider-manager.test.tsx)tests/unit/settings/providers/provider-vendor-view-circuit-ui.test.tsx)tests/unit/webhook/notifier-circuit-breaker.test.ts)tests/unit/webhook/templates/placeholders.test.ts)tests/unit/webhook/templates/templates.test.ts)src/lib/utils/provider-chain-formatter.test.ts)Manual Testing
Checklist
Description enhanced by Claude AI
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Greptile Overview
Greptile Summary
This PR unifies circuit-breaker visibility and semantics across the settings UI, proxy decision chain, and webhook notifications by adding an endpoint incident source, bulk endpoint circuit-state lookups, and a new
endpoint_pool_exhaustedprovider-chain reason with filter statistics.Key integrations:
Blocking issues to address before merge:
Promise.all), which can become a large burst (up to 500 calls) and undermine UI performance.Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram participant UI as Settings UI participant Actions as provider-endpoints actions participant Redis as Endpoint circuit store participant Proxy as ProxyForwarder participant Session as Proxy Session (ProviderChain) participant Webhook as Notifier/Webhook UI->>Actions: batchGetEndpointCircuitInfo({endpointIds[]}) Actions->>Redis: getEndpointHealthInfo(endpointId) (per id) Redis-->>Actions: {circuitState,failureCount,circuitOpenUntil} Actions-->>UI: map endpointId -> circuitState UI->>Actions: resetEndpointCircuit({endpointId}) Actions->>Redis: deleteEndpointCircuitState(endpointId) Actions-->>UI: ok Proxy->>Proxy: getPreferredProviderEndpoints(...) alt strict policy + no candidates Proxy->>Proxy: getEndpointFilterStats(vendorId, providerType) Proxy->>Session: addProviderToChain(reason=endpoint_pool_exhausted, strictBlockCause, endpointFilterStats?) Session-->>Proxy: chain updated end Proxy->>Redis: recordEndpointFailure(endpointId) alt circuit opens Proxy->>Webhook: triggerEndpointCircuitBreakerAlert(endpointId,...) Webhook->>Webhook: sendCircuitBreakerAlert(incidentSource=endpoint, endpointId, endpointUrl?) end