refactor(provider): improve provider page performance #789
Conversation
📝 Walkthrough

Introduces idle-DB-polling configuration for the endpoint prober; adds batch endpoint/probe-log and vendor-type statistics APIs; implements batch circuit-breaker state synchronization with TTL/LRU management on Redis; refactors probe scheduling to support idle polling and next-work hints; adds quota cost aggregation and lightweight usage-log queries; extends several frontend dashboard and settings components; and adds multiple indexes plus migration validation tooling.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Summary of Changes

Hello @ding113, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers a comprehensive set of performance improvements and architectural refinements. It focuses on optimizing data access patterns through new database indexes and refactored queries, enhancing the responsiveness of various user interfaces, and strengthening the reliability of background processes like migrations and endpoint probing. These changes collectively aim to provide a faster, more stable, and scalable application experience.

Highlights
Changelog
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini: You can request assistance from Gemini at any point by creating a comment using either
Customization: To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a

Limitations & Feedback: Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes
🧪 Test Results
Overall result: ✅ All tests passed
Code Review
This is an excellent pull request that introduces significant performance and robustness improvements across the application, particularly on the provider and usage-related pages. The changes are well-thought-out and demonstrate a deep understanding of both backend and frontend optimization techniques.
Key improvements include:
- Database Performance: New indexes and optimized queries (using `FILTER`, `LATERAL` joins, and reducing N+1 problems) will greatly improve database efficiency.
- API Batching: Many frontend components now batch requests for data like endpoint stats and circuit breaker information, which will reduce network overhead and improve page load times.
- UI Responsiveness: The use of lazy loading (`useInViewOnce`), request cancellation (`AbortController`), and other hooks enhances the user experience by making the UI faster and more robust.
- System Robustness: The addition of advisory locks for startup tasks and more granular cache invalidation strategies makes the system more reliable, especially in multi-instance deployments (see the sketch after this review).
I have reviewed the code changes in detail and found them to be of very high quality. I did not identify any issues or bugs. This is a fantastic contribution to the project's performance and stability.
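The "advisory locks for startup tasks" pattern mentioned above generally looks like the following minimal sketch. Assumptions: the `db` import path, the lock key, and the helper name are illustrative, not the repository's actual symbols.

```ts
import { sql } from "drizzle-orm";
import { db } from "@/drizzle/db"; // assumed import path

const STARTUP_LOCK_KEY = 734891; // hypothetical app-wide advisory lock id

// Runs a startup task on at most one instance; other instances skip it.
export async function runStartupTaskOnce(task: () => Promise<void>): Promise<boolean> {
  // pg_try_advisory_lock returns true only for the session that wins the key.
  const rows = await db.execute(
    sql`SELECT pg_try_advisory_lock(${STARTUP_LOCK_KEY}) AS acquired`
  );
  const acquired = Array.from(rows as Iterable<{ acquired: boolean }>)[0]?.acquired === true;
  if (!acquired) return false; // another instance holds the lock

  try {
    await task();
  } finally {
    // Release the session-level lock so later restarts can acquire it again.
    await db.execute(sql`SELECT pg_advisory_unlock(${STARTUP_LOCK_KEY})`);
  }
  return true;
}
```

In multi-instance deployments this ensures, for example, that a backfill or migration repair runs on exactly one instance per restart cycle.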
src/repository/provider.ts
Outdated
```ts
const promise = (async () => {
  // Use the providerId of the last providerChain entry to determine the final provider (handles retry switching)
  // If provider_chain is empty (no retries), fall back to the provider_id column
  const query = sql`
    WITH bounds AS (
      SELECT
        (DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) AT TIME ZONE ${timezone}) AS today_start,
        ((DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) + INTERVAL '1 day') AT TIME ZONE ${timezone}) AS tomorrow_start,
        ((DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) - INTERVAL '7 days') AT TIME ZONE ${timezone}) AS last7_start
    ),
    provider_stats AS (
      -- Aggregate by final provider first, then LEFT JOIN providers, avoiding a providers × today's-requests Cartesian product
      SELECT
        mr.final_provider_id,
        COALESCE(SUM(mr.cost_usd), 0) AS today_cost,
        COUNT(*)::integer AS today_calls
      FROM (
        SELECT
          CASE
            WHEN provider_chain IS NULL OR provider_chain = '[]'::jsonb THEN provider_id
            WHEN (provider_chain->-1->>'id') ~ '^[0-9]+$' THEN (provider_chain->-1->>'id')::int
            ELSE provider_id
          END AS final_provider_id,
          cost_usd
        FROM message_request
        WHERE deleted_at IS NULL
          AND (blocked_by IS NULL OR blocked_by <> 'warmup')
          AND created_at >= (SELECT today_start FROM bounds)
          AND created_at < (SELECT tomorrow_start FROM bounds)
      ) mr
      GROUP BY mr.final_provider_id
    ),
    latest_call AS (
      SELECT DISTINCT ON (final_provider_id)
        final_provider_id,
        created_at AS last_call_time,
        model AS last_call_model
      FROM (
        SELECT
          CASE
            WHEN provider_chain IS NULL OR provider_chain = '[]'::jsonb THEN provider_id
            WHEN (provider_chain->-1->>'id') ~ '^[0-9]+$' THEN (provider_chain->-1->>'id')::int
            ELSE provider_id
          END AS final_provider_id,
          id,
          created_at,
          model
        FROM message_request
        WHERE deleted_at IS NULL
          AND (blocked_by IS NULL OR blocked_by <> 'warmup')
          AND created_at >= (SELECT last7_start FROM bounds)
      ) mr
      -- Performance: restrict to a 7-day window (avoids scanning historical data)
      ORDER BY final_provider_id, created_at DESC, id DESC
    )
    SELECT
      p.id,
      COALESCE(
        SUM(CASE
          WHEN (mr.created_at AT TIME ZONE ${timezone})::date = (CURRENT_TIMESTAMP AT TIME ZONE ${timezone})::date
          AND (
            -- Case 1: no retries (provider_chain is NULL or an empty array), use provider_id
            (mr.provider_chain IS NULL OR jsonb_array_length(mr.provider_chain) = 0) AND mr.provider_id = p.id
            OR
            -- Case 2: retried, use the id of the last providerChain entry
            (mr.provider_chain IS NOT NULL AND jsonb_array_length(mr.provider_chain) > 0
              AND (mr.provider_chain->-1->>'id')::int = p.id)
          )
          THEN mr.cost_usd ELSE 0 END),
        0
      ) AS today_cost,
      COUNT(CASE
        WHEN (mr.created_at AT TIME ZONE ${timezone})::date = (CURRENT_TIMESTAMP AT TIME ZONE ${timezone})::date
        AND (
          (mr.provider_chain IS NULL OR jsonb_array_length(mr.provider_chain) = 0) AND mr.provider_id = p.id
          OR
          (mr.provider_chain IS NOT NULL AND jsonb_array_length(mr.provider_chain) > 0
            AND (mr.provider_chain->-1->>'id')::int = p.id)
        )
        THEN 1 END)::integer AS today_calls
      COALESCE(ps.today_cost, 0) AS today_cost,
      COALESCE(ps.today_calls, 0) AS today_calls,
      lc.last_call_time,
      lc.last_call_model
    FROM providers p
    -- Performance: add a date filter so only today's data is scanned (avoids a full table scan)
    LEFT JOIN message_request mr ON mr.deleted_at IS NULL
      AND (mr.blocked_by IS NULL OR mr.blocked_by <> 'warmup')
      AND mr.created_at >= (CURRENT_DATE AT TIME ZONE ${timezone})
    LEFT JOIN provider_stats ps ON p.id = ps.final_provider_id
    LEFT JOIN latest_call lc ON p.id = lc.final_provider_id
    WHERE p.deleted_at IS NULL
    GROUP BY p.id
    ),
    latest_call AS (
      SELECT DISTINCT ON (final_provider_id)
        -- Compute the final provider ID: prefer the id of the last providerChain entry
        CASE
          WHEN provider_chain IS NULL OR jsonb_array_length(provider_chain) = 0 THEN provider_id
          ELSE (provider_chain->-1->>'id')::int
        END AS final_provider_id,
        created_at AS last_call_time,
        model AS last_call_model
      FROM message_request
      -- Performance: restrict to a 7-day window (avoids scanning historical data)
      WHERE deleted_at IS NULL
        AND (blocked_by IS NULL OR blocked_by <> 'warmup')
        AND created_at >= (CURRENT_DATE AT TIME ZONE ${timezone} - INTERVAL '7 days')
      ORDER BY final_provider_id, created_at DESC
    )
    SELECT
      ps.id,
      ps.today_cost,
      ps.today_calls,
      lc.last_call_time,
      lc.last_call_model
    FROM provider_stats ps
    LEFT JOIN latest_call lc ON ps.id = lc.final_provider_id
    ORDER BY ps.id ASC
  `;

  logger.trace("getProviderStatistics:executing_query");

  const result = await db.execute(query);

  logger.trace("getProviderStatistics:result", {
    count: Array.isArray(result) ? result.length : 0,
  });
    ORDER BY p.id ASC
  `;

  logger.trace("getProviderStatistics:executing_query");

  const result = await db.execute(query);
  const data = Array.from(result) as ProviderStatisticsRow[];

  logger.trace("getProviderStatistics:result", {
    count: data.length,
  });

  // Note: today_cost in the result is numeric, represented as a string;
  // last_call_time is returned by the database as a (UTC) timestamp.
  // Left as-is here; display formatting is done by the caller.
  return result as unknown as Array<{
    id: number;
    today_cost: string;
    today_calls: number;
    last_call_time: Date | null;
    last_call_model: string | null;
  }>;
  // Note: today_cost in the result is numeric, represented as a string;
  // last_call_time is returned by the database as a (UTC) timestamp.
  // Left as-is here; display formatting is done by the caller.
  providerStatisticsCache = {
    timezone,
    expiresAt: Date.now() + PROVIDER_STATISTICS_CACHE_TTL_MS,
    data,
  };

  return data;
})();

providerStatisticsInFlight = { timezone, promise };
```
In-flight dedup race condition

The providerStatisticsInFlight is set at line 1224, after the promise has already been created and started executing at line 1135. Between promise creation and the assignment, any concurrent caller will pass the in-flight check at line 1131 and start a duplicate query, defeating the dedup.

Move the assignment before `await`; the full suggested replacement appears in the prompt below.
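For illustration, the single-flight pattern being requested reduces to the following sketch (illustrative names, not the repository's actual symbols):

```ts
type Stats = { id: number; today_cost: string };
declare function runExpensiveQuery(timezone: string): Promise<Stats[]>;

let inFlight: { timezone: string; promise: Promise<Stats[]> } | null = null;

function getStatsDeduped(timezone: string): Promise<Stats[]> {
  if (inFlight && inFlight.timezone === timezone) return inFlight.promise;

  const promise = (async () => {
    try {
      return await runExpensiveQuery(timezone);
    } finally {
      if (inFlight?.promise === promise) inFlight = null; // clear only our own entry
    }
  })();

  // Publish the in-flight record synchronously, before control can yield at an
  // await, so concurrent callers in the same tick reuse this promise.
  inFlight = { timezone, promise };
  return promise;
}
```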
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/repository/provider.ts
Line: 1135:1224
Comment:
**In-flight dedup race condition**
The `providerStatisticsInFlight` is set at line 1224, **after** the promise is already created and started executing at line 1135. Between promise creation and the assignment, any concurrent caller will pass the in-flight check at line 1131 and start a duplicate query, defeating the dedup.
Move the assignment before `await`:
```suggestion
const promise = (async () => {
// Use the providerId of the last providerChain entry to determine the final provider (handles retry switching)
// If provider_chain is empty (no retries), fall back to the provider_id column
const query = sql`
WITH bounds AS (
SELECT
(DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) AT TIME ZONE ${timezone}) AS today_start,
((DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) + INTERVAL '1 day') AT TIME ZONE ${timezone}) AS tomorrow_start,
((DATE_TRUNC('day', CURRENT_TIMESTAMP AT TIME ZONE ${timezone}) - INTERVAL '7 days') AT TIME ZONE ${timezone}) AS last7_start
),
provider_stats AS (
-- Aggregate by final provider first, then LEFT JOIN providers, avoiding a providers × today's-requests Cartesian product
SELECT
mr.final_provider_id,
COALESCE(SUM(mr.cost_usd), 0) AS today_cost,
COUNT(*)::integer AS today_calls
FROM (
SELECT
CASE
WHEN provider_chain IS NULL OR provider_chain = '[]'::jsonb THEN provider_id
WHEN (provider_chain->-1->>'id') ~ '^[0-9]+$' THEN (provider_chain->-1->>'id')::int
ELSE provider_id
END AS final_provider_id,
cost_usd
FROM message_request
WHERE deleted_at IS NULL
AND (blocked_by IS NULL OR blocked_by <> 'warmup')
AND created_at >= (SELECT today_start FROM bounds)
AND created_at < (SELECT tomorrow_start FROM bounds)
) mr
GROUP BY mr.final_provider_id
),
latest_call AS (
SELECT DISTINCT ON (final_provider_id)
final_provider_id,
created_at AS last_call_time,
model AS last_call_model
FROM (
SELECT
CASE
WHEN provider_chain IS NULL OR provider_chain = '[]'::jsonb THEN provider_id
WHEN (provider_chain->-1->>'id') ~ '^[0-9]+$' THEN (provider_chain->-1->>'id')::int
ELSE provider_id
END AS final_provider_id,
id,
created_at,
model
FROM message_request
WHERE deleted_at IS NULL
AND (blocked_by IS NULL OR blocked_by <> 'warmup')
AND created_at >= (SELECT last7_start FROM bounds)
) mr
-- Performance: restrict to a 7-day window (avoids scanning historical data)
ORDER BY final_provider_id, created_at DESC, id DESC
)
SELECT
p.id,
COALESCE(ps.today_cost, 0) AS today_cost,
COALESCE(ps.today_calls, 0) AS today_calls,
lc.last_call_time,
lc.last_call_model
FROM providers p
LEFT JOIN provider_stats ps ON p.id = ps.final_provider_id
LEFT JOIN latest_call lc ON p.id = lc.final_provider_id
WHERE p.deleted_at IS NULL
ORDER BY p.id ASC
`;
logger.trace("getProviderStatistics:executing_query");
const result = await db.execute(query);
const data = Array.from(result) as ProviderStatisticsRow[];
logger.trace("getProviderStatistics:result", {
count: data.length,
});
// Note: today_cost in the result is numeric, represented as a string;
// last_call_time is returned by the database as a (UTC) timestamp.
// Left as-is here; display formatting is done by the caller.
providerStatisticsCache = {
timezone,
expiresAt: Date.now() + PROVIDER_STATISTICS_CACHE_TTL_MS,
data,
};
return data;
})();
// Set in-flight BEFORE awaiting to prevent concurrent callers from starting duplicate queries
providerStatisticsInFlight = { timezone, promise };
```
How can I resolve this? If you propose a fix, please make it concise.

src/lib/provider-endpoints/endpoint-selector.ts

```diff
  if (endpoints.length <= 1) return endpoints;

- return enabled.slice().sort((a, b) => {
+ endpoints.sort((a, b) => {
```
In-place sort mutates parameter
endpoints.sort(...) mutates the input array. This is currently safe because rankProviderEndpoints always creates a fresh array via .filter() first. But rankActiveProviderEndpoints is a footgun if called directly with a shared reference in the future. Consider using endpoints.slice().sort(...) or Array.from(endpoints).sort(...) to avoid mutating the caller's array.
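The difference is easy to demonstrate (plain TypeScript, independent of the repository's code):

```ts
const endpoints = [{ priority: 2 }, { priority: 1 }];

// slice() copies first, so the caller's array is untouched.
const ranked = endpoints.slice().sort((a, b) => a.priority - b.priority);

console.log(endpoints[0].priority); // 2 — original order preserved
console.log(ranked[0].priority);    // 1 — sorted copy
```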
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/provider-endpoints/endpoint-selector.ts
Line: 20:20
Comment:
**In-place sort mutates parameter**
`endpoints.sort(...)` mutates the input array. This is currently safe because `rankProviderEndpoints` always creates a fresh array via `.filter()` first. But `rankActiveProviderEndpoints` is a footgun if called directly with a shared reference in the future. Consider using `endpoints.slice().sort(...)` or `Array.from(endpoints).sort(...)` to avoid mutating the caller's array.
How can I resolve this? If you propose a fix, please make it concise.
Additional Comments (1)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/lib/endpoint-circuit-breaker.ts
Line: 17:17
Comment:
**`rankActiveProviderEndpoints` mutates input array**
`rankActiveProviderEndpoints` sorts the `endpoints` array in place (line 20 of `endpoint-selector.ts`). Its caller `rankProviderEndpoints` creates a new array via `.filter()`, so that's fine. However, the hot-path `getPreferredProviderEndpoints` passes `circuitCandidates` (which may be the *same* reference as the `endpoints` array returned from the DB query when `excludeSet` is null) through `rankProviderEndpoints` -> `rankActiveProviderEndpoints`. Since `rankProviderEndpoints` always calls `.filter()` first, this is currently safe, but the in-place sort inside `rankActiveProviderEndpoints` is fragile and could silently corrupt the caller's data if refactored.
This is a minor observation - the redundant `filter` in `rankProviderEndpoints` when called from `getPreferredProviderEndpoints` (where endpoints are already known to be enabled) actually serves as a protective copy. Worth being aware of for future refactoring.
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In `@src/actions/provider-endpoints.ts`:
- Around line 416-430: The user-facing error strings in provider-endpoints
handlers (e.g., the branch using isDirectEndpointEditConflictError and
isForeignKeyViolationError) are hardcoded Chinese; replace them with i18n keys
or remove the message and return only errorCode so the frontend handles
localization. Update the return objects in functions/methods that use
isDirectEndpointEditConflictError, isForeignKeyViolationError (and other similar
branches at the noted ranges) to set either error:
"i18n.provider.endpoint_conflict" / "i18n.provider.not_found" (or another agreed
key) or omit error and rely on ERROR_CODES.CONFLICT / ERROR_CODES.NOT_FOUND
being returned; ensure the same change is applied for the other occurrences
referenced (around lines 554-567, 749-754, 818-823) so no hardcoded display
strings remain.
In `@src/app/[locale]/dashboard/users/users-page-client.tsx`:
- Around line 80-89: The current debouncing keys pendingTagFiltersKey and
pendingKeyGroupFiltersKey build strings via sort().join("|"), which can collide
if tag values contain "|"—update the key generation to use a collision-safe
encoding (e.g., sort the arrays and then use JSON.stringify on the sorted
arrays, or join with a null character "\0", or escape values with
encodeURIComponent) before passing into useDebounce; change the expressions that
compute pendingTagFiltersKey and pendingKeyGroupFiltersKey (which derive from
pendingTagFilters and pendingKeyGroupFilters) accordingly so useDebounce
receives a stable, unambiguous key.
In `@src/lib/migrate.ts`:
- Line 77: The code currently uses the internal helper readMigrationFiles
(referenced as readMigrationFiles) from drizzle-orm which is undocumented and
unstable; replace its usage with the documented driver-specific migrate API
(e.g., migrate from drizzle-orm/postgres-js/migrator) by wiring your DB client
into migrate(db, config) and using its returned results instead of
readMigrationFiles output (drop reliance on hash/folderMillis from
readMigrationFiles and map to the migrate result shape); update imports to use
the official migrator (migrate) and adjust any call sites in this module (e.g.,
where migrations is used) to the migrate() promise/result.
In `@src/lib/provider-endpoints/probe-scheduler.ts`:
- Around line 32-39: The code only applies the BASE_INTERVAL_MS upper bound when
using the default, but allows the env var
ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS to exceed BASE_INTERVAL_MS; update the
IDLE_DB_POLL_INTERVAL_MS computation to parse the env var via
parseIntWithDefault and then clamp the resulting value between 1 and
BASE_INTERVAL_MS (e.g., use Math.max(1, Math.min(parsedValue,
BASE_INTERVAL_MS))) so the effective poll interval respects the comment; update
references to DEFAULT_IDLE_DB_POLL_INTERVAL_MS, IDLE_DB_POLL_INTERVAL_MS,
BASE_INTERVAL_MS, parseIntWithDefault, and
process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS accordingly.
In `@src/repository/leaderboard.ts`:
- Around line 163-170: buildDateCondition currently interpolates
dateRange.startDate/endDate directly into SQL which allows invalid strings (e.g.
"not-a-date") to reach PostgreSQL and cause runtime errors; add defensive
validation of DateRangeParams (ensure YYYY-MM-DD format and valid calendar date)
before building SQL (either inside buildDateCondition or at the caller
findCustomRangeLeaderboard) and reject/throw a clear validation error for
invalid inputs. Use a strict YYYY-MM-DD regex and/or Date parsing to confirm the
values are valid dates, reference the symbols dateRange.startDate,
dateRange.endDate, buildDateCondition, findCustomRangeLeaderboard,
messageRequest.createdAt and timezone when implementing the check, and only
construct the SQL when validation passes.
In `@src/repository/overview.ts`:
- Around line 143-144: The yesterday query uses a closed interval (gte + lte)
while the today query uses a half-open interval (gte + lt), causing asymmetric
comparison; update the yesterday time-window condition that currently uses
lte(messageRequest.createdAt, yesterdayEnd) to use lt(messageRequest.createdAt,
yesterdayEnd) so both windows use gte + lt for messageRequest.createdAt (refer
to the existing today window using todayStart/tomorrowStart for consistency).
In `@src/repository/provider-endpoints-batch.ts`:
- Around line 68-173: The function findProviderEndpointProbeLogsBatch allows
input.limitPerEndpoint to be NaN which makes Math.max(1, NaN) return NaN and
produces an invalid SQL LIMIT; fix by validating and coercing
input.limitPerEndpoint to a safe positive integer before use (e.g. check
Number.isFinite(input.limitPerEndpoint), fallback to 1, use
Math.floor/Math.trunc to convert to an integer and enforce >=1), assign that
sanitized value to limitPerEndpoint and use it in the SQL and later checks so
LIMIT never receives NaN (see the sketch after this list).
In `@src/repository/provider.ts`:
- Around lines 748-752: In deleteProvider's transactional update (tx.update(providers).set({
deletedAt: now })), also update the updatedAt field so the behavior matches the bulk-delete branch: add updatedAt:
now inside .set(...), and wherever the audit return value is needed (e.g. .returning(...)), also return providers.updatedAt so audit timestamps stay consistent and traceable.
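For the `limitPerEndpoint` sanitization described in the prompts above, one possible shape is sketched below (illustrative, not the repository's code):

```ts
// Coerce an arbitrary input to a safe positive integer for SQL LIMIT.
function sanitizeLimit(value: unknown, fallback = 1): number {
  const n = typeof value === "number" ? value : Number(value);
  if (!Number.isFinite(n)) return fallback; // rejects NaN, Infinity, undefined
  return Math.max(1, Math.trunc(n));        // integer, at least 1
}

sanitizeLimit(10.7);       // 10
sanitizeLimit(Number.NaN); // 1
sanitizeLimit(undefined);  // 1
```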
🧹 Nitpick comments (27)

tests/unit/settings/providers/endpoint-latency-sparkline-ui.test.tsx (1)

92-95: The placeholder selector is brittle; switch to a stable identifier. The test currently selects on `bg-muted/*` class names, so styling changes will produce false failures. Add a `data-testid` (or a semantic role) to the placeholder element and query by that instead.

src/app/api/availability/endpoints/route.ts (1)

22-47: This API route does not use the Hono framework. Per the coding guidelines, API routes under `src/app/api/**/*.{ts,tsx}` should use Hono. The file currently uses `NextRequest`/`NextResponse` directly; it is pre-existing code that this PR does not touch, so consider migrating it in a later iteration. As per coding guidelines: "API routes should use Hono framework and follow Next.js 16 App Router conventions".

src/repository/overview.ts (1)

44-48: The timezone boundary computation is duplicated across two functions; extract a shared helper. The `nowLocal`, `todayStartLocal`, `todayStart`, and `tomorrowStart` calculations in `getOverviewMetrics` (lines 44-48) and `getOverviewMetricsWithComparison` (lines 96-104) are identical. Extract an internal helper that returns these SQL expressions to reduce duplication and keep both call sites in sync under future changes.

Example refactor

```ts
function buildDayBoundaries(timezone: string) {
  const nowLocal = sql`CURRENT_TIMESTAMP AT TIME ZONE ${timezone}`;
  const todayStartLocal = sql`DATE_TRUNC('day', ${nowLocal})`;
  const todayStart = sql`(${todayStartLocal} AT TIME ZONE ${timezone})`;
  const tomorrowStart = sql`((${todayStartLocal} + INTERVAL '1 day') AT TIME ZONE ${timezone})`;
  return { nowLocal, todayStartLocal, todayStart, tomorrowStart };
}
```

Also applies to: 96-104

src/repository/leaderboard.ts (1)

175-197: The date-window construction for calendar periods (daily/weekly/monthly) is duplicated; a shared helper would remove it. The three branches follow an identical pattern, differing only in the `DATE_TRUNC` unit and `INTERVAL` value.

♻️ Suggested refactor

```diff
+function buildTruncatedWindowCondition(
+  unit: "day" | "week" | "month",
+  nowLocal: ReturnType<typeof sql>,
+  timezone: string,
+) {
+  const startLocal = sql`DATE_TRUNC(${unit}, ${nowLocal})`;
+  const endExclusiveLocal = sql`${startLocal} + INTERVAL ${`1 ${unit}`}`;
+  const start = sql`(${startLocal} AT TIME ZONE ${timezone})`;
+  const endExclusive = sql`(${endExclusiveLocal} AT TIME ZONE ${timezone})`;
+  return sql`${messageRequest.createdAt} >= ${start} AND ${messageRequest.createdAt} < ${endExclusive}`;
+}
+
 // Then in the switch:
-  case "daily": {
-    const startLocal = sql`DATE_TRUNC('day', ${nowLocal})`;
-    const endExclusiveLocal = sql`${startLocal} + INTERVAL '1 day'`;
-    const start = sql`(${startLocal} AT TIME ZONE ${timezone})`;
-    const endExclusive = sql`(${endExclusiveLocal} AT TIME ZONE ${timezone})`;
-    return sql`${messageRequest.createdAt} >= ${start} AND ${messageRequest.createdAt} < ${endExclusive}`;
-  }
-  case "weekly": {
-    ...
-  }
-  case "monthly": {
-    ...
-  }
+  case "daily":
+    return buildTruncatedWindowCondition("day", nowLocal, timezone);
+  case "weekly":
+    return buildTruncatedWindowCondition("week", nowLocal, timezone);
+  case "monthly":
+    return buildTruncatedWindowCondition("month", nowLocal, timezone);
```

Note: the arguments to `DATE_TRUNC` and `INTERVAL` are SQL keywords/literals, so confirm how Drizzle's `sql` template handles string parameters. PostgreSQL requires the first argument of `DATE_TRUNC` to be a string literal, and the value in `INTERVAL '1 day'` must also be a literal. If Drizzle parameterizes these as `$1`, PostgreSQL may reject the query; the actual refactor may need `sql.raw()` to inline the values (they are hard-coded constants here, so there is no injection risk).

src/app/[locale]/dashboard/users/users-page-client.tsx (1)

201-221: The bidirectional sync effects are correct but cognitively heavy. Lines 201-207 sync `tagFilters` → `pendingTagFilters`, while lines 209-221 apply `pendingTagFilters` → `tagFilters` back through the debounced key. These two effect groups, plus `handleTagCommit`/`handleApplyFilters`, form three separate write paths into `tagFilters`, which raises the cost of understanding and maintenance. Consider later consolidating the "pending → applied" sync into a single mechanism (e.g. keep only explicit commit, or only debounced auto-apply) to reduce the complexity of implicit state flow. For context, a typical `useDebounce` implementation is sketched below.
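A common `useDebounce` shape, shown for context; this is an assumption, and the project's actual hook may differ:

```ts
import { useEffect, useState } from "react";

export function useDebounce<T>(value: T, delayMs: number): T {
  const [debounced, setDebounced] = useState(value);

  useEffect(() => {
    // Restart the timer whenever the value changes; only the last value
    // within the delay window is published.
    const timer = setTimeout(() => setDebounced(value), delayMs);
    return () => clearTimeout(timer);
  }, [value, delayMs]);

  return debounced;
}
```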
src/app/[locale]/settings/providers/_components/forms/provider-form/index.tsx (1)

722-722: Remove the redundant default export. Line 644 already has the named export `export function ProviderForm`, so the `export default` here is redundant. Per the coding guidelines, prefer named exports.

Suggested change

```diff
-export default ProviderForm;
```

As per coding guidelines, `**/*.{ts,tsx}`: Prefer named exports over default exports.

src/app/[locale]/dashboard/availability/_components/endpoint/probe-grid.tsx (1)

107-108: The display name in the tooltip is inconsistent with the card title. Line 129 uses `displayName` (label → hostname → url), but the tooltip at line 176 still uses `endpoint.label || endpoint.url`, skipping the hostname fallback. When `label` is empty, the card title shows the hostname while the tooltip shows the full URL. If this is intentional (the tooltip showing fuller information), ignore this; otherwise unify them:

Suggested change

```diff
- <p className="font-medium">{endpoint.label || endpoint.url}</p>
+ <p className="font-medium">{displayName}</p>
```

Also applies to: 129-129, 176-176

src/app/[locale]/settings/providers/_components/provider-endpoint-hover.tsx (2)

438-441: The translation key `keyLoading` is semantically off. What loads here is endpoint data, yet `t("keyLoading")` is used; "key" implies API-key loading, which does not match this context (the endpoint list loading). Use a more fitting key such as `endpointStatus.loading`, or add a dedicated one.

248-294: The concurrency control on the fallback path is well considered, but add logging. When `batchGetVendorTypeEndpointStats` throws, the code degrades to per-vendor queries (concurrency capped at 8). Add a `console.warn` or `logger.warn` in the catch block so the cause of the degradation can be diagnosed; the current `catch {}` silently swallows the error.

Suggested change

```diff
-  } catch {
-    // Fallback path: query per vendorId when the batch action fails. Concurrency is capped to avoid a request storm with large chunks.
+  } catch (batchError) {
+    // Fallback path: query per vendorId when the batch action fails. Concurrency is capped to avoid a request storm with large chunks.
+    console.warn("[VendorStatsBatcher] batch fetch failed, falling back to per-vendor queries", batchError);
```

src/lib/migrate.ts (1)

68-134: The row-by-row UPDATE in `repairDrizzleMigrationsCreatedAt` is fine in most scenarios, but batching is an option. Repairs usually touch only a few rows, so per-row UPDATEs (lines 123-129) are acceptable. If journal entries grow significantly in the future, switch to a single `UPDATE ... FROM (VALUES ...)` to cut round trips. Not a merge blocker.

src/instrumentation.ts (1)

413-442: The development-environment backfill does not take an advisory lock, unlike production. Development usually runs a single instance, so skipping the lock causes no real problem here. But if multi-instance debugging is later enabled in development (e.g. Docker Compose), duplicate backfills could occur. Either keep the environments consistent or document the reason for the difference in a comment.

src/lib/hooks/use-in-view-once.ts (1)

159-179: The threshold's "serialize → deserialize" round trip carries a tiny precision risk. `threshold` is first converted to the string `thresholdKey` (lines 159-164), then converted back to a number via `parseFloat` inside `useMemo` (lines 170-178). Common real-world thresholds (0, 0.5, 1, etc.) are unaffected, but for a high-precision float such as `0.1 + 0.2`, the result of `parseFloat("0.30000000000000004")` may differ slightly from the original value. The risk is extremely low for the current usage; noted for awareness. A minimal sketch of the underlying fire-once hook follows.
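A minimal fire-once in-view hook, for reference, built on the browser's IntersectionObserver API. This is a sketch of the general idea; the repository's `useInViewOnce` adds threshold-key stabilization on top of it.

```ts
import { useEffect, useRef, useState } from "react";

export function useInViewOnceSketch<T extends Element>(threshold = 0) {
  const ref = useRef<T | null>(null);
  const [inView, setInView] = useState(false);

  useEffect(() => {
    const el = ref.current;
    if (!el || inView) return; // already latched: nothing to observe

    const observer = new IntersectionObserver(
      (entries) => {
        if (entries.some((e) => e.isIntersecting)) {
          setInView(true); // latch once; never resets to false
          observer.disconnect();
        }
      },
      { threshold }
    );
    observer.observe(el);
    return () => observer.disconnect();
  }, [threshold, inView]);

  return { ref, inView };
}
```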
src/app/[locale]/settings/providers/_components/endpoint-latency-sparkline.tsx (1)

127-144: The module-level mutable state `isBatchProbeLogsEndpointAvailable` and `batchProbeLogsEndpointDisabledAt` lacks a concurrency-safety note. Browser JS is single-threaded, so sharing these variables across all component instances is reasonable. But the defensive logic in `isBatchProbeLogsDisabled()` at lines 134-136 (resetting `available` when `disabledAt` is null) hints that an inconsistent state may have occurred in the past. Add a short comment describing the intended lifecycle and reset conditions of these variables.

src/app/[locale]/dashboard/availability/_components/endpoint/endpoint-tab.tsx (1)

258-304: The focus/visibility refresh throttling is sound. `focus` and `visibilitychange` often fire together when switching back to the tab, and the 2-second throttle avoids doubled requests. Refreshing the vendor with `silent: true` avoids a flashing loading state — a nice touch for the user experience. One small suggestion: `void refresh()` inside `refreshThrottled` has no catch, so an unexpected exception from an awaited call inside `refresh()` becomes an unhandled promise rejection. Add a `.catch` at the end of `refresh` or at the call site.

Suggested: catch errors at the refresh call site

```diff
 const refreshThrottled = () => {
   const now = Date.now();
   if (now - lastFocusRefreshAtRef.current < 2000) return;
   lastFocusRefreshAtRef.current = now;
-  void refresh();
+  void refresh().catch((err) => {
+    console.error("[EndpointTab] Background refresh failed:", err);
+  });
 };
```

src/repository/statistics.ts (1)

1079-1140: `sumUserQuotaCosts` and `sumKeyQuotaCostsById` are highly duplicated; extract a shared query builder. The two functions differ only in the WHERE condition (`userId` vs `key`); the `scanStart`/`scanEnd` computation, FILTER clause construction, and result mapping are identical. An internal helper taking the filter condition would remove roughly 60 duplicated lines. Not a merge blocker; a follow-up suggestion.

Example: extract a shared helper

```ts
// Internal helper
async function sumQuotaCostsInternal(
  filterCondition: SQL,
  ranges: QuotaCostRanges,
  maxAgeDays: number,
): Promise<QuotaCostSummary> {
  // ... shared scanStart/scanEnd/costTotal/query logic ...
  // The WHERE clause uses filterCondition instead of a concrete eq(userId) or eq(key)
}

export async function sumUserQuotaCosts(userId: number, ranges: QuotaCostRanges, maxAgeDays = 365) {
  return sumQuotaCostsInternal(eq(messageRequest.userId, userId), ranges, maxAgeDays);
}

export async function sumKeyQuotaCostsById(keyId: number, ranges: QuotaCostRanges, maxAgeDays = 365) {
  const keyString = await getKeyStringByIdCached(keyId);
  if (!keyString) return { cost5h: 0, costDaily: 0, costWeekly: 0, costMonthly: 0, costTotal: 0 };
  return sumQuotaCostsInternal(eq(messageRequest.key, keyString), ranges, maxAgeDays);
}
```

Also applies to: 1145-1211

src/drizzle/schema.ts (2)

323-329: The `provider_vendor_id IS NOT NULL` condition is redundant. The `providerVendorId` column is declared `.notNull()` (lines 160-161), so the `provider_vendor_id IS NOT NULL` clause in the index WHERE condition is always true. It does not affect correctness, but it will confuse later maintainers (by implying the column could be NULL).

Suggested: remove the redundant condition

```diff
 providersEnabledVendorTypeIdx: index('idx_providers_enabled_vendor_type').on(
   table.providerVendorId, table.providerType
 ).where(
-  sql`${table.deletedAt} IS NULL AND ${table.isEnabled} = true AND ${table.providerVendorId} IS NOT NULL AND ${table.providerVendorId} > 0`
+  sql`${table.deletedAt} IS NULL AND ${table.isEnabled} = true AND ${table.providerVendorId} > 0`
 ),
```

477-524: The `message_request` table carries a large number of indexes; watch the write-performance impact. The table now has roughly 15 indexes (including the 6 added here). As a high-write table (already noted in comments), every index adds INSERT/UPDATE overhead. Suggestions:

- Periodically monitor actual index usage via `pg_stat_user_indexes` (the `idx_scan` counter) and remove indexes that queries never hit (see the sketch after this comment).
- Note the prefix overlap between `messageRequestKeyIdx` (line 495, single-column `key`) and `messageRequestKeyCreatedAtIdIdx` (lines 497-501, `key + created_at + id`): the composite index already covers pure `key` equality lookups, so the former may be a redundant index.
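The monitoring query suggested above can be issued through the same Drizzle `sql` tag used elsewhere in the repo; the `db` import path below is an assumption:

```ts
import { sql } from "drizzle-orm";
import { db } from "@/drizzle/db"; // assumed import path

// idx_scan = 0 means the planner has never used the index since stats were last reset.
export async function findUnusedMessageRequestIndexes() {
  return db.execute(sql`
    SELECT indexrelname, idx_scan, pg_relation_size(indexrelid) AS size_bytes
    FROM pg_stat_user_indexes
    WHERE relname = 'message_request'
    ORDER BY idx_scan ASC, size_bytes DESC
  `);
}
```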
200-268: 批量 Redis 同步的 in-flight 去重设计良好,但存在重复去重
syncHealthFromRedisBatch内部对endpointIds做了new Set()去重(第 201 行),而调用方getAllEndpointHealthStatusAsync在第 280 行已经做过一次去重,且needsRefresh是从去重后的数组 filter 出来的。虽然不影响正确性,但属于冗余操作。可选:信任调用方已去重
-async function syncHealthFromRedisBatch(endpointIds: readonly number[], refreshNow: number) { - const uniqueEndpointIds = Array.from(new Set(endpointIds)); - const toLoad: number[] = []; +async function syncHealthFromRedisBatch(endpointIds: readonly number[], refreshNow: number) { + const toLoad: number[] = []; const waitPromises: Promise<void>[] = []; - for (const endpointId of uniqueEndpointIds) { + for (const endpointId of endpointIds) {src/lib/provider-endpoints/probe.ts (1)
4-4:getEnvConfig此处使用静态导入,与文件中其他模块一致注意到
endpoint-circuit-breaker.ts中使用的是动态await import("@/lib/config/env.schema"),而此处是静态导入。两种方式在此场景下均可正常工作,但如果项目有统一的导入风格偏好,建议保持一致。src/app/[locale]/dashboard/availability/_components/availability-dashboard.tsx (1)
124-124: 代码注释建议使用英文,便于国际化协作。当前注释为中文。考虑到项目支持 5 种语言(zh-CN, zh-TW, en, ja, ru),建议将代码注释统一为英文,以便更广泛的贡献者理解代码意图。
建议修改
- // 当页面从后台回到前台时,做一次节流刷新,避免看到陈旧数据;同时配合 visibility 判断减少后台请求。 + // Throttled refresh when tab regains focus/visibility to avoid stale data; skips background requests.src/lib/redis/client.ts (1)
98-114:getRedisClient内部与buildRedisOptionsForUrl存在配置重复。
getRedisClient(Lines 98-114)和buildRedisOptionsForUrl(Lines 45-74)各自独立定义了retryStrategy、enableOfflineQueue、maxRetriesPerRequest等相同的配置。可以考虑在getRedisClient内部复用buildRedisOptionsForUrl以消除重复。这不是本次变更引入的问题,可作为后续重构处理。
建议修改
try { - const useTls = redisUrl.startsWith("rediss://"); - - // 1. 定义基础配置 - const redisOptions: RedisOptions = { - enableOfflineQueue: false, - maxRetriesPerRequest: 3, - retryStrategy(times) { - if (times > 5) { - logger.error("[Redis] Max retries reached, giving up"); - return null; - } - const delay = Math.min(times * 200, 2000); - logger.warn(`[Redis] Retry ${times}/5 after ${delay}ms`); - return delay; - }, - }; - - // 2. 如果使用 rediss://,则添加显式的 TLS 配置 - if (useTls) { - const raw = process.env.REDIS_TLS_REJECT_UNAUTHORIZED?.trim(); - const rejectUnauthorized = raw !== "false" && raw !== "0"; - logger.info("[Redis] Using TLS connection (rediss://)", { - redisUrl: safeRedisUrl, - rejectUnauthorized, - }); - redisOptions.tls = buildTlsConfig(redisUrl); - } + const { isTLS, options: redisOptions } = buildRedisOptionsForUrl(redisUrl); + + if (isTLS) { + logger.info("[Redis] Using TLS connection (rediss://)", { + redisUrl: safeRedisUrl, + }); + } - redisClient = new Redis(redisUrl, redisOptions); + redisClient = new Redis(redisUrl, redisOptions as RedisOptions);Also applies to: 45-74
tests/unit/actions/my-usage-token-aggregation.test.ts (1)
151-151: 选择数量断言放宽至>= 1,与getMyTodayStats的查询合并一致。从
toBeGreaterThanOrEqual(2)降为toBeGreaterThanOrEqual(1),反映了查询优化。但与getMyStatsSummary测试中使用的精确断言toHaveLength(1)风格不同。如果getMyTodayStats的查询数量也已确定,建议使用精确断言以避免测试在回归时过于宽松。src/lib/provider-endpoints/probe-log-cleanup.ts (1)
112-115: 丢失领导锁时未记录已删除的行数。当
leadershipLost为 true 时,return直接跳到finally,跳过了 line 133-138 的totalDeleted日志。如果清理过程中途失去锁,已完成的删除量不会被记录,不利于运维排查。建议在 return 前记录已删除的行数
if (leadershipLost) { + if (totalDeleted > 0) { + logger.info("[EndpointProbeLogCleanup] Partial cleanup before leadership lost", { + retentionDays: RETENTION_DAYS, + totalDeleted, + }); + } return; }tests/unit/repository/statistics-quota-costs-all-time.test.ts (1)
100-131:sumKeyQuotaCostsById测试中capturedSelectFields的捕获时机需确认。
capturedSelectFields = fields在wheremock 内部(line 116)赋值,但fields来自外层selectmock 的闭包参数。第二次select()调用时currentCallIndex === 2,此时fields确实是 cost 查询的 select 字段。不过这种依赖闭包捕获 + 调用次序的测试模式较脆弱——如果sumKeyQuotaCostsById内部实现调整了查询顺序,测试会静默失败或给出误导性结果。考虑在未来重构时将
capturedSelectFields赋值移到selectmock 层级,配合更明确的断言来降低脆弱性。src/actions/my-usage.ts (3)
249-290: 配额查询重构合理,并行化良好。将
sumKeyQuotaCostsById和sumUserQuotaCosts与 session 计数并行执行是正确的优化方向。解构赋值清晰地映射了各周期费用字段。需注意
ALL_TIME_MAX_AGE_DAYS = Infinity会导致costTotal扫描该 key/user 的全部历史数据(Number.isFinite(Infinity)为false,故cutoffDate为null)。对于长期活跃的高频 key,这个全量扫描可能在数据量增长后成为瓶颈。建议后续考虑是否需要为costTotal设置合理的上限天数,或引入物化视图/缓存层来加速全量聚合。
386-404: 在.map()内部通过副作用累加 totals 不够清晰。当前在
.map()回调中同时执行了转换和累加操作,.map()语义上应为纯转换。建议拆分为reduce或先forEach累加再.map()转换,使意图更明确。可选重构:分离累加与映射
- let totalCalls = 0; - let totalInputTokens = 0; - let totalOutputTokens = 0; - let totalCostUsd = 0; - - const modelBreakdown = breakdown.map((row) => { - const billingModel = billingModelSource === "original" ? row.originalModel : row.model; - const rawCostUsd = Number(row.costUsd ?? 0); - const costUsd = Number.isFinite(rawCostUsd) ? rawCostUsd : 0; - - totalCalls += row.calls ?? 0; - totalInputTokens += row.inputTokens ?? 0; - totalOutputTokens += row.outputTokens ?? 0; - totalCostUsd += costUsd; - - return { - model: row.model, - billingModel, - calls: row.calls, - costUsd, - inputTokens: row.inputTokens, - outputTokens: row.outputTokens, - }; - }); + const modelBreakdown = breakdown.map((row) => { + const billingModel = billingModelSource === "original" ? row.originalModel : row.model; + const rawCostUsd = Number(row.costUsd ?? 0); + const costUsd = Number.isFinite(rawCostUsd) ? rawCostUsd : 0; + return { + model: row.model, + billingModel, + calls: row.calls, + costUsd, + inputTokens: row.inputTokens, + outputTokens: row.outputTokens, + }; + }); + + const totalCalls = modelBreakdown.reduce((sum, r) => sum + (r.calls ?? 0), 0); + const totalInputTokens = modelBreakdown.reduce((sum, r) => sum + (r.inputTokens ?? 0), 0); + const totalOutputTokens = modelBreakdown.reduce((sum, r) => sum + (r.outputTokens ?? 0), 0); + const totalCostUsd = modelBreakdown.reduce((sum, r) => sum + r.costUsd, 0);
685-697:keyModelBreakdown的二次排序是必要的。数据库查询按
sum(costUsd) DESC(即 user 维度的总 cost)排序,但keyOnlyBreakdown经过过滤后需要按 key 维度的 cost 重新排序,此处的.sort((a, b) => b.cost - a.cost)是正确的。有一个小注意点:如果
Number(row.keyCost)在极端情况下返回NaN,sort比较会不稳定。当前通过COALESCE(..., 0)确保 DB 返回合法数值字符串,实际风险极低。如需额外防御,可在cost赋值时加Number.isFinite守卫(类似summaryAcc中 line 644 的处理方式)。可选:为 breakdown cost 添加 NaN 守卫
keyModelBreakdown: keyOnlyBreakdown .map((row) => ({ model: row.model, requests: row.keyRequests, - cost: Number(row.keyCost ?? 0), + cost: (() => { const c = Number(row.keyCost ?? 0); return Number.isFinite(c) ? c : 0; })(), inputTokens: row.keyInputTokens,同理
userModelBreakdown中 line 701 的cost字段也可做相同处理。
```ts
if (isDirectEndpointEditConflictError(error)) {
  return {
    ok: false,
    error: "端点 URL 与同供应商类型下的其他端点冲突",
    errorCode: ERROR_CODES.CONFLICT,
  };
}

if (isForeignKeyViolationError(error)) {
  return {
    ok: false,
    error: "供应商不存在",
    errorCode: ERROR_CODES.NOT_FOUND,
  };
}
```
The newly added user-facing messages are still hardcoded Chinese.

These return values are user-visible copy; convert them to i18n keys, or return only the errorCode and let the frontend translate them uniformly.

As per coding guidelines "All user-facing strings must use i18n (5 languages supported: zh-CN, zh-TW, en, ja, ru). Never hardcode display text".

Also applies to: 554-567, 749-754, 818-823
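A sketch of the errorCode-only approach, with hypothetical i18n keys (none of these names are the repository's actual symbols):

```ts
const ERROR_CODE_I18N_KEYS = {
  CONFLICT: "provider.endpoint_conflict", // hypothetical key
  NOT_FOUND: "provider.not_found",        // hypothetical key
} as const;

type KnownErrorCode = keyof typeof ERROR_CODE_I18N_KEYS;

// The server action returns only the machine-readable code...
function toActionError(code: KnownErrorCode) {
  return { ok: false as const, errorCode: code };
}

// ...and the client resolves the display text through its translator.
function resolveErrorMessage(t: (key: string) => string, code: KnownErrorCode): string {
  return t(ERROR_CODE_I18N_KEYS[code]);
}
```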
🤖 Prompt for AI Agents
In `@src/actions/provider-endpoints.ts` around lines 416 - 430, The user-facing
error strings in provider-endpoints handlers (e.g., the branch using
isDirectEndpointEditConflictError and isForeignKeyViolationError) are hardcoded
Chinese; replace them with i18n keys or remove the message and return only
errorCode so the frontend handles localization. Update the return objects in
functions/methods that use isDirectEndpointEditConflictError,
isForeignKeyViolationError (and other similar branches at the noted ranges) to
set either error: "i18n.provider.endpoint_conflict" / "i18n.provider.not_found"
(or another agreed key) or omit error and rely on ERROR_CODES.CONFLICT /
ERROR_CODES.NOT_FOUND being returned; ensure the same change is applied for the
other occurrences referenced (around lines 554-567, 749-754, 818-823) so no
hardcoded display strings remain.
```ts
const pendingTagFiltersKey = useMemo(
  () => pendingTagFilters.slice().sort().join("|"),
  [pendingTagFilters]
);
const pendingKeyGroupFiltersKey = useMemo(
  () => pendingKeyGroupFilters.slice().sort().join("|"),
  [pendingKeyGroupFilters]
);
const debouncedPendingTagsKey = useDebounce(pendingTagFiltersKey, 300);
const debouncedPendingKeyGroupsKey = useDebounce(pendingKeyGroupFiltersKey, 300);
```
String-key-based debouncing is a sound approach

Converting the arrays to a stable string via sort().join("|") before debouncing avoids spurious triggers from unstable array references — a good design. One caveat: if tag values may contain the "|" character, the sorted-and-joined keys can collide (e.g. ["a|b", "c"] and ["a", "b|c"] produce the same key). If tag values come from user input, either forbid that character at creation/validation time or switch to a safer separator (such as \0); a collision example follows.
🤖 Prompt for AI Agents
In `@src/app/[locale]/dashboard/users/users-page-client.tsx` around lines 80 - 89,
The current debouncing keys pendingTagFiltersKey and pendingKeyGroupFiltersKey
build strings via sort().join("|"), which can collide if tag values contain
"|"—update the key generation to use a collision-safe encoding (e.g., sort the
arrays and then use JSON.stringify on the sorted arrays, or join with a null
character "\0", or escape values with encodeURIComponent) before passing into
useDebounce; change the expressions that compute pendingTagFiltersKey and
pendingKeyGroupFiltersKey (which derive from pendingTagFilters and
pendingKeyGroupFilters) accordingly so useDebounce receives a stable,
unambiguous key.
```ts
// The drizzle-orm migrator compares only `created_at (folderMillis)` to decide whether to run a migration.
// If a historical journal's `when` was corrected (or was ever non-monotonic), old instances may permanently
// skip later migrations because their `created_at` is too large.
// Align and repair created_at by hash here so upgrades are seamless for users (pull the new Docker image and restart).
const migrations = readMigrationFiles({ migrationsFolder });
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check the drizzle-orm version
if [ -f "package.json" ]; then
echo "=== Package.json drizzle-orm version ==="
cat package.json | jq '.dependencies."drizzle-orm" // .devDependencies."drizzle-orm"' 2>/dev/null || grep -i "drizzle-orm" package.json
fi
# Check if node_modules exists and drizzle-orm is installed
if [ -d "node_modules/drizzle-orm" ]; then
echo -e "\n=== Drizzle-orm installed, checking migrator ==="
# Look for migrator files
find node_modules/drizzle-orm -name "*migrator*" -type f | head -20
fiRepository: ding113/claude-code-hub
Length of output: 115
🏁 Script executed:
#!/bin/bash
# Search for readMigrationFiles in drizzle-orm
echo "=== Searching for readMigrationFiles ==="
if [ -d "node_modules/drizzle-orm" ]; then
rg -l 'readMigrationFiles' node_modules/drizzle-orm/ 2>/dev/null | head -10
echo -e "\n=== Checking migrator.d.ts ==="
if [ -f "node_modules/drizzle-orm/migrator.d.ts" ]; then
cat node_modules/drizzle-orm/migrator.d.ts | head -100
fi
fiRepository: ding113/claude-code-hub
Length of output: 107
🏁 Script executed:
#!/bin/bash
# Check drizzle-orm package.json for version
echo "=== Drizzle-orm package version ==="
if [ -f "node_modules/drizzle-orm/package.json" ]; then
cat node_modules/drizzle-orm/package.json | jq '.version' 2>/dev/null || grep -i "\"version\"" node_modules/drizzle-orm/package.json
fiRepository: ding113/claude-code-hub
Length of output: 102
🏁 Script executed:
# Check the migrate.ts file to see how readMigrationFiles is used
cat -n src/lib/migrate.ts | head -100Repository: ding113/claude-code-hub
Length of output: 3689
🏁 Script executed:
# Check imports in the file
rg -n 'import.*readMigrationFiles|from.*migrator' src/Repository: ding113/claude-code-hub
Length of output: 227
🌐 Web query:
drizzle-orm 0.44.0 readMigrationFiles API migrator documentation
💡 Result:
readMigrationFiles isn’t documented as a first-class public API in the Drizzle docs; it’s an internal helper used by the driver-specific migrate() implementations.
Where it lives / how it’s used (v0.44.x)
- Driver migrators (example:
drizzle-orm/node-postgres/migrator) callreadMigrationFiles(config)and then delegate todb.dialect.migrate(...). This is shown in Drizzle’s GitHub discussion about programmatic migrations. [2] - Example implementation detail (from a published
drizzle-ormbuild): a migrator reads migrations viareadMigrationFiles(config), determines the migrations table name (default__drizzle_migrations, overridable viaconfig.migrationsTable), and applies any migrations newer than the last recordedcreated_at/timestamp. [3]
The “official” docs to reference
For migrations, the official documentation is the Drizzle Kit command docs (CLI), which describes the same underlying flow: read .sql files from the migrations folder, check the migrations log table (__drizzle_migrations), run pending migrations, and log them. [1]
Practical takeaway
- If you want a supported API surface, use the driver-specific
migrate(db, { migrationsFolder, ... })entrypoint (as in the GitHub discussion). [2] - If you still want
readMigrationFiles, import it fromdrizzle-orm/migrator(it’s what the migrators use), but treat it as internal/unstable (no dedicated docs page). [2]
Sources: Drizzle Kit migrate docs [1], Drizzle ORM discussion showing readMigrationFiles used under the hood [2], published build source map showing readMigrationFiles + migrations table behavior [3].
readMigrationFiles exists in drizzle-orm 0.44.x, but it is an internal, undocumented API.

Confirmed that readMigrationFiles is exported from drizzle-orm/migrator and returns objects carrying hash and folderMillis properties (verified at line 81).

Note, however, that per the official Drizzle documentation, readMigrationFiles is not a public API but an internal helper used by the driver-specific migrators; it is undocumented and may change in future versions. Prefer the officially recommended driver-specific migrate() function instead (e.g. migrate(db, config) from drizzle-orm/postgres-js/migrator); a sketch follows.
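The documented entrypoint the comment recommends looks like this (the postgres-js driver is shown as an assumption; swap the import path for the driver actually in use):

```ts
import postgres from "postgres";
import { drizzle } from "drizzle-orm/postgres-js";
import { migrate } from "drizzle-orm/postgres-js/migrator";

// A dedicated single connection is the usual recommendation for migrations.
const client = postgres(process.env.DATABASE_URL!, { max: 1 });
const db = drizzle(client);

await migrate(db, { migrationsFolder: "./drizzle" });
await client.end();
```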
🤖 Prompt for AI Agents
In `@src/lib/migrate.ts` at line 77, The code currently uses the internal helper
readMigrationFiles (referenced as readMigrationFiles) from drizzle-orm which is
undocumented and unstable; replace its usage with the documented driver-specific
migrate API (e.g., migrate from drizzle-orm/postgres-js/migrator) by wiring your
DB client into migrate(db, config) and using its returned results instead of
readMigrationFiles output (drop reliance on hash/folderMillis from
readMigrationFiles and map to the migrate result shape); update imports to use
the official migrator (migrate) and adjust any call sites in this module (e.g.,
where migrations is used) to the migrate() promise/result.
```ts
// Max idle DB polling interval (default 30s, bounded by base interval)
const DEFAULT_IDLE_DB_POLL_INTERVAL_MS = Math.min(BASE_INTERVAL_MS, 30_000);
const IDLE_DB_POLL_INTERVAL_MS = Math.max(
  1,
  parseIntWithDefault(
    process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS,
    DEFAULT_IDLE_DB_POLL_INTERVAL_MS
  )
```
The idle poll interval is not clamped to the upper bound stated in the comment.

Only the default value is capped; the environment variable can be set above BASE_INTERVAL_MS, contradicting the "bounded by base interval" comment. Clamp it explicitly.

Suggested fix
-const IDLE_DB_POLL_INTERVAL_MS = Math.max(
- 1,
- parseIntWithDefault(
- process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS,
- DEFAULT_IDLE_DB_POLL_INTERVAL_MS
- )
-);
+const IDLE_DB_POLL_INTERVAL_MS = Math.min(
+ BASE_INTERVAL_MS,
+ Math.max(
+ 1,
+ parseIntWithDefault(
+ process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS,
+ DEFAULT_IDLE_DB_POLL_INTERVAL_MS
+ )
+ )
+);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
// Max idle DB polling interval (default 30s, bounded by base interval)
const DEFAULT_IDLE_DB_POLL_INTERVAL_MS = Math.min(BASE_INTERVAL_MS, 30_000);
const IDLE_DB_POLL_INTERVAL_MS = Math.min(
  BASE_INTERVAL_MS,
  Math.max(
    1,
    parseIntWithDefault(
      process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS,
      DEFAULT_IDLE_DB_POLL_INTERVAL_MS
    )
  )
);
```
🤖 Prompt for AI Agents
In `@src/lib/provider-endpoints/probe-scheduler.ts` around lines 32 - 39, The code
only applies the BASE_INTERVAL_MS upper bound when using the default, but allows
the env var ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS to exceed BASE_INTERVAL_MS;
update the IDLE_DB_POLL_INTERVAL_MS computation to parse the env var via
parseIntWithDefault and then clamp the resulting value between 1 and
BASE_INTERVAL_MS (e.g., use Math.max(1, Math.min(parsedValue,
BASE_INTERVAL_MS))) so the effective poll interval respects the comment; update
references to DEFAULT_IDLE_DB_POLL_INTERVAL_MS, IDLE_DB_POLL_INTERVAL_MS,
BASE_INTERVAL_MS, parseIntWithDefault, and
process.env.ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS accordingly.
```ts
if (period === "custom" && dateRange) {
  // Custom date range: startDate <= date <= endDate
  return sql`(${messageRequest.createdAt} AT TIME ZONE ${timezone})::date >= ${dateRange.startDate}::date
    AND (${messageRequest.createdAt} AT TIME ZONE ${timezone})::date <= ${dateRange.endDate}::date`;
  // Custom date range: startDate <= local_date <= endDate
  const startLocal = sql`(${dateRange.startDate}::date)::timestamp`;
  const endExclusiveLocal = sql`(${dateRange.endDate}::date + INTERVAL '1 day')`;
  const start = sql`(${startLocal} AT TIME ZONE ${timezone})`;
  const endExclusive = sql`(${endExclusiveLocal} AT TIME ZONE ${timezone})`;
  return sql`${messageRequest.createdAt} >= ${start} AND ${messageRequest.createdAt} < ${endExclusive}`;
}
```
Custom date-range input is not format-validated; invalid startDate/endDate values cause PostgreSQL runtime errors.

The DateRangeParams interface specifies the YYYY-MM-DD format, but buildDateCondition does not validate the inputs before the SQL executes. If an invalid string (e.g. "not-a-date") is passed, the ::date cast throws a PostgreSQL error whose message may leak to the caller.

Add basic format validation before building the SQL, or validate centrally in the upstream caller (e.g. findCustomRangeLeaderboard).

🛡️ Suggested defensive validation
+const DATE_REGEX = /^\d{4}-\d{2}-\d{2}$/;
+
if (period === "custom" && dateRange) {
+ if (!DATE_REGEX.test(dateRange.startDate) || !DATE_REGEX.test(dateRange.endDate)) {
+ throw new Error("Invalid date format. Expected YYYY-MM-DD.");
+ }
const startLocal = sql`(${dateRange.startDate}::date)::timestamp`;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const DATE_REGEX = /^\d{4}-\d{2}-\d{2}$/;

if (period === "custom" && dateRange) {
  if (
    !DATE_REGEX.test(dateRange.startDate) ||
    !DATE_REGEX.test(dateRange.endDate)
  ) {
    throw new Error("Invalid date format. Expected YYYY-MM-DD.");
  }
  // Custom date range: startDate <= local_date <= endDate
  const startLocal = sql`(${dateRange.startDate}::date)::timestamp`;
  const endExclusiveLocal = sql`(${dateRange.endDate}::date + INTERVAL '1 day')`;
  const start = sql`(${startLocal} AT TIME ZONE ${timezone})`;
  const endExclusive = sql`(${endExclusiveLocal} AT TIME ZONE ${timezone})`;
  return sql`${messageRequest.createdAt} >= ${start} AND ${messageRequest.createdAt} < ${endExclusive}`;
}
🤖 Prompt for AI Agents
In `@src/repository/leaderboard.ts` around lines 163 - 170, buildDateCondition
currently interpolates dateRange.startDate/endDate directly into SQL which
allows invalid strings (e.g. "not-a-date") to reach PostgreSQL and cause runtime
errors; add defensive validation of DateRangeParams (ensure YYYY-MM-DD format
and valid calendar date) before building SQL (either inside buildDateCondition
or at the caller findCustomRangeLeaderboard) and reject/throw a clear validation
error for invalid inputs. Use a strict YYYY-MM-DD regex and/or Date parsing to
confirm the values are valid dates, reference the symbols dateRange.startDate,
dateRange.endDate, buildDateCondition, findCustomRangeLeaderboard,
messageRequest.createdAt and timezone when implementing the check, and only
construct the SQL when validation passes.
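To illustrate the validation the prompt describes, here is a minimal standalone checker: a strict YYYY-MM-DD regex plus a calendar round-trip to reject impossible dates. The helper name isValidDateString is hypothetical, not part of the PR:

```ts
const DATE_REGEX = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: strict format check plus calendar validity.
function isValidDateString(value: string): boolean {
  if (!DATE_REGEX.test(value)) return false;
  const [year, month, day] = value.split("-").map(Number);
  // Date.UTC silently normalizes overflow (2024-02-30 becomes March 1),
  // so round-trip the components to catch impossible dates.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```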
Code Review Summary
This is a well-structured performance-focused refactoring that addresses N+1 queries, adds strategic caching, and improves database index utilization. The PR demonstrates careful engineering with proper error handling, LRU eviction, and defensive coding patterns.
PR Size: XL
- Lines changed: ~26,000 (24,827 additions, 1,176 deletions)
- Files changed: 72
Split Suggestion: For future PRs of this size, consider splitting into:
- Database migrations and schema changes
- Repository layer optimizations (caching, batch queries)
- Frontend optimizations (useInViewOnce, virtualized components)
- Circuit breaker and endpoint lifecycle changes
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 1 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 1 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 1 | 0 |
Critical Issues (Must Fix)
None identified.
High Priority Issues (Should Fix)
1. Race condition in getProviderStatistics in-flight dedup
File: src/repository/provider.ts:1135-1224
The in-flight deduplication has a race condition. The providerStatisticsInFlight is assigned AFTER the promise starts executing (line 1224), leaving a window where concurrent callers can bypass the dedup check.
Current problematic flow:
// Line 1131-1133: Check if in-flight
if (providerStatisticsInFlight && providerStatisticsInFlight.timezone === timezone) {
return await providerStatisticsInFlight.promise;
}
// Line 1135-1222: Create promise (executes immediately)
const promise = (async () => { ... })();
// Line 1224: Register as in-flight (TOO LATE)
providerStatisticsInFlight = { timezone, promise };

Fix: Register the in-flight promise BEFORE starting async work:
if (providerStatisticsInFlight && providerStatisticsInFlight.timezone === timezone) {
return await providerStatisticsInFlight.promise;
}
// Create placeholder first
let resolve!: (value: ProviderStatisticsRow[]) => void;
let reject!: (reason: unknown) => void;
const promise = new Promise<ProviderStatisticsRow[]>((res, rej) => {
resolve = res;
reject = rej;
});
// Register immediately
providerStatisticsInFlight = { timezone, promise };
try {
const result = await (async () => { /* query logic */ })();
resolve(result);
return result;
} catch (e) {
reject(e);
throw e;
}

Medium Priority Issues
2. In-place array mutation in endpoint-selector.ts
File: src/lib/provider-endpoints/endpoint-selector.ts:17
The rankActiveProviderEndpoints function mutates the input array with .sort(). Callers may not expect their array to be modified.
Fix: Clone before sorting:
function rankActiveProviderEndpoints(endpoints: ProviderEndpoint[]): ProviderEndpoint[] {
if (endpoints.length <= 1) return endpoints;
return [...endpoints].sort((a, b) => { /* ... */ });
}

Review Coverage
- Logic and correctness - Race condition identified
- Security (OWASP Top 10) - Clean
- Error handling - Comprehensive with try/catch and logging
- Type safety - Clean with proper TypeScript usage
- Documentation accuracy - Comments match implementation
- Test coverage - Adequate unit tests added
- Code clarity - Generally good, minor mutation issue
Notable Positive Aspects
- Excellent database optimization: FILTER clauses, range-based date comparisons, new indexes
- Comprehensive caching: TTL + LRU eviction on all caches prevents memory leaks
- Multi-instance safety: Advisory locks for migrations and backfills
- DST-aware date handling: Proper timezone handling for statistics
- Batch API design: Efficient LATERAL joins for probe logs (see the sketch just after this review)
Automated review by Claude AI
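As a concrete illustration of the LATERAL-join shape praised above: a minimal per-endpoint top-N query. Table and column names here are assumptions, and the PR builds its query with Drizzle's sql template rather than a raw string:

```ts
// One index-backed scan per endpoint id instead of N separate queries.
// $1 is the int[] of endpoint ids, $2 the per-endpoint limit.
const batchProbeLogsSql = `
  SELECT e.endpoint_id, l.id, l.status, l.latency_ms, l.created_at
  FROM unnest($1::int[]) AS e(endpoint_id)
  CROSS JOIN LATERAL (
    SELECT id, status, latency_ms, created_at
    FROM provider_endpoint_probe_logs
    WHERE endpoint_id = e.endpoint_id
    ORDER BY created_at DESC
    LIMIT $2
  ) AS l
`;
```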
const delayMs = process.env.NODE_ENV === "test" ? 0 : 10;
this.flushTimer = setTimeout(() => {
  this.flushTimer = null;
  void this.flush().catch(() => {});
[HIGH] [ERROR-SWALLOWED] Swallowed flush() rejection hides failures
src/app/[locale]/settings/providers/_components/endpoint-latency-sparkline.tsx:375
Why this is a problem: void this.flush().catch(() => {}); discards any rejection from flush(). If flush() ever rejects (unexpected runtime bug, refactor regression), this fails silently and makes diagnosis much harder.
Suggested fix:
this.flushTimer = setTimeout(() => {
this.flushTimer = null;
void this.flush().catch((error) => {
console.error("[ProbeLogsBatcher] flush failed", error);
});
}, delayMs);

this.flushTimer = setTimeout(() => {
  this.flushTimer = null;
  void this.flush().catch(() => {});
[HIGH] [ERROR-SWALLOWED] Swallowed flush() rejection hides failures
src/app/[locale]/settings/providers/_components/provider-endpoint-hover.tsx:127
Why this is a problem: void this.flush().catch(() => {}); ignores any rejection from flush(). If flush() ever rejects (unexpected runtime bug, refactor regression), this fails silently and makes diagnosis much harder.
Suggested fix:
this.flushTimer = setTimeout(() => {
this.flushTimer = null;
void this.flush().catch((error) => {
console.error("[VendorTypeEndpointStatsBatcher] flush failed", error);
});
}, 0);

  }
}

export async function batchGetProviderEndpointProbeLogs(
[HIGH] [TEST-MISSING-CRITICAL] New batch action lacks unit tests
src/actions/provider-endpoints.ts:708
Why this is a problem: CLAUDE.md requires: "Test Coverage - All new features must have unit test coverage of at least 80%". batchGetProviderEndpointProbeLogs introduces new behavior (admin gating, zod validation, de-dupe + ordering, empty-input fast path, per-endpoint limit) but tests/unit/actions/provider-endpoints.test.ts does not cover it.
Suggested fix (add to tests/unit/actions/provider-endpoints.test.ts):
it("batchGetProviderEndpointProbeLogs: dedupes ids and returns per-endpoint logs", async () => {
getSessionMock.mockResolvedValue({ user: { role: "admin" } });
findProviderEndpointProbeLogsBatchMock.mockResolvedValue(new Map([[2, []]]) as any);
const { batchGetProviderEndpointProbeLogs } = await import("@/actions/provider-endpoints");
const res = await batchGetProviderEndpointProbeLogs({ endpointIds: [2, 2], limit: 12 });
expect(res).toEqual({ ok: true, data: [{ endpointId: 2, logs: [] }] });
expect(findProviderEndpointProbeLogsBatchMock).toHaveBeenCalledWith({
endpointIds: [2],
limitPerEndpoint: 12,
});
});

  }
}

export async function batchGetVendorTypeEndpointStats(input: unknown): Promise<
[HIGH] [TEST-MISSING-CRITICAL] New batch action lacks unit tests
src/actions/provider-endpoints.ts:758
Why this is a problem: CLAUDE.md requires: "Test Coverage - All new features must have unit test coverage of at least 80%". batchGetVendorTypeEndpointStats has non-trivial semantics (admin gating, zod validation, de-dupe, input-order preservation, defaulting missing vendor rows to zeros) but tests/unit/actions/provider-endpoints.test.ts does not cover it.
Suggested fix (add to tests/unit/actions/provider-endpoints.test.ts):
it("batchGetVendorTypeEndpointStats: dedupes ids and defaults missing vendor rows", async () => {
getSessionMock.mockResolvedValue({ user: { role: "admin" } });
findVendorTypeEndpointStatsBatchMock.mockResolvedValue([
{ vendorId: 2, total: 3, enabled: 1, healthy: 1, unhealthy: 0, unknown: 0 },
] as any);
const { batchGetVendorTypeEndpointStats } = await import("@/actions/provider-endpoints");
const res = await batchGetVendorTypeEndpointStats({
vendorIds: [2, 1, 2],
providerType: "claude",
});
expect(findVendorTypeEndpointStatsBatchMock).toHaveBeenCalledWith({
vendorIds: [2, 1],
providerType: "claude",
});
expect(res.ok).toBe(true);
expect(res.data).toEqual([
{ vendorId: 2, providerType: "claude", total: 3, enabled: 1, healthy: 1, unhealthy: 0, unknown: 0 },
{ vendorId: 1, providerType: "claude", total: 0, enabled: 0, healthy: 0, unhealthy: 0, unknown: 0 },
]);
});
Code Review Summary
This is an XL PR with broad changes across provider management UI, new admin batch actions, repository query paths, Redis/circuit state syncing, and DB migrations. The scope makes regression risk and root-cause attribution high; splitting into smaller PRs would make review and rollback safer.
PR Size: XL
- Lines changed: 26003
- Files changed: 72
- Split suggestions:
  - PR 1: DB migrations + src/lib/migrate.ts locking/repair logic
  - PR 2: Endpoint circuit breaker Redis batch-sync + selector/statistics repository changes
  - PR 3: Providers UI batching (provider-endpoint-hover, endpoint-latency-sparkline) + related actions/routes
  - PR 4: My Usage/statistics query optimizations + tests
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 2 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 2 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Critical Issues (Must Fix)
None.
High Priority Issues (Should Fix)
- [HIGH] [ERROR-SWALLOWED] src/app/[locale]/settings/providers/_components/endpoint-latency-sparkline.tsx:375 swallows flush() rejection via .catch(() => {}).
- [HIGH] [ERROR-SWALLOWED] src/app/[locale]/settings/providers/_components/provider-endpoint-hover.tsx:127 swallows flush() rejection via .catch(() => {}).
- [HIGH] [TEST-MISSING-CRITICAL] src/actions/provider-endpoints.ts:708 adds batchGetProviderEndpointProbeLogs without unit coverage.
- [HIGH] [TEST-MISSING-CRITICAL] src/actions/provider-endpoints.ts:758 adds batchGetVendorTypeEndpointStats without unit coverage.
Review Coverage
- Logic and correctness
- Security (OWASP Top 10)
- Error handling
- Type safety
- Documentation accuracy
- Test coverage
- Code clarity
Automated review by Codex AI
- Identified open PR: #789 (refactor(provider): improve provider page performance)
- Applied size label: size/XL (additions 24827, deletions 1176, files 72)
- Posted 4 inline review comments on diff lines:
  - src/app/[locale]/settings/providers/_components/endpoint-latency-sparkline.tsx:375 (.catch(() => {}) swallowing flush() rejection)
  - src/app/[locale]/settings/providers/_components/provider-endpoint-hover.tsx:127 (.catch(() => {}) swallowing flush() rejection)
  - src/actions/provider-endpoints.ts:708 (missing unit tests for batchGetProviderEndpointProbeLogs)
  - src/actions/provider-endpoints.ts:758 (missing unit tests for batchGetVendorTypeEndpointStats)
- Submitted the required PR review summary via gh pr review --comment
* fix: batch endpoint stats and probe logs on the Providers admin page
* perf: optimize provider statistics and my-usage query performance
* perf: remove the refresh amplifier on the Providers admin page and load endpoint sections on demand
* fix: follow up on review — harden Providers batching and statistics
* fix: follow up on CodeRabbit — fix in-view and probe-data validation
* perf: stabilize in-view handling and recover batch requests from 404s
* perf: fewer DB round trips for my-usage quota/summary
* perf(providers): batch circuit-breaker reads and index migrations on the endpoint-pool hot path (#779) — runtime endpoint selection and strict audit statistics now read endpoint circuit state in batch, cutting Redis round trips; probe writes are silently ignored when an endpoint is deleted concurrently, so FK failures no longer abort the task; new index migrations: idx_provider_endpoints_pick_enabled / idx_providers_vendor_type_url_active; the repository batch-query module is now server-only to avoid accidental exposure as a Server Action
* fix: follow up on review — dedupe circuit reset and scanEnd (#779)
* fix: precise circuit reset + repo uses server-only (#779)
* fix: my-usage adds sessionId/warmup filtering (#779)
* perf: more robust in-flight dedup for provider statistics (#779)
* fix: ProviderForm invalidates related caches consistently (#779)
* fix: Providers/Usage detail fixes plus added test cases (#779)
* style: complete biome formatting (#779)
* fix(#779): circuit-state sync and probeLogs batch-query improvements
* fix(#781): clean up orphan endpoints and fix Endpoint Health
* perf: optimize usage logs and endpoint sync (#779/#781)
* refactor: remove redundant endpoint filtering (#779)
* fix: batch circuit-state query covers enabled endpoints only (#779)
* fix: provider statistics tolerate dirty data; stable probe-log ordering (#779)
* perf: disable window-focus auto-refresh for heavy Providers queries (#779)
* fix: periodic multi-instance circuit-state sync; fix soft-deleted endpoints left behind by backfill (#779/#781)
* perf: probe scheduler probes only endpoints of enabled providers (#781)
* perf: ProviderForm avoids duplicate refetches and stabilizes the hover circuit key (#779)
* perf: global QueryClient policy plus usage/user index optimizations (#779)
* perf: timezone statistics index hits and batch-delete optimization (#779)
* perf: reduce wasted recomputation on the logs/users pages
* fix(provider): endpoint pool derives only from enabled providers — sync/backfill/delete consider only is_enabled=true providers for reference checks and backfill, so a disabled provider cannot resurrect old endpoints; updateProvider ensures endpoints exist when a provider flips from disabled to enabled; Dashboard Endpoint Health avoids concurrent refreshes overriding user switches, and vendor/type is derived from enabled providers only; the batch probe-logs API no longer disables batching globally on partial 404s during rolling deploys; endpoint-selector unit tests updated to match the findEnabled* semantics
* perf: lightweight Dashboard vendor/type queries and parallel usage-logs queries
* fix(migrate): serialize migrations behind an advisory lock and drop emoji logs
* fix: endpoint-hover fallback and normalized batch probe-logs SQL
* perf(settings/providers): reduce redundant refreshes and reuse endpoint/circuit caches
* perf(probe/statistics): fix probe locking/counting and tighten statistics and usage scans
* perf(probe/ui): optimize the probe target-filter SQL and reduce sparkline flicker
* fix(db): repair the Drizzle snapshot chain
* fix(perf): harden Providers batching and cache consistency — provider statistics: remove an implicit cross join and tighten in-flight cleanup; deleteProvidersBatch makes fewer in-transaction round trips; Providers hover: micro-batches isolated per QueryClient with AbortSignal support, reducing cross-talk and potential leaks; probe/circuit/cache: the probe target query becomes a join, Redis sync also updates the counter fields, and the statistics cache keeps FIFO semantics; My Usage: userBreakdown gains the 5m/1h cache aggregate columns (not yet shown in the UI)
* chore: format code (issue-779-provider-performance-23b338e)
* chore: re-trigger CI
* fix(provider): backfill the endpoint pool on batch enable — batchUpdateProviders goes through updateProvidersBatch; when providers are batch-enabled from disabled, missing provider_endpoints rows are inserted best-effort, so the strict endpoint policy cannot block a freshly enabled provider left without endpoints by history or races
* fix(perf): rein in Providers refresh amplification and optimize probing/pagination
* perf: rein in availability/probe polling and optimize my-usage (#779/#781) — AvailabilityDashboard: suppress overlapping/out-of-order refreshes and throttle forced refreshes on foreground/background switches; probe scheduler/cleanup: idle DB poll plus lock renewal, cutting pointless scans and concurrent cleanup; endpoint circuit: Redis sync throttled to 1s; My Usage: key/user breakdown merged into a single aggregation; DB: new message_request key+model/endpoint partial-index migration; fix journal monotonicity validation and self-healing of the migration table's created_at
* fix(ui): restore the global react-query defaults
* fix(availability): clear the stale endpoint selection when refreshing vendors
* perf: harden Providers probing and Usage Logs performance
* perf(ui): useInViewOnce shares IntersectionObservers to lower resource usage — observers are pooled by (root+options), reducing observer instances under long lists/large tables; unit tests added (test-env passthrough plus shared/release semantics)
* perf: providers batch-where optimization and sparkline degraded-concurrency fix
* perf: my-usage breakdown gains cache fields and optimized filter caching
* perf: optimize endpoint-circuit Redis load and probe candidates
* fix(#781): Endpoint Health shows only endpoints referenced by enabled providers
* fix: correct endpoint-health filtering and harden URL parsing
* docs(provider-endpoints): document the keepPreviousWhenReferenced semantics
* perf(availability): throttle EndpointTab refreshes on foreground/background switches
* docs(availability): add a comment on EndpointTab refresh throttling
* chore(review): add comments and tighten details per the AI review
* fix: correct the DST day boundary in the provider statistics SQL
---------
Co-authored-by: tesgth032 <tesgth032@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…t issues
Merge 6 index migrations (0068-0073) into single idempotent migration. Extract reusable utilities from duplicated code across the codebase:
- TTLMap<K,V>: generic LRU+TTL cache replacing 3 inline implementations
- createAbortError: shared abort error factory from 2 components
- startLeaderLockKeepAlive: shared leader lock renewal from 2 schedulers
- ProbeLogsBatcher: data-fetching infra extracted from sparkline component
- buildUsageLogConditions: shared SQL filter builder from 3 query functions
Additional cleanup:
- Simplify useInViewOnce hook (remove unused options, keep shared observer pool)
- Remove dead code (sumKeyTotalCostById, unexport internal types)
- Hardcode env var defaults (ENDPOINT_CIRCUIT_HEALTH_CACHE_MAX_SIZE, ENDPOINT_PROBE_IDLE_DB_POLL_INTERVAL_MS)
- Fix in-flight dedup race condition in getProviderStatistics
- Fix yesterday/today interval boundary inconsistency (lte -> lt)
- Add NaN guard for limitPerEndpoint in batch probe logs
- Add updatedAt to deleteProvider for audit consistency
- Log swallowed flush() errors in batchers instead of silently catching
54afd85 to c2c6757 — Compare
🧪 Test Results
Overall: ✅ All tests passed
loadedFromRedisAt.set(endpointId, now);

try {
  const redisState = await loadEndpointCircuitState(endpointId);
Timestamp updated before Redis fetch completes. If Redis call fails or is slow, concurrent calls arriving within the TTL window will skip Redis sync and potentially use stale memory state. Consider setting the timestamp after successful Redis load (similar to the in-flight dedup pattern in provider.ts:1224-1225).
Prompt To Fix With AI
Path: src/lib/endpoint-circuit-breaker.ts
Line: 97:100

if (this.store.size >= this.maxSize) {
  this.evict();
}
Size check uses >= but happens after delete on line 28. For updating existing keys, the delete reduces size before the check, so eviction won't trigger when exactly at capacity (e.g., size=100, delete makes 99, check passes, insert makes 100 again). This is correct behavior but could add a clarifying comment.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
Path: src/lib/cache/ttl-map.ts
Line: 30:32
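For illustration, a minimal self-contained sketch of the delete-before-check pattern the comment above describes. Field names (store, maxSize, ttlMs) and the TinyTTLMap name are assumptions, not the PR's exact TTLMap:

```ts
class TinyTTLMap<K, V> {
  private store = new Map<K, { value: V; expiresAt: number }>();
  constructor(
    private maxSize: number,
    private ttlMs: number
  ) {}

  set(key: K, value: V): void {
    // Deleting first means an update frees its own slot, so the >= check
    // below only evicts when a genuinely new key would overflow capacity.
    // It also moves the key to the end of Map iteration order (LRU refresh).
    this.store.delete(key);
    if (this.store.size >= this.maxSize) {
      // Evict the oldest entry (first key in Map insertion order).
      const oldest = this.store.keys().next();
      if (!oldest.done) this.store.delete(oldest.value);
    }
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  get(key: K): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= Date.now()) {
      this.store.delete(key); // lazy expiry
      return undefined;
    }
    return entry.value;
  }
}
```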
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/repository/leaderboard.ts (1)
495-495: ⚠️ Potential issue | 🟡 Minor
cacheCreationCost does not mean what it computes. For requests with cacheCreationInputTokens > 0, the expression sums the request's entire costUsd, not just the cache-creation share. The name cacheCreationCost invites callers to read it as the dedicated cost of cache creation. If the precise cache-creation cost cannot be split out, rename the field to something more accurate (e.g. costOfCacheCreatingRequests) or document its meaning in the interface comment.
src/app/[locale]/dashboard/users/users-page-client.tsx (1)
182-190: ⚠️ Potential issue | 🟡 Minor
A hardcoded English error message violates the i18n rules.
"Failed to fetch settings" is a hardcoded English string. The component does not display it directly today, but if an Error Boundary exists upstream, or a later refactor surfaces it to users, non-English users will see untranslated text. Use an i18n key or a shared error message:
♻️ Suggested change

- if (!response.ok) throw new Error("Failed to fetch settings");
+ if (!response.ok) throw new Error(tCommon("error"));

As per coding guidelines, **/*.{ts,tsx,js,jsx}: "All user-facing strings must use i18n (5 languages supported: zh-CN, zh-TW, en, ja, ru). Never hardcode display text".
src/lib/endpoint-circuit-breaker.ts (1)
84-141: ⚠️ Potential issue | 🟠 Major
Avoid the Redis sync race and make the cache size cap effective.
getOrCreateHealth writes loadedFromRedisAt before the Redis read and never registers redisSyncInFlight, so concurrent calls can skip the sync while the Redis result is still pending and misread the state as closed within the TTL window; the early returns in the Redis branch also bypass LRU/capacity trimming. Wrap the Redis read in an in-flight promise, set loadedFromRedisAt only after the read completes, and run the trim once at the end of the function.
Suggested fix

  async function getOrCreateHealth(endpointId: number): Promise<EndpointHealth> {
    const inFlight = redisSyncInFlight.get(endpointId);
    if (inFlight) {
      await inFlight;
    }

    let health = healthMap.get(endpointId);
    const loadedAt = loadedFromRedisAt.get(endpointId);
    const now = Date.now();
    const needsRedisCheck =
      loadedAt === undefined || (loadedAt !== undefined && now - loadedAt > REDIS_SYNC_TTL_MS);

+   let resolved: EndpointHealth | undefined;

    if (needsRedisCheck) {
-     loadedFromRedisAt.set(endpointId, now);
-
-     try {
-       const redisState = await loadEndpointCircuitState(endpointId);
-       if (redisState) {
-         // When syncing from Redis into memory, do not update only when circuitState changes:
-         // failureCount / halfOpenSuccessCount may also have changed on other instances.
-         if (health) {
-           health.failureCount = redisState.failureCount;
-           health.lastFailureTime = redisState.lastFailureTime;
-           health.circuitState = redisState.circuitState;
-           health.circuitOpenUntil = redisState.circuitOpenUntil;
-           health.halfOpenSuccessCount = redisState.halfOpenSuccessCount;
-           return health;
-         }
-
-         health = {
-           failureCount: redisState.failureCount,
-           lastFailureTime: redisState.lastFailureTime,
-           circuitState: redisState.circuitState,
-           circuitOpenUntil: redisState.circuitOpenUntil,
-           halfOpenSuccessCount: redisState.halfOpenSuccessCount,
-         };
-         healthMap.set(endpointId, health);
-         return health;
-       }
-
-       if (health && health.circuitState !== "closed") {
-         health.circuitState = "closed";
-         health.failureCount = 0;
-         health.lastFailureTime = null;
-         health.circuitOpenUntil = null;
-         health.halfOpenSuccessCount = 0;
-       }
-     } catch (error) {
-       logger.warn("[EndpointCircuitBreaker] Failed to sync state from Redis", {
-         endpointId,
-         error: error instanceof Error ? error.message : String(error),
-       });
-     }
+     const syncPromise = (async () => {
+       try {
+         const redisState = await loadEndpointCircuitState(endpointId);
+         loadedFromRedisAt.set(endpointId, Date.now());
+
+         if (redisState) {
+           const current = healthMap.get(endpointId);
+           if (current) {
+             current.failureCount = redisState.failureCount;
+             current.lastFailureTime = redisState.lastFailureTime;
+             current.circuitState = redisState.circuitState;
+             current.circuitOpenUntil = redisState.circuitOpenUntil;
+             current.halfOpenSuccessCount = redisState.halfOpenSuccessCount;
+             return current;
+           }
+
+           const nextHealth: EndpointHealth = {
+             failureCount: redisState.failureCount,
+             lastFailureTime: redisState.lastFailureTime,
+             circuitState: redisState.circuitState,
+             circuitOpenUntil: redisState.circuitOpenUntil,
+             halfOpenSuccessCount: redisState.halfOpenSuccessCount,
+           };
+           healthMap.set(endpointId, nextHealth);
+           return nextHealth;
+         }
+
+         const current = healthMap.get(endpointId);
+         if (current && current.circuitState !== "closed") {
+           current.circuitState = "closed";
+           current.failureCount = 0;
+           current.lastFailureTime = null;
+           current.circuitOpenUntil = null;
+           current.halfOpenSuccessCount = 0;
+         }
+         return current;
+       } catch (error) {
+         logger.warn("[EndpointCircuitBreaker] Failed to sync state from Redis", {
+           endpointId,
+           error: error instanceof Error ? error.message : String(error),
+         });
+         return healthMap.get(endpointId);
+       }
+     })();
+
+     redisSyncInFlight.set(endpointId, syncPromise);
+     try {
+       resolved = await syncPromise;
+     } finally {
+       if (redisSyncInFlight.get(endpointId) === syncPromise) {
+         redisSyncInFlight.delete(endpointId);
+       }
+     }
    }

-   const result = getOrCreateHealthSync(endpointId);
+   const result = resolved ?? getOrCreateHealthSync(endpointId);
    enforceEndpointHealthCacheMaxSize();
    return result;
  }

src/app/[locale]/dashboard/availability/_components/endpoint-probe-history.tsx (1)
89-111: ⚠️ Potential issue | 🟡 Minor
The fetch response in fetchLogs is never checked for HTTP status. Lines 101-102 call res.json() without checking res.ok. When the API returns 4xx/5xx, res.json() may parse an unexpected shape (or throw), leaving data.logs undefined and silently clearing the logs. Suggestion: validate the status code before parsing the JSON.

  const res = await fetch(`/api/availability/endpoints/probe-logs?${params.toString()}`);
+ if (!res.ok) {
+   console.error("Failed to fetch logs", res.status, res.statusText);
+   return;
+ }
  const data = await res.json();
🤖 Fix all issues with AI agents
In `@src/app/`[locale]/dashboard/availability/_components/endpoint/endpoint-tab.tsx:
- Around line 61-137: The finally block can leave loadingVendors stuck true when
a non-silent request A is in-flight, a silent request B increments
vendorsRequestIdRef and cancels A, and neither finally clears loading; to fix,
remember whether this invocation actually set loading (e.g., const didSetLoading
= !options?.silent; call setLoadingVendors(true) only when didSetLoading), and
in finally clear loading when either didSetLoading is true or this request is
still the current one (if (didSetLoading || requestId ===
vendorsRequestIdRef.current) setLoadingVendors(false)); update refreshVendors to
use didSetLoading and reference vendorsRequestIdRef, setLoadingVendors, and
options.silent accordingly.
In `@src/app/`[locale]/settings/providers/_components/provider-endpoint-hover.tsx:
- Around line 432-435: The loading message for the endpoints list reuses
t("keyLoading") which is semantically "key loading"; update the endpointsLoading
branch in provider-endpoint-hover.tsx to use a dedicated translation key (e.g.
t("endpointStatus.loading") or a generic t("loading")) instead of "keyLoading",
and add the corresponding entries to your i18n resource files so the new key is
localized; ensure the JSX still renders the same container (the div with
className "px-3 py-4 text-center text-xs text-muted-foreground") but with the
new translation key.
In `@src/lib/migrate.ts`:
- Around line 39-52: The finally block can let client.end() mask an earlier
error from fn(); wrap the await client.end() call in its own try-catch so that
any error from client.end() is logged (including error details) but does not
override the original exception thrown by fn(); specifically modify the finally
in migrate.ts to catch errors from await client.end() (reference client.end(),
fn(), acquired, lockName) and log them via logger.error while allowing the
original error to propagate.
🧹 Nitpick comments (25)
scripts/validate-migrations.js (1)
163-173: The typeof check on entry?.when could also verify that idx is contiguous.
Only the monotonicity of when is validated today; if idx jumps or repeats (say, a typo from hand-editing), migrations can break just the same. Consider additionally asserting idx === expectedIdx to complement the monotonicity check.
Optional: add an idx continuity check

  const issues = [];
  let previousWhen = Number.NEGATIVE_INFINITY;
  let previousTag = "";
+ let expectedIdx = 0;
  for (const entry of journal.entries) {
    const tag = typeof entry?.tag === "string" ? entry.tag : "(unknown)";
    const when = entry?.when;
+
+   if (entry?.idx !== expectedIdx) {
+     issues.push({
+       type: "JOURNAL",
+       line: 0,
+       statement: `Non-sequential idx: expected ${expectedIdx}, got ${entry?.idx} for tag=${tag}`,
+       suggestion: "Ensure journal entries have sequential idx values starting from 0.",
+     });
+   }
+   expectedIdx++;
+
    if (typeof when !== "number" || !Number.isFinite(when)) {

src/app/[locale]/settings/providers/_components/provider-rich-list-item.tsx (3)
713-722: The desktop vendor conditional rendering is correct, but mobile has no counterpart.
Desktop adds the vendor branch (displayName / websiteDomain + ProviderEndpointHover) with a reasonable fallback to provider.url, but mobile (lines 442-565) shows no vendor information at all. If that is intentional (a leaner mobile view), add a short comment saying so; otherwise add the vendor display on mobile for consistency.

656-665: When vendor exists, consider preferring vendor.faviconUrl.
The favicon is always read from provider.faviconUrl, but now that the vendor prop is in and the ProviderVendor type also carries a faviconUrl field, prefer vendor.faviconUrl when the vendor exists, so the redundant field on provider cannot drift out of sync with vendor data.
Suggested change

- {provider.faviconUrl && (
+ {(vendor?.faviconUrl ?? provider.faviconUrl) && (
    <img
-     src={provider.faviconUrl}
+     src={vendor?.faviconUrl ?? provider.faviconUrl}
      alt=""
      className="h-4 w-4 flex-shrink-0"
      onError={(e) => {
        (e.target as HTMLImageElement).style.display = "none";
      }}
    />
  )}

209-215: The set of invalidated query keys could be managed in one place.
The component calls queryClient.invalidateQueries in several places with different key combinations (deletion also invalidates ["provider-vendors"], while edit/toggle only invalidate ["providers"] and ["providers-health"]). Extract the common invalidation key sets into a constant or helper to reduce the risk of omissions and improve maintainability. Example:

// Can live inside the component or be shared
const invalidateProviderQueries = (queryClient: QueryClient, includeVendors = false) => {
  queryClient.invalidateQueries({ queryKey: ["providers"] });
  queryClient.invalidateQueries({ queryKey: ["providers-health"] });
  if (includeVendors) {
    queryClient.invalidateQueries({ queryKey: ["provider-vendors"] });
  }
};

src/app/[locale]/dashboard/users/users-page-client.tsx (1)
44-46: UsersPageClient is now a pure pass-through wrapper and could be simplified.
With QueryClientProvider removed, UsersPageClient just forwards props unchanged to UsersPageContent. Export UsersPageContent directly, renamed to UsersPageClient, and drop the unnecessary component layer and call-stack depth.
♻️ Suggested simplification

-export function UsersPageClient(props: UsersPageClientProps) {
-  return <UsersPageContent {...props} />;
-}
-
-function UsersPageContent({ currentUser }: UsersPageClientProps) {
+export function UsersPageClient({ currentUser }: UsersPageClientProps) {

src/repository/overview.ts (1)
97-104: The timezone boundary logic duplicates getOverviewMetrics; extract a shared helper.
The SQL fragments for nowLocal/todayStartLocal/todayStart/tomorrowStart are built identically in both functions (lines 45-48 vs 97-100). Extract an internal helper (e.g. buildDayBoundaries(timezone: string)) returning these fragments, cutting the duplication and keeping the two call sites in sync forever.
Example refactor

function buildDayBoundaries(timezone: string) {
  const nowLocal = sql`CURRENT_TIMESTAMP AT TIME ZONE ${timezone}`;
  const todayStartLocal = sql`DATE_TRUNC('day', ${nowLocal})`;
  const todayStart = sql`(${todayStartLocal} AT TIME ZONE ${timezone})`;
  const tomorrowStart = sql`((${todayStartLocal} + INTERVAL '1 day') AT TIME ZONE ${timezone})`;
  return { nowLocal, todayStartLocal, todayStart, tomorrowStart };
}

getOverviewMetricsWithComparison can then keep deriving yesterdayStart/yesterdayEnd from the returned fragments.
src/app/[locale]/settings/providers/_components/provider-endpoint-hover.tsx (3)
357-364: endpointIds is sorted twice, in the queryKey and again in the queryFn.
endpointIdsKey (lines 360-363) already sorts the IDs to build the cache key, and the queryFn (line 372) repeats the same sort. Hoist the sorted array into a shared variable, or reuse the parsed endpointIdsKey inside the queryFn. Low impact; just redundant work.

386-388: circuitState is narrowed with an unchecked type assertion.
item.circuitState as EndpointCircuitState asserts the server's return type outright. If the backend ever returns an unexpected value (null, or a newly added state), the UI has no defense. Add a simple runtime check or fallback.
Suggested change

  for (const item of results.flat()) {
-   map[item.endpointId] = item.circuitState as EndpointCircuitState;
+   const state = item.circuitState;
+   if (state === "closed" || state === "open" || state === "half-open") {
+     map[item.endpointId] = state;
+   }
  }

242-289: The Chinese comments in the fallback path should be unified to English.
The comment at line 243 is in Chinese. The coding guidelines only require i18n for user-visible strings, but keeping code comments in one language, consistent with the rest of the project, helps collaboration.
src/lib/abort-utils.ts (1)
1-9: The falsy check on signal.reason can miss legitimate falsy reason values.
signal.reason from AbortController.abort(reason) can be any value. The current if (signal.reason) skips reasons of 0, "" or false; !== undefined is more precise. Tiny real-world impact; just a heads-up.
Suggested change

export function createAbortError(signal?: AbortSignal): unknown {
  if (!signal) return new Error("Aborted");
- if (signal.reason) return signal.reason;
+ if (signal.reason !== undefined) return signal.reason;
  try {
    return new DOMException("Aborted", "AbortError");
  } catch {
    return new Error("Aborted");
  }
}

src/lib/provider-endpoints/leader-lock.ts (1)
143-196: startLeaderLockKeepAlive is sound overall; one small point about repeated onLost calls deserves attention.
When getLock() returns undefined (line 164), the code calls stop() + opts.onLost() without calling opts.clearLock(), which differs from the renewal-failure path (line 174 calls clearLock() before onLost()). Since the lock no longer exists when getLock() returns undefined, skipping clearLock() is defensible, but confirm that callers' onLost callbacks are idempotent, in case extreme timing (e.g. a tick right after clearLock) triggers them twice in a row.
src/lib/cache/ttl-map.ts (1)
53-55: The size property includes expired-but-uncollected entries, which can mislead external callers.
Because expiry is lazy, size returns the underlying Map's actual size, possibly including entries that have expired but have not yet been removed by get/has/evict. Callers relying on size for exact capacity decisions may over-count. For a pure cache this is an acceptable trade-off, but add a one-line comment on the class or the size accessor documenting the behavior.
src/drizzle/schema.ts (1)
323-329: The IS NOT NULL check in providersEnabledVendorTypeIdx's WHERE condition is redundant.
The providerVendorId column is declared .notNull() in the schema (lines 160-161), so ${table.providerVendorId} IS NOT NULL is always true and > 0 already excludes NULL. No effect on correctness or performance, but the WHERE expression can be trimmed.
src/lib/provider-endpoints/probe-logs-batcher.ts (2)
222-250: The concurrency-control pattern is correct but not obvious; a comment on its thread-safety would help.
idx++ is shared across multiple async workers; thanks to the single-threaded JS model, the synchronous increment between awaits is safe. The pattern easily reads like a race condition, though, so add a short explanatory comment.

354-384: The outer catch in flush may call reject on requests that already resolved.
Each async function inside Promise.all has its own try/catch, so the outer catch normally never fires. If it does, it walks every group in snapshot (including already-resolved requests) and calls req.reject. The settled closure in load guarantees a promise is never settled twice, so the behavior is safe, but the reasoning is non-obvious — add a comment at the outer catch documenting this guarantee.
src/lib/hooks/use-in-view-once.ts (1)
12-22: getObserverOptionsKey does not include root in the cache key.
getSharedObserver is module-private today, and useInViewOnce always uses DEFAULT_OPTIONS (root is undefined/viewport), so this is currently safe. But if it is ever extended to support custom root elements, different roots would wrongly share one IntersectionObserver and callbacks would not fire. Consider a one-line comment at the top of the function flagging this limitation.
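If custom-root support is ever added, one possible shape for folding the root into the pool. This is an illustrative sketch only; every name below is hypothetical, not the hook's actual code:

```ts
type InViewCallback = (entry: IntersectionObserverEntry) => void;

// One callback per observed element; shared observers dispatch through it.
const elementCallbacks = new WeakMap<Element, InViewCallback>();
// Pools: one Map for the viewport (root: null), one per custom root.
const viewportPool = new Map<string, IntersectionObserver>();
const rootPools = new WeakMap<Element | Document, Map<string, IntersectionObserver>>();

function optionsKey(options: IntersectionObserverInit): string {
  return `${options.rootMargin ?? "0px"}|${String(options.threshold ?? 0)}`;
}

function observeOnce(
  el: Element,
  cb: InViewCallback,
  options: IntersectionObserverInit = {}
): () => void {
  let pool = viewportPool;
  if (options.root) {
    // A WeakMap lets a root's pool be collected together with the root element.
    pool = rootPools.get(options.root) ?? new Map();
    rootPools.set(options.root, pool);
  }
  const key = optionsKey(options);
  let observer = pool.get(key);
  if (!observer) {
    observer = new IntersectionObserver((entries) => {
      for (const entry of entries) elementCallbacks.get(entry.target)?.(entry);
    }, options);
    pool.set(key, observer);
  }
  const obs = observer;
  elementCallbacks.set(el, cb);
  obs.observe(el);
  return () => {
    elementCallbacks.delete(el);
    obs.unobserve(el);
  };
}
```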
src/repository/statistics.ts (1)
1003-1016: QuotaCostSummary is not exported, while QuotaCostRanges is.
QuotaCostSummary is the return type of sumUserQuotaCosts and sumKeyQuotaCostsById, so external callers cannot name it when declaring variables. Export it as well.
Suggested change

-interface QuotaCostSummary {
+export interface QuotaCostSummary {

src/instrumentation.ts (1)
402-442: The dev-mode backfill duplicates roughly 30 lines of the production path.
The production (lines 291-321) and development (lines 414-442) backfill logic is nearly identical; the only difference is the advisory-lock wrapper. Extract a shared runBackfills() and call it via withAdvisoryLock(lockName, runBackfills, { skipIfLocked: true }) in production and directly in development.
src/actions/my-usage.ts (2)
381-394: Manually accumulating breakdown rows in getMyTodayStats works, but inputTokens/outputTokens lack the Number.isFinite guard.
Line 389 validates costUsd with Number.isFinite, but inputTokens and outputTokens come from the same sql<number> annotations — Drizzle's mapping of double precision can, in extreme cases (NaN/Infinity), yield non-finite values. For consistency, apply the same defensive check to the token fields, or at least confirm that the upstream SQL's COALESCE rules out non-finite values.
Optional: isFinite guard for the token fields too

- totalCalls += row.calls ?? 0;
- totalInputTokens += row.inputTokens ?? 0;
- totalOutputTokens += row.outputTokens ?? 0;
- totalCostUsd += costUsd;
+ totalCalls += row.calls ?? 0;
+ const safeInput = Number.isFinite(row.inputTokens) ? row.inputTokens : 0;
+ const safeOutput = Number.isFinite(row.outputTokens) ? row.outputTokens : 0;
+ totalInputTokens += safeInput;
+ totalOutputTokens += safeOutput;
+ totalCostUsd += costUsd;

218-222: Dynamic import() is used for the time-utils and statistics modules — confirm this is intentional.
getMyQuota uses dynamic import() for @/lib/rate-limit/time-utils and @/repository/statistics, adding module-resolution overhead on each call (though Node caches modules). If this avoids a circular dependency or trims cold-start loading, keep it; otherwise a top-level static import is clearer and friendlier to tree-shaking.
src/app/[locale]/dashboard/availability/_components/endpoint-probe-history.tsx (1)
54-64: On the initial vendors load, a failed request only hits console.error and gives the user no feedback.
A toast, or surfacing the error state in the UI, would help; given this is an admin page and an initialization load, the current handling is also acceptable.
src/repository/usage-logs.ts (4)
330-364: findUsageLogsForKeySlim: the cache-hit and cache-miss paths duplicate the map logic.
Line 339 and lines 358-360 apply the same costUsd?.toString() ?? null mapping to pageRows. If the mapping ever changes (say, another field conversion), both sites must change in lockstep — easy to miss.
Suggestion: extract a shared mapping function

+ const mapSlimRow = (row: typeof pageRows[number]): UsageLogSlimRow => ({
+   ...row,
+   costUsd: row.costUsd?.toString() ?? null,
+ });
+
  const cachedTotal = usageLogSlimTotalCache.get(totalCacheKey);
  if (cachedTotal !== undefined) {
    total = Math.max(cachedTotal, total);
    return {
-     logs: pageRows.map((row) => ({ ...row, costUsd: row.costUsd?.toString() ?? null })),
+     logs: pageRows.map(mapSlimRow),
      total,
    };
  }

  // ... COUNT logic ...

- const logs: UsageLogSlimRow[] = pageRows.map((row) => ({
-   ...row,
-   costUsd: row.costUsd?.toString() ?? null,
- }));
+ const logs: UsageLogSlimRow[] = pageRows.map(mapSlimRow);
  usageLogSlimTotalCache.set(totalCacheKey, total);
  return { logs, total };

464-498: findUsageLogsWithDetails: the two summary-query branches select exactly the same columns, differing only in from/join; extract a shared select definition.
Lines 467-479 and 483-494 duplicate roughly 30 lines of SQL template. Adding a statistics field later (like the newly added 5m/1h tokens) means changing both.
Suggestion: extract a shared select object

+ const summarySelectFields = {
+   totalRows: sql<number>`count(*)::double precision`,
+   totalRequests: sql<number>`count(*) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision`,
+   totalCost: sql<string>`COALESCE(sum(${messageRequest.costUsd}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION}), 0)`,
+   totalInputTokens: sql<number>`COALESCE(sum(${messageRequest.inputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+   totalOutputTokens: sql<number>`COALESCE(sum(${messageRequest.outputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+   totalCacheCreationTokens: sql<number>`COALESCE(sum(${messageRequest.cacheCreationInputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+   totalCacheReadTokens: sql<number>`COALESCE(sum(${messageRequest.cacheReadInputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+   totalCacheCreation5mTokens: sql<number>`COALESCE(sum(${messageRequest.cacheCreation5mInputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+   totalCacheCreation1hTokens: sql<number>`COALESCE(sum(${messageRequest.cacheCreation1hInputTokens}) FILTER (WHERE ${EXCLUDE_WARMUP_CONDITION})::double precision, 0::double precision)`,
+ };
+
  const summaryQuery = keyId === undefined
-   ? db
-       .select({
-         totalRows: sql<number>`count(*)::double precision`,
-         // ... (all fields) ...
-       })
-       .from(messageRequest)
-       .where(and(...conditions))
-   : db
-       .select({
-         totalRows: sql<number>`count(*)::double precision`,
-         // ... (same fields) ...
-       })
-       .from(messageRequest)
-       .innerJoin(keysTable, eq(messageRequest.key, keysTable.key))
-       .where(and(...conditions));
+   ? db.select(summarySelectFields).from(messageRequest).where(and(...conditions))
+   : db
+       .select(summarySelectFields)
+       .from(messageRequest)
+       .innerJoin(keysTable, eq(messageRequest.key, keysTable.key))
+       .where(and(...conditions));

274-276: The usageLogSlimTotalCache key uses \u0001 as a separator; the field values must not contain that character.
Lines 291-301 join the cache key with \u0001 (the SOH control character). Fields that come from user input — sessionId, model, endpoint — could in theory contain any character. In practice SOH is vanishingly unlikely in these fields, and the cache only backs a 10-second-TTL total count, so the blast radius of a collision is small.

367-374: The distinct models/endpoints cache TTL is 5 minutes — fine for my-usage, but new models/endpoints become visible late.
A user who just issued a request with a new model and immediately opens the filter dropdown may wait up to 5 minutes for the new option. Minor for a "my usage" page, but worth noting in the docs or a comment.
const refreshVendors = useCallback(async (options?: { silent?: boolean }) => {
  const requestId = ++vendorsRequestIdRef.current;
  if (!options?.silent) {
    setLoadingVendors(true);
  }

  try {
    const currentVendorId = latestSelectionRef.current.vendorId;
    const currentType = latestSelectionRef.current.providerType;
    const nextVendors = await getDashboardProviderVendors();

    if (requestId !== vendorsRequestIdRef.current) {
      return null;
    }
  };
  fetchVendors();
}, []);

// Fetch endpoints when vendor or type changes
useEffect(() => {
  if (!selectedVendorId || !selectedType) {
    setEndpoints([]);
    return;
    setVendors(nextVendors);

    if (nextVendors.length === 0) {
      setSelectedVendorId(null);
      setSelectedType(null);
      setSelectedEndpoint(null);
      return {
        selectionChanged: currentVendorId != null || currentType != null,
        vendorId: null,
        providerType: null,
      };
    }

    const vendor =
      (currentVendorId ? nextVendors.find((v) => v.id === currentVendorId) : null) ??
      nextVendors[0] ??
      null;

    if (!vendor) {
      setSelectedVendorId(null);
      setSelectedType(null);
      setSelectedEndpoint(null);
      return {
        selectionChanged: currentVendorId != null || currentType != null,
        vendorId: null,
        providerType: null,
      };
    }

    const nextVendorId = vendor.id;
    const nextProviderType =
      currentType && vendor.providerTypes.includes(currentType)
        ? currentType
        : (vendor.providerTypes[0] ?? null);

    const selectionChanged = nextVendorId !== currentVendorId || nextProviderType !== currentType;

    if (selectionChanged) {
      // Prevent probe requests against the old endpoint while the selection auto-switches (#781).
      setSelectedEndpoint(null);
    }

    setSelectedVendorId(nextVendorId);
    setSelectedType(nextProviderType);

    return {
      selectionChanged,
      vendorId: nextVendorId,
      providerType: nextProviderType,
    };
  } catch (error) {
    if (requestId !== vendorsRequestIdRef.current) {
      return null;
    }
    console.error("Failed to fetch vendors:", error);
    return null;
  } finally {
    if (!options?.silent && requestId === vendorsRequestIdRef.current) {
      setLoadingVendors(false);
    }
  }
}, []);
Edge cases in refreshVendors' loading-state handling for silent mode.
Line 133: if (!options?.silent && requestId === vendorsRequestIdRef.current) — when called with silent: true, setLoadingVendors(false) never runs in the finally. That relies on setLoadingVendors(true) also being skipped for silent calls (lines 63-65), so the loading state stays consistent. The logic is correct, but if a non-silent request is later invalidated by a silent request incrementing the requestId, requestId === vendorsRequestIdRef.current is false in the finally and the loading state can get stuck at true.
Concrete scenario: non-silent request A is in flight → silent request B starts → A's finally sees a requestId mismatch and skips setLoadingVendors(false) → B's finally skips it too because it is silent → loadingVendors is stuck at true.
Suggested fix

  } finally {
-   if (!options?.silent && requestId === vendorsRequestIdRef.current) {
+   if (requestId === vendorsRequestIdRef.current) {
      setLoadingVendors(false);
    }
  }

Or ensure the loading state is also reset correctly in silent mode.
🤖 Prompt for AI Agents
In
`@src/app/`[locale]/dashboard/availability/_components/endpoint/endpoint-tab.tsx
around lines 61 - 137, The finally block can leave loadingVendors stuck true
when a non-silent request A is in-flight, a silent request B increments
vendorsRequestIdRef and cancels A, and neither finally clears loading; to fix,
remember whether this invocation actually set loading (e.g., const didSetLoading
= !options?.silent; call setLoadingVendors(true) only when didSetLoading), and
in finally clear loading when either didSetLoading is true or this request is
still the current one (if (didSetLoading || requestId ===
vendorsRequestIdRef.current) setLoadingVendors(false)); update refreshVendors to
use didSetLoading and reference vendorsRequestIdRef, setLoadingVendors, and
options.silent accordingly.
{endpointsLoading ? (
  <div className="px-3 py-4 text-center text-xs text-muted-foreground">
    {t("keyLoading")}
  </div>

The loading state reuses the keyLoading translation key.
t("keyLoading") semantically means "key loading", a poor fit for an endpoint-list loading state. Use a more general or dedicated translation key (e.g. endpointStatus.loading) so different contexts stop sharing one string and confusing later maintenance.
🤖 Prompt for AI Agents
In `@src/app/`[locale]/settings/providers/_components/provider-endpoint-hover.tsx
around lines 432 - 435, The loading message for the endpoints list reuses
t("keyLoading") which is semantically "key loading"; update the endpointsLoading
branch in provider-endpoint-hover.tsx to use a dedicated translation key (e.g.
t("endpointStatus.loading") or a generic t("loading")) instead of "keyLoading",
and add the corresponding entries to your i18n resource files so the new key is
localized; ensure the JSX still renders the same container (the div with
className "px-3 py-4 text-center text-xs text-muted-foreground") but with the
new translation key.
Remove silent option guard so vendor loading state always resets when the request completes, preventing stale loading indicators. Wrap advisory lock client.end() in try-catch to avoid unhandled errors during connection teardown.
🧪 Test Results
Overall: ✅ All tests passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/app/[locale]/dashboard/availability/_components/endpoint/endpoint-tab.tsx (1)
409-413: ⚠️ Potential issue | 🟡 Minor
Provider type names are shown raw, without i18n.
{type} renders raw values such as "claude" or "gemini-cli" as the dropdown option text. Per the coding guidelines, all user-visible strings must go through i18n. Map the display name through a translation key:

- {type}
+ {t(`providerTypes.${type}`)}

As per coding guidelines, "All user-facing strings must use i18n (5 languages supported: zh-CN, zh-TW, en, ja, ru). Never hardcode display text".
🤖 Fix all issues with AI agents
In `@src/lib/migrate.ts`:
- Around line 176-185: The finally block in runMigrations calls
migrationClient.end() without try-catch, which can mask or replace earlier
errors; wrap the await migrationClient.end() call in a try-catch (similar to
withAdvisoryLock) so any errors from closing the client are logged via
logger.error but do not override the original exception from fn(); update the
finally in runMigrations to catch and log errors from migrationClient.end()
while preserving original error flow.
🧹 Nitpick comments (4)
src/lib/migrate.ts (2)
130-136: Consider turning the row-by-row UPDATEs into a batch operation or wrapping them in a transaction.
The fixes currently run one UPDATE at a time; a mid-run crash leaves a partial repair. The operation is idempotent (a restart repairs the remaining rows), but a transaction guarantees atomicity and saves round trips.
Suggested change

- for (const fix of pendingFixes) {
-   await client`
-     UPDATE "drizzle"."__drizzle_migrations"
-     SET created_at = ${fix.to}
-     WHERE id = ${fix.id}
-   `;
- }
+ await client.begin(async (tx) => {
+   for (const fix of pendingFixes) {
+     await tx`
+       UPDATE "drizzle"."__drizzle_migrations"
+       SET created_at = ${fix.to}
+       WHERE id = ${fix.id}
+     `;
+   }
+ });

147-186: runMigrations duplicates withAdvisoryLock's lock acquire/release logic, violating DRY.
withAdvisoryLock already encapsulates the full advisory-lock lifecycle (acquire, release, client teardown, error handling), but runMigrations reimplements the same pattern at lines 159-161 and 177-181. The main obstacle is that runMigrations needs one client for both the lock operations and the migration, while withAdvisoryLock creates its own. Consider refactoring withAdvisoryLock to accept an external client, or having runMigrations call withAdvisoryLock and create the migration connection inside the callback.
src/app/[locale]/dashboard/availability/_components/endpoint/endpoint-tab.tsx (2)
57-59: Writing the ref directly during render is risky under concurrent mode.
Assigning latestSelectionRef.current in the render body is a render-phase side effect. Under React 19 concurrent mode, render may run multiple times with only one commit; the discarded renders also write the ref, leaving it briefly out of sync with the committed state. The impact is limited since the committed render overwrites the ref, but moving the assignment into a useEffect is safer.
Suggested change

- latestSelectionRef.current.vendorId = selectedVendorId;
- latestSelectionRef.current.providerType = selectedType;
- latestSelectionRef.current.endpointId = selectedEndpoint?.id ?? null;
+ useEffect(() => {
+   latestSelectionRef.current.vendorId = selectedVendorId;
+   latestSelectionRef.current.providerType = selectedType;
+   latestSelectionRef.current.endpointId = selectedEndpoint?.id ?? null;
+ }, [selectedVendorId, selectedType, selectedEndpoint?.id]);

260-277: When vendorResult is null (a failed request), the endpoints and logs refresh still fires.
When refreshVendors returns null (network failure, or the request was superseded), vendorResult?.selectionChanged is undefined and !undefined === true, so refreshEndpoints and refreshProbeLogs still run. Refreshing endpoints/logs with the old selection after a failed vendor request is acceptable fault tolerance, but handling the null case explicitly states the intent.
Suggested change

  const refresh = async () => {
    const vendorResult = await refreshVendors({ silent: true });
-   const vendorId = vendorResult?.vendorId ?? latestSelectionRef.current.vendorId;
-   const providerType = vendorResult?.providerType ?? latestSelectionRef.current.providerType;
-   const endpointId = latestSelectionRef.current.endpointId;
+   if (!vendorResult) return;

-   if (!vendorResult?.selectionChanged && vendorId && providerType) {
+   const vendorId = vendorResult.vendorId ?? latestSelectionRef.current.vendorId;
+   const providerType = vendorResult.providerType ?? latestSelectionRef.current.providerType;
+   const endpointId = latestSelectionRef.current.endpointId;
+
+   if (!vendorResult.selectionChanged && vendorId && providerType) {
      void refreshEndpoints({
        vendorId,
        providerType,
        keepSelectedEndpointId: endpointId,
      });
    }

-   if (!vendorResult?.selectionChanged && endpointId) {
+   if (!vendorResult.selectionChanged && endpointId) {
      void refreshProbeLogs(endpointId);
    }
  };
} finally {
  try {
    await migrationClient`SELECT pg_advisory_unlock(hashtext(${MIGRATION_ADVISORY_LOCK_NAME}))`;
  } catch (unlockError) {
    logger.error("Failed to release database migration lock", unlockError);
  }

  // Close the connection
  await migrationClient.end();
}
migrationClient.end() is not wrapped in try-catch, unlike the handling in withAdvisoryLock.
withAdvisoryLock already wraps client.end() in a try-catch (lines 51-58), but the migrationClient.end() at line 184 in runMigrations' finally block is still unprotected. If end() throws, it masks fn()'s original error or becomes an unhandled exception.
Suggested change

// Close the connection
- await migrationClient.end();
+ try {
+ await migrationClient.end();
+ } catch (endError) {
+ logger.error("Failed to close migration client", endError);
+ }
}

🤖 Prompt for AI Agents
In `@src/lib/migrate.ts` around lines 176 - 185, The finally block in
runMigrations calls migrationClient.end() without try-catch, which can mask or
replace earlier errors; wrap the await migrationClient.end() call in a try-catch
(similar to withAdvisoryLock) so any errors from closing the client are logged
via logger.error but do not override the original exception from fn(); update
the finally in runMigrations to catch and log errors from migrationClient.end()
while preserving original error flow.
if (toLoad.length > 0) {
  const promise = (async () => {
    try {
      const redisStates = await loadEndpointCircuitStates(toLoad);

      for (const endpointId of toLoad) {
        const redisState = redisStates.get(endpointId) ?? null;
        loadedFromRedisAt.set(endpointId, refreshNow);

        const health = getOrCreateHealthSync(endpointId);
        if (redisState) {
          // When syncing from Redis into memory, do not update only when circuitState changes:
          // fields like failureCount / halfOpenSuccessCount must also stay consistent under forceRefresh.
          health.failureCount = redisState.failureCount;
          health.lastFailureTime = redisState.lastFailureTime;
          health.circuitState = redisState.circuitState;
          health.circuitOpenUntil = redisState.circuitOpenUntil;
          health.halfOpenSuccessCount = redisState.halfOpenSuccessCount;
          continue;
        }

        if (health.circuitState !== "closed") {
          health.circuitState = "closed";
          health.failureCount = 0;
          health.lastFailureTime = null;
          health.circuitOpenUntil = null;
          health.halfOpenSuccessCount = 0;
        }
      }
    } catch (error) {
      logger.warn("[EndpointCircuitBreaker] Failed to batch sync state from Redis", {
        count: toLoad.length,
        error: error instanceof Error ? error.message : String(error),
      });
    }
  })().finally(() => {
    for (const endpointId of toLoad) {
      if (redisSyncInFlight.get(endpointId) === promise) {
        redisSyncInFlight.delete(endpointId);
      }
    }
  });

  for (const endpointId of toLoad) {
    redisSyncInFlight.set(endpointId, promise);
Batch in-flight dedup race window
The async IIFE at line 201 starts executing immediately (up to the first await at line 203), but redisSyncInFlight.set(endpointId, promise) doesn't happen until lines 243-244. In the current single-threaded JS model, this is safe because the IIFE suspends at await loadEndpointCircuitStates(toLoad) before lines 243-244 run synchronously. However, this ordering is fragile — if any synchronous logic is added before the first await inside the IIFE (e.g., a cache check that returns early), the redisSyncInFlight entries would never be set.
Consider moving lines 243-244 before the IIFE creation (mirroring the fix applied to providerStatisticsInFlight at provider.ts:1224-1225) to make the intent explicit and avoid future regressions:
for (const endpointId of toLoad) {
redisSyncInFlight.set(endpointId, promise);
}
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
Path: src/lib/endpoint-circuit-breaker.ts
Line: 200:244
How can I resolve this? If you propose a fix, please make it concise.| await client`SELECT pg_advisory_lock(hashtext(${lockName}))`; | ||
| acquired = true; | ||
| } | ||
|
|
||
| const result = await fn(); | ||
| return { ran: true, result }; | ||
| } finally { | ||
| if (acquired) { | ||
| try { | ||
| await client`SELECT pg_advisory_unlock(hashtext(${lockName}))`; | ||
| } catch (unlockError) { | ||
| logger.error("Failed to release advisory lock", { | ||
| lockName, | ||
| error: unlockError instanceof Error ? unlockError.message : String(unlockError), | ||
| }); | ||
| } | ||
| } | ||
|
|
||
| try { | ||
| await client.end(); | ||
| } catch (endError) { | ||
| logger.error("Failed to close advisory lock client", { | ||
| lockName, | ||
| error: endError instanceof Error ? endError.message : String(endError), | ||
| }); | ||
| } | ||
| } | ||
| } |
Advisory lock client leak on fn() exception
If fn() throws, the finally block calls client.end(). However, pg_advisory_unlock is session-level — if client.end() closes the connection before the unlock executes (or the unlock itself fails), the lock is released implicitly by PostgreSQL when the connection terminates. This is fine for correctness.
However, the client is created with postgres(process.env.DSN, { max: 1 }) which creates a connection pool. If fn() throws synchronously before any query, the pool may not have established a connection yet, and client.end() can throw "Called end with pending queries" depending on the postgres.js version. Consider wrapping the client.end() in a try/catch (which is already done — good), but also note that pg_advisory_unlock on a failed/closed connection will throw and be caught by the inner try/catch, so this is handled correctly.
One real concern: if fn() itself uses the application's shared db connection pool (which it does — all repository functions use the shared drizzle db), the advisory lock is held on the separate client connection. This means the lock protects against concurrent instances but the actual work runs on a different connection. This is correct for the advisory lock pattern — just flagging that the lock does NOT provide transactional isolation with the work done inside fn().
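For reference, a compact standalone sketch of the session-level advisory-lock pattern under discussion, using postgres.js (the try-lock variant). DSN handling, the logging, and the withTryLock name are assumptions here, not the project's withAdvisoryLock verbatim:

```ts
import postgres from "postgres";

// Session-level advisory lock: acquire and release must happen on the same
// connection, hence a dedicated single-connection client.
async function withTryLock<T>(lockName: string, fn: () => Promise<T>): Promise<T | undefined> {
  const client = postgres(process.env.DSN as string, { max: 1 });
  try {
    const rows = await client<{ locked: boolean }[]>`
      SELECT pg_try_advisory_lock(hashtext(${lockName})) AS locked
    `;
    if (!rows[0]?.locked) return undefined; // another instance holds the lock
    try {
      return await fn();
    } finally {
      // Log unlock failures instead of letting them mask fn()'s own error.
      await client`SELECT pg_advisory_unlock(hashtext(${lockName}))`.catch((e) =>
        console.error("advisory unlock failed", e)
      );
    }
  } finally {
    // Closing the session releases the lock implicitly anyway; again, never
    // let teardown errors replace the original exception from fn().
    await client.end().catch((e) => console.error("lock client.end failed", e));
  }
}
```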
Excerpt from `src/repository/usage-logs.ts` (lines 330-342):

```ts
const hasMore = results.length > safePageSize;
const pageRows = hasMore ? results.slice(0, safePageSize) : results;

let total = offset + pageRows.length;

const cachedTotal = usageLogSlimTotalCache.get(totalCacheKey);
if (cachedTotal !== undefined) {
  total = Math.max(cachedTotal, total);
  return {
    logs: pageRows.map((row) => ({ ...row, costUsd: row.costUsd?.toString() ?? null })),
    total,
  };
}
```
**Stale total count returned from cache**

When the cache hit occurs (lines 335-341), the function returns early with the cached total without running a fresh COUNT query. The `Math.max(cachedTotal, total)` on line 337 biases toward the larger value, which means:

1. If rows are **deleted** between requests, the total count will remain inflated until the 10s TTL expires.
2. On page 1 with few results (`offset=0`, `hasMore=false`), `total = pageRows.length` could be less than `cachedTotal`, causing the UI to show phantom pages.

This is likely acceptable for a 10-second TTL cache in a usage-logs context, but worth noting that the monotonically-increasing assumption (`Math.max`) does not hold when rows can be soft-deleted between requests.
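One concise resolution, sketched as a hypothetical helper rather than the PR's actual fix: trust the cached total only when the current page cannot contradict it.

```ts
// Hypothetical helper, not from the PR: picks a total that never implies
// phantom pages. `hasMore` means the query fetched pageSize + 1 rows.
function resolveTotal(
  offset: number,
  pageRowCount: number,
  hasMore: boolean,
  cachedTotal: number | undefined
): number {
  const lowerBound = offset + pageRowCount;
  if (!hasMore) {
    // Last page: the exact total is known, so ignore any stale cached value.
    return lowerBound;
  }
  // More rows exist past this page, so a cached count >= the lower bound is a
  // safe (if possibly inflated) estimate until the TTL expires.
  return cachedTotal !== undefined ? Math.max(cachedTotal, lowerBound) : lowerBound;
}
```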
Excerpt from `src/lib/provider-endpoints/probe-logs-batcher.ts` (lines 228-233):

```ts
let idx = 0;

const workers = Array.from({ length: Math.min(concurrency, endpointIds.length) }, async () => {
  for (;;) {
    const currentIndex = idx++;
    if (currentIndex >= endpointIds.length) return;
```
**Non-atomic `idx++` in concurrent workers**

The `idx++` on line 232 is used as a work-stealing index across concurrent async workers. While this is safe in single-threaded JavaScript (the increment and read are not interrupted between `for (;;)` iterations, since each worker only yields at `await`), the pattern can be confusing to readers who might expect race conditions.

This is functionally correct because `idx++` is atomic in the JS event loop: each worker reads and increments `idx` synchronously before yielding at `await`. Just noting this is an intentional pattern.

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
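For readers unfamiliar with the idiom, here is a self-contained sketch of the same work-stealing loop with generic names (`processOne` is hypothetical, not from the PR):

```ts
// Runs processOne over items with bounded concurrency. The shared `idx` needs
// no lock: each `idx++` executes synchronously between awaits, and the JS
// event loop never interleaves two increments.
async function runWithConcurrency<T>(
  items: readonly T[],
  concurrency: number,
  processOne: (item: T) => Promise<void>
): Promise<void> {
  let idx = 0;
  const workers = Array.from({ length: Math.min(concurrency, items.length) }, async () => {
    for (;;) {
      const currentIndex = idx++; // atomic with respect to the event loop
      if (currentIndex >= items.length) return;
      await processOne(items[currentIndex]);
    }
  });
  await Promise.all(workers);
}
```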
Excerpt from `drizzle/0068_flaky_swarm.sql` (lines 1-4):

```sql
-- Note: message_request is a high-write table. Standard CREATE INDEX may block writes during index creation.
-- Drizzle migrator does not support CREATE INDEX CONCURRENTLY. If write blocking is a concern,
-- manually pre-create indexes with CONCURRENTLY before running this migration (IF NOT EXISTS prevents conflicts).
CREATE INDEX IF NOT EXISTS "idx_keys_key" ON "keys" USING btree ("key");--> statement-breakpoint
```
**Blocking index creation on high-write table**

The migration comment correctly warns that `CREATE INDEX` on `message_request` may block writes. Since the Drizzle migrator does not support `CREATE INDEX CONCURRENTLY`, this migration will hold a `SHARE` lock on `message_request` for the duration of index creation, blocking all writes (reads still proceed). For a table described as "high-write", this could cause significant downtime depending on table size.

The `IF NOT EXISTS` guard lets operators pre-create the indexes concurrently before deploying, which is the recommended approach. Consider documenting this in a deployment/upgrade guide or the release notes so operators know to run the `CONCURRENTLY` versions manually before applying this migration.
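The manual pre-create step would look like the following; this is a sketch of the recommended procedure rather than a script shipped with the PR, using the one index name visible in the excerpt (repeat for the other indexes in the migration):

```sql
-- Run manually against production BEFORE deploying the migration.
-- CONCURRENTLY avoids the write-blocking lock, and the migration's
-- IF NOT EXISTS then turns its blocking CREATE INDEX into a no-op.
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS "idx_keys_key"
  ON "keys" USING btree ("key");
```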
Greptile Summary
A comprehensive performance and correctness refactoring across the provider page, endpoint management, and usage statistics subsystems. The PR reduces database round-trips by consolidating N+1 query patterns into batch operations, adds targeted indexes on hot paths, introduces in-memory caching with TTL/LRU eviction, and improves multi-instance safety with advisory locks.
- Batch operations: quota cost aggregation (`sumUserQuotaCosts`, `sumKeyQuotaCostsById`), batch circuit breaker health loading (`getAllEndpointHealthStatusAsync`), and batch probe log retrieval via LATERAL joins
- Query shape: `::date` casts replaced with range-based comparisons for index utilization
- Caching: `TTLMap` generic cache used for key-string lookups, usage log totals, distinct model/endpoint lists, and provider statistics; in-flight dedup prevents a thundering herd on cache expiry
- Multi-instance safety: startup work guarded by `pg_advisory_lock`; provider backfill uses `withAdvisoryLock` with `skipIfLocked`; drizzle migration `created_at` repair for journal consistency
- Frontend: `ProbeLogsBatcher` coalesces per-endpoint requests into batch API calls; a `useInViewOnce` hook with a shared IntersectionObserver defers off-screen data loading; removed redundant `QueryClientProvider` wrappers and `router.refresh()` calls
- Probe scheduling: `computeNextDueAtMs`, vendor+type scoped interval counting, and in-memory probe result tracking to avoid stale scheduling decisions
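A minimal sketch of the TTL-cache-plus-in-flight-dedup pattern described in the list above; the PR's actual `TTLMap` API is not shown in this review, so the shape below is an assumption:

```ts
// Sketch only: TTL cache with lazy expiry on read.
class TTLCache<K, V> {
  private store = new Map<K, { value: V; expiresAt: number }>();
  constructor(private ttlMs: number) {}
  get(key: K): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  }
  set(key: K, value: V): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

const cache = new TTLCache<string, number>(10_000);
const inFlight = new Map<string, Promise<number>>();

// In-flight dedup: concurrent misses share one loader promise instead of
// stampeding the database when the TTL expires.
async function getOrLoad(key: string, load: () => Promise<number>): Promise<number> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const pending = inFlight.get(key);
  if (pending) return pending;
  const promise = load()
    .then((value) => {
      cache.set(key, value);
      return value;
    })
    .finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```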
Confidence Score: 4/5

Files that merit the closest review:

- `drizzle/0068_flaky_swarm.sql` (blocking index creation on high-write table)
- `src/lib/endpoint-circuit-breaker.ts` (in-flight dedup ordering)
- `src/repository/provider.ts` (complex transaction logic in delete/batch operations)

Important Files Changed
- `src/repository/provider.ts`: `getProviderStatistics`, cascade soft-delete of endpoints on provider delete, batch enable/disable with endpoint pool sync. SQL query restructured with a bounds CTE for index-friendliness.
- Provider endpoint repository: adds `findEnabledProviderEndpointsByVendorAndType`, `findDashboardProviderEndpointsByVendorAndType`, and `findEnabledProviderVendorTypePairs`; vendor update enrichment, revive conflict handling in sync, and reorders probe result recording to check endpoint existence first.
- `src/lib/endpoint-circuit-breaker.ts`: batch health loading (`getAllEndpointHealthStatusAsync`), LRU cache eviction, TTL-based Redis sync, in-flight dedup for batch loads, and an optimization to delete default-closed states from Redis.
- Replaces per-endpoint `isEndpointCircuitOpen` calls with a single batch `getAllEndpointHealthStatusAsync` call. Uses `findEnabledProviderEndpointsByVendorAndType` to avoid filtering in the application layer. Top-level `getEnvConfig` import replaces a dynamic import.
- Probe scheduler: `computeNextDueAtMs`/`updateNextWorkHints`, refactors vendor counting to a vendor+type key, extracts `startLeaderLockKeepAlive` to a shared module, and updates probe results in-memory to avoid stale scheduling.
- Batches `sumKeyQuotaCostsById`/`sumUserQuotaCosts` calls (N+1 elimination). Replaces separate aggregate+breakdown queries with a single combined query using FILTER aggregates. Derives today-stats totals from the breakdown loop.
- Adds `sumUserQuotaCosts` and `sumKeyQuotaCostsById` for consolidated multi-period cost queries via FILTER aggregates. Introduces a `TTLMap`-based key string cache. Adds range-bounding predicates to all JOIN conditions in statistics queries for index utilization.
- `src/repository/usage-logs.ts`: adds `findUsageLogsForKeySlim` (slim projection + cached total count). Extracts shared filter building to `buildUsageLogConditions`. Adds TTL caching for distinct models/endpoints. Conditionally joins `keysTable` only when a `keyId` filter is present.
- `src/lib/migrate.ts`: adds the `withAdvisoryLock` utility. Adds `repairDrizzleMigrationsCreatedAt` to fix journal timestamp mismatches. Changes from `"use server"` to `import "server-only"`.

Flowchart
```mermaid
flowchart TD
  subgraph Client["Client-Side Optimization"]
    UI["Provider Page / Dashboard UI"]
    BATCH["ProbeLogsBatcher"]
    INVIEW["useInViewOnce Hook"]
    UI --> INVIEW
    INVIEW -->|visible| BATCH
    BATCH -->|coalesce requests| BATCHAPI["/api/actions batch endpoints"]
  end
  subgraph Server["Server-Side Actions"]
    BATCHAPI --> BPA["batchGetProviderEndpointProbeLogs"]
    BATCHAPI --> BVS["batchGetVendorTypeEndpointStats"]
    BATCHAPI --> BCI["batchGetEndpointCircuitInfo"]
  end
  subgraph Repository["Repository Layer"]
    BPA --> LATERAL["LATERAL JOIN batch query"]
    BVS --> GROUPBY["GROUP BY + FILTER aggregate"]
    BCI --> ALLHEALTH["getAllEndpointHealthStatusAsync"]
  end
  subgraph Cache["Caching Layer"]
    TTLMAP["TTLMap (key cache, totals, models)"]
    HEALTHCACHE["EndpointHealth LRU Cache"]
    PROVSTATS["Provider Statistics Cache + In-flight Dedup"]
    ALLHEALTH --> HEALTHCACHE
    HEALTHCACHE -->|TTL expired| REDIS["Redis Pipeline Batch Load"]
  end
  subgraph DB["Database"]
    LATERAL --> PG[(PostgreSQL)]
    GROUPBY --> PG
    REDIS -.->|circuit state| REDISDB[(Redis)]
    PG -.->|new indexes| IDX["11 new targeted indexes"]
  end
  subgraph Lifecycle["Provider Lifecycle"]
    DEL["deleteProvider / deleteProvidersBatch"]
    DEL -->|cascade| SOFTDEL["Soft-delete orphan endpoints"]
    SOFTDEL -->|NOT EXISTS check| PG
    UPD["updateProvider (enable)"]
    UPD -->|ensure endpoint| PG
  end
```

Last reviewed commit: 1cee0a9