fix(provider): wrap endpoint sync in transactions to prevent race conditions #730
Conversation
Provider create and update operations now run vendor resolution and endpoint sync inside database transactions to prevent race conditions that could leave orphaned or inconsistent endpoint rows.

Key changes:
- createProvider: wrap vendor + insert + endpoint seed in a single tx
- updateProvider: wrap vendor + update + endpoint sync in a single tx
- Add syncProviderEndpointOnProviderEdit for atomic URL/type/vendor migration with in-place update, soft-delete, and conflict handling
- Vendor cleanup failures degrade to warnings instead of propagating
- Add comprehensive unit and integration tests for sync edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
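For readers skimming the diff, here is a minimal sketch of the transactional shape described above, assuming a drizzle-style `db.transaction` and the helper names this PR introduces; the schema import path, helper signatures, and provider fields are assumptions, not the PR's exact code.

```ts
import { db } from "@/drizzle/db";
import { providers } from "@/drizzle/schema"; // import path assumed
import {
  ensureProviderEndpointExistsForUrl,
  getOrCreateProviderVendorIdFromUrls,
} from "@/repository/provider-endpoints";

// Sketch only: vendor resolution, provider insert, and endpoint seeding share one
// transaction, so a failure in any step rolls all of them back together.
export async function createProviderSketch(data: { name: string; url: string; key: string }) {
  return db.transaction(async (tx) => {
    // 1. Resolve (or create) the vendor inside the same transaction.
    const providerVendorId = await getOrCreateProviderVendorIdFromUrls([data.url], { tx });

    // 2. Insert the provider row.
    const [provider] = await tx
      .insert(providers)
      .values({ ...data, providerVendorId })
      .returning();

    // 3. Seed the endpoint row in the same tx; no orphaned endpoint can survive a rollback.
    await ensureProviderEndpointExistsForUrl(data.url, { tx }); // call shape assumed

    return provider;
  });
}
```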
📝 Walkthrough

Introduces transactional endpoint synchronization (syncProviderEndpointOnProviderEdit), adds an optional transaction context ({ tx }) to several repository methods, moves the critical steps of the provider create/update/delete flows into transactions, hardens vendor cleanup and circuit-reset error handling with tolerant logging, and significantly expands unit and integration test coverage for concurrency and conflict scenarios.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: 4 passed, 1 failed (1 warning)
Summary of Changes

Hello @ding113, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the data integrity and robustness of provider and provider endpoint management. By introducing database transactions for critical create and update flows, it eliminates potential race conditions and ensures that related data (providers, vendors, and endpoints) remains consistent. The new endpoint synchronization logic (syncProviderEndpointOnProviderEdit) migrates endpoints atomically when a provider's URL or type changes.
Circuit reset in tx
syncProviderEndpointOnProviderEdit calls resetEndpointCircuit(previousEndpoint.id) inside the DB transaction after an in-place move. If the outer transaction later rolls back (e.g. provider update fails), the circuit state will already be reset even though the endpoint row change didn’t commit.
Move the circuit reset outside the transaction boundary (or gate it on commit) to keep circuit state consistent with persisted endpoint state.
Path: src/repository/provider-endpoints.ts, lines 894-897
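One way to apply this suggestion (and what the later commit "defer endpoint circuit reset until transaction commit" describes) is to record the endpoint id inside the transaction and perform the reset only after commit. A minimal sketch; the result shape returned by the sync helper and the circuit-breaker import path are assumptions.

```ts
import { db } from "@/drizzle/db";
import { syncProviderEndpointOnProviderEdit } from "@/repository/provider-endpoints";
import { resetEndpointCircuit } from "@/lib/endpoint-circuit-breaker"; // path assumed

// Sketch: the transaction only records which endpoint needs a reset; the reset itself
// runs after a successful commit, so a rollback never desyncs circuit state.
export async function syncThenResetAfterCommit(
  input: Parameters<typeof syncProviderEndpointOnProviderEdit>[0]
) {
  const endpointIdToReset = await db.transaction(async (tx) => {
    const result = await syncProviderEndpointOnProviderEdit(input, { tx });
    // Assumed result shape: the helper reports the endpoint that moved in place, if any.
    return result.action === "updated-previous-in-place" ? (result.previousEndpointId ?? null) : null;
  });

  // Only reached if the transaction committed.
  if (endpointIdToReset !== null) {
    await resetEndpointCircuit(endpointIdToReset);
  }
}
```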
Additional Comments (1)

Inconsistent error contract (src/repository/provider-endpoints.ts, lines 744-746)

`ensureProviderEndpointExistsForUrl` used to return `false` for empty/invalid URLs, but now it throws (e.g. `throw new Error("[ProviderEndpointEnsure] url is required")`). Any existing callers that rely on a boolean return will now hard-fail and potentially abort their surrounding flow/transaction. Please either restore the previous boolean behavior or update *all* call sites to handle the thrown errors explicitly and adjust tests accordingly.
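If the new throwing behavior is kept, legacy callers could be migrated through a thin wrapper that restores the old boolean contract where it is wanted. A hypothetical sketch (the wrapper, the argument shape, and the error-prefix check are illustrations, not part of this PR; the type export is assumed):

```ts
import {
  ensureProviderEndpointExistsForUrl,
  type QueryExecutor, // export assumed
} from "@/repository/provider-endpoints";

// Hypothetical wrapper: treat the helper's validation errors as "nothing to do",
// but keep propagating real database failures.
export async function tryEnsureProviderEndpoint(
  url: string,
  options?: { tx?: QueryExecutor }
): Promise<boolean> {
  try {
    await ensureProviderEndpointExistsForUrl(url, options); // call shape assumed
    return true;
  } catch (error) {
    if (error instanceof Error && error.message.startsWith("[ProviderEndpointEnsure]")) {
      return false; // old contract: a blank/invalid url is a no-op, not a crash
    }
    throw error;
  }
}
```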
🧪 Test Results

Overall result: ✅ All tests passed
Code Review
This pull request refactors provider and provider endpoint management to enhance data consistency and error handling. It introduces try...catch blocks around tryDeleteProviderVendorIfEmpty calls in removeProviderEndpoint, removeProvider, and reclusterProviderVendors to log cleanup failures without blocking main operations. A new syncProviderEndpointOnProviderEdit function is added to src/repository/provider-endpoints.ts, which intelligently manages provider endpoints when a provider's URL or type changes, handling in-place updates, creation, revival, or soft-deletion of endpoints, including specific logic for unique constraint violations and clearing circuit breaker states. The createProvider and updateProvider functions are updated to be fully transactional, ensuring that vendor creation, provider updates, and endpoint synchronization occur atomically. The review comment highlights a type safety concern in syncProviderEndpointOnProviderEdit regarding the QueryExecutor type, suggesting that the transaction method should be explicitly included in the type definition to avoid a hacky cast and ensure proper handling of nested transactions.
```ts
const executorWithSavepoint = tx as QueryExecutor & {
  transaction?: <T>(runInTx: (nestedTx: TransactionExecutor) => Promise<T>) => Promise<T>;
};

if (typeof executorWithSavepoint.transaction === "function") {
  try {
    await executorWithSavepoint.transaction(async (nestedTx) => {
      await updatePreviousEndpointInPlace(nestedTx);
    });
    movedInPlace = true;
  } catch (error) {
    if (!isUniqueViolationError(error)) {
      throw error;
    }
  }
} else {
  // No savepoint support means we cannot safely continue after unique violations.
  await updatePreviousEndpointInPlace(tx);
  movedInPlace = true;
}
```
The logic for handling unique violations during in-place updates relies on executorWithSavepoint.transaction being a function to create a savepoint. While Drizzle's transaction object (tx) typically supports nested transactions in PostgreSQL, the QueryExecutor type is defined as a Pick that excludes the transaction method. If options.tx is provided by a caller using the QueryExecutor type, this check might fail or behave unexpectedly. Since syncProviderEndpointOnProviderEdit is always wrapped in a transaction at the top level if options.tx is missing, tx will always be a transaction object. It would be cleaner to include transaction in the QueryExecutor type definition to avoid the hacky cast and ensure type safety for nested transactions.
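One way to avoid the cast, as suggested, is to carry the optional nested-transaction capability in the executor type itself. A sketch under the assumption that the project's db instance is the source of truth for the transaction callback type; the inference trick and member names are illustrative.

```ts
import { db } from "@/drizzle/db";

// Infer drizzle's transaction callback parameter type from the project's db instance.
type TransactionExecutor = Parameters<Parameters<typeof db.transaction>[0]>[0];

// Transaction-compatible executor: the query methods the repository needs, plus an
// optional nested `transaction` (drizzle maps nested transactions to SAVEPOINTs).
type QueryExecutor = Pick<
  TransactionExecutor,
  "select" | "insert" | "update" | "delete" | "execute"
> & {
  transaction?: TransactionExecutor["transaction"];
};

// With this shape, the savepoint probe needs no cast:
function supportsSavepoints(executor: QueryExecutor): boolean {
  return typeof executor.transaction === "function";
}
```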
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/unit/repository/provider-endpoint-sync-on-edit.test.ts`:
- Around lines 116-131: switch the module mocking and dynamic imports in the test file from relative paths to the project path alias. Specifically, change every vi.doMock("../../../src/…") call and the subsequent await import("../../../src/…") to the "@/…" alias (e.g. replace vi.doMock("../../../src/drizzle/db", () => ({ db })), vi.doMock("../../../src/repository/provider-endpoints", () => (…)), and await import("../../../src/repository/provider")), to stay consistent with the codebase's path-alias convention and avoid path breakage during refactors, while keeping the existing mock names getOrCreateProviderVendorIdFromUrlsMock, ensureProviderEndpointExistsForUrlMock, tryDeleteProviderVendorIfEmptyMock, and syncProviderEndpointOnProviderEditMock unchanged.
🧹 Nitpick comments (5)
tests/unit/lib/rate-limit/rolling-window-cache-warm.test.ts (1)
34-36: Mock pattern is inconsistent with the other test files.

cost-limits.test.ts and service-extra.test.ts both create mock references with vi.hoisted and re-set the return values in beforeEach (after resetAllMocks). This file uses an inline vi.fn, and its beforeEach only calls clearAllMocks without resetting the mock implementation. Since clearAllMocks does not clear implementations, this currently works, but if someone later switches to resetAllMocks the timezone mock will silently stop working. Consider standardizing on the vi.hoisted pattern for consistency. Suggested change:

```diff
+const resolveSystemTimezoneMock = vi.hoisted(() => vi.fn(async () => "Asia/Shanghai"));
+
 vi.mock("@/lib/utils/timezone", () => ({
-  resolveSystemTimezone: vi.fn(async () => "Asia/Shanghai"),
+  resolveSystemTimezone: resolveSystemTimezoneMock,
 }));
```

And add a reset in beforeEach:

```diff
 beforeEach(() => {
   pipelineCommands.length = 0;
   vi.clearAllMocks();
+  resolveSystemTimezoneMock.mockResolvedValue("Asia/Shanghai");
   vi.useFakeTimers();
   vi.setSystemTime(new Date(nowMs));
 });
```

src/repository/provider-endpoints.ts (1)
1066-1069: Consider downgrading resetEndpointCircuit failures so the transaction is not affected by external side effects.

This call sits inside the transaction; if resetEndpointCircuit depends on external storage and throws, the entire endpoint sync and provider update roll back. Consider catching the error and downgrading it to a warning, or moving the call out of the transaction. Suggested change:

```diff
-      await resetEndpointCircuit(previousEndpoint.id);
+      try {
+        await resetEndpointCircuit(previousEndpoint.id);
+      } catch (error) {
+        logger.warn("syncProviderEndpointOnProviderEdit:reset_endpoint_circuit_failed", {
+          endpointId: previousEndpoint.id,
+          error: error instanceof Error ? error.message : String(error),
+        });
+      }
```

tests/unit/repository/provider-create-transaction.test.ts (1)
81-103: The tx in createDbMock is missing a select method.

The tx mock only provides insert, but getOrCreateProviderVendorIdFromUrls or ensureProviderEndpointExistsForUrl, which createProvider calls internally, may need select inside the transaction. Because both of those functions are fully mocked here, tx.select is never actually hit and the test passes. If production code later adds a direct tx.select call inside the transaction, the test will fail due to the missing method. Consider adding a select mock on tx for robustness.

tests/integration/provider-endpoint-sync-race.test.ts (1)
19-28: The type safety of the resolve variable in createDeferred can be improved.

let resolve: () => void; is declared without an initializer. Although the Promise executor runs synchronously and guarantees the assignment, TypeScript cannot verify that statically, which forces the resolve! non-null assertion on line 26. Consider initializing it to a no-op function to drop the assertion. Suggested change:

```diff
 function createDeferred() {
-  let resolve: () => void;
+  let resolve: () => void = () => {};
   const promise = new Promise<void>((res) => {
     resolve = res;
   });
   return {
     promise,
-    resolve: resolve!,
+    resolve,
   };
 }
```

tests/unit/repository/provider-endpoint-sync-helper.test.ts (1)
188-226: One detail worth noting in the unique-constraint conflict fallback test.

On line 200, updateWhereMock.mockRejectedValueOnce(...) makes the first where() call reject, but updateSetMock (lines 15-18) has already pushed its payload into updatePayloads before where is invoked. That means updatePayloads[0] holds the payload of the failed in-place move and updatePayloads[1] is the soft-delete payload. The current assertion only checks updatePayloads[1] (line 220), which is logically correct. Consider also asserting on updatePayloads[0] to verify that the in-place move attempt actually happened, making the test's intent more explicit.
Code Review Summary
After a thorough, multi-perspective code review, the implementation quality of this PR is high; no serious issues requiring fixes were found.
PR Size: XL
- Lines changed: 1925 (1640 additions + 285 deletions)
- Files changed: 15
Recommendation: this PR is large; after merging, closely monitor transaction performance and database connection pool usage in production.
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 0 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - Excellent (7 new test files, integration + unit tests)
- Code clarity - Good
Key Strengths
- Transaction integrity ✅
  - createProvider and updateProvider are correctly wrapped in transactions
  - Ensures atomicity of vendor resolution, provider updates, and endpoint sync
- Error handling ✅
  - All tryDeleteProviderVendorIfEmpty call sites are wrapped in try-catch
  - Vendor cleanup failures do not block the main operation (graceful degradation)
- Concurrency safety ✅
  - Uses onConflictDoNothing to handle concurrent inserts
  - Uses savepoints to handle unique constraint violations
  - Integration tests verify race-condition scenarios
- Type safety ✅
  - Adds the QueryExecutor type for transaction-compatible database operations
  - Removes an any usage (src/repository/provider.ts:881)
  - Adds the isUniqueViolationError helper
- Test coverage ✅
  - 7 new test files covering all edge cases
  - Integration tests verifying race conditions
  - Unit tests covering transaction atomicity and endpoint sync logic
Architecture Highlights
Core improvement: the syncProviderEndpointOnProviderEdit function implements intelligent endpoint migration (a schematic decision sketch follows this list):
- In-place update: same vendor+type, only the URL changed → update the existing endpoint
- Soft-delete + create: different URL → soft-delete the old endpoint, create a new one
- Revive: the new URL matches an already soft-deleted endpoint → restore it
- Conflict handling: uses savepoints to gracefully handle unique constraint violations
- Circuit breaker reset: resets circuit-breaker state when the endpoint URL changes
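Schematically, and only as an illustration, the migration modes can be read as a small decision table. The predicates below are hypothetical and the action union is a subset of the PR's ProviderEndpointSyncAction values; the real function also loads rows, handles unique conflicts via savepoints, and soft-deletes the previous endpoint when it is safe to do so.

```ts
type SyncActionSubset =
  | "noop"
  | "updated-previous-in-place"
  | "kept-previous-and-revived-next"
  | "kept-previous-and-created-next";

// Hypothetical decision helper summarizing the migration modes listed above.
function chooseSyncAction(state: {
  urlChanged: boolean;
  sameVendorAndType: boolean;
  previousStillReferenced: boolean;
  nextExistsSoftDeleted: boolean;
}): SyncActionSubset {
  if (!state.urlChanged) return "noop";
  if (state.sameVendorAndType && !state.previousStillReferenced) {
    return "updated-previous-in-place"; // same vendor+type, only the URL moved
  }
  if (state.nextExistsSoftDeleted) {
    return "kept-previous-and-revived-next"; // new URL matches a soft-deleted endpoint
  }
  return "kept-previous-and-created-next"; // otherwise create the next endpoint
}
```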
Automated review by Claude AI
Code Review Summary
No significant issues identified in this PR.
PR Size: XL
- Lines changed: 1925
- Files changed: 15
- Split suggestion: consider splitting into (1) repository transactional changes, (2) action-layer wiring, (3) tests/config
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - Adequate
- Code clarity - Good
Automated review by Codex AI
- Reviewed PR #730 (fix(provider): wrap endpoint sync in transactions to prevent race conditions).
- PR size computed as XL (1640 additions + 285 deletions = 1925 lines; 15 files) and applied label size/XL.
- Submitted a PR review summary comment via gh pr review --comment (no inline comments posted since no diff-scoped issues met the reporting threshold).
Avoid running endpoint circuit reset side effects inside DB transactions to prevent rollback inconsistency. Run resets only after commit and add regression tests for deferred reset behavior in helper and provider update flows.
Additional Comments (1)

Invalid URL now throws (src/repository/provider-endpoints.ts, lines 714-716)

`ensureProviderEndpointExistsForUrl()` changed from returning `false` on blank/invalid `url` to throwing (e.g. `throw new Error("[ProviderEndpointEnsure] url is required")`). Existing callers outside this PR that rely on the previous boolean contract will now crash instead of treating it as a no-op. Either restore the old return-`false` behavior (and let transaction-wrapped callers decide to throw), or update *all* call sites to handle these exceptions consistently.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/repository/provider-endpoints.ts`:
- Around line 1097-1104: The returned action incorrectly labels the outcome as
"kept-previous-and-created-next" when ensureNextEndpointActive() returns "noop"
(concurrent transaction already created an active next endpoint); update the
branching after const ensureResult = await ensureNextEndpointActive() to
explicitly handle ensureResult === "noop" and return a distinct action (e.g.,
"kept-previous-and-kept-next") instead of the "created-next" label, and also
update the ProviderEndpointSyncAction type to include the new action value so
callers and logs reflect the true outcome; ensure the code paths that use
previousEndpoint, nextEndpoint, ensureNextEndpointActive(), and the returned
action are adjusted accordingly.
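A compact sketch of the requested change; the surrounding function is elided and the action union shown here is only the subset relevant to this branch.

```ts
type ProviderEndpointSyncAction =
  | "kept-previous-and-created-next"
  | "kept-previous-and-revived-next"
  | "kept-previous-and-kept-next"; // new value: a concurrent tx already ensured the next endpoint

// Sketch: three-way branch so callers and logs reflect the true outcome.
async function labelKeptPreviousOutcome(
  ensureNextEndpointActive: () => Promise<"created-next" | "revived-next" | "noop">
): Promise<ProviderEndpointSyncAction> {
  const ensureResult = await ensureNextEndpointActive();
  if (ensureResult === "created-next") return "kept-previous-and-created-next";
  if (ensureResult === "revived-next") return "kept-previous-and-revived-next";
  // "noop": nothing was created or revived in this transaction, so say so.
  return "kept-previous-and-kept-next";
}
```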
🧹 Nitpick comments (3)
src/repository/provider-endpoints.ts (2)
21-25: QueryExecutor lacks a query method, which may be insufficient in some scenarios.

QueryExecutor only Picks select | insert | update | delete | execute. If a future caller needs db.query (Drizzle's relational query API) within a tx context, the type will reject it. The current usage looks sufficient; noting it for the record.
1047-1067: The savepoint type-probing approach is fragile; consider documenting it in a comment.

The code asserts as QueryExecutor & { transaction?: ... } and then checks typeof ... === "function" to detect savepoint support. This depends on an implementation detail of Drizzle's transaction object (nested .transaction() maps to SAVEPOINT). If a future Drizzle version changes the API, this will silently fall back to the else branch (no savepoint support → unique constraint violations are thrown directly). The current approach works; recommend recording this dependency in a comment to ease troubleshooting during future upgrades.
src/repository/provider.ts (1)
86-145: The .returning() column list is heavily duplicated between createProvider and updateProvider.

The .returning() blocks of createProvider (lines 92-145) and updateProvider (lines 535-588) are nearly identical, and they also overlap heavily with the .select() column lists in findProviderList, findAllProvidersFresh, and findProviderById. Consider extracting a shared constant (e.g. providerSelectColumns) to reduce maintenance burden and the risk of missing newly added columns. Extracted shared column list:

```ts
// Defined at the top of the file or in a _shared module
const providerColumns = {
  id: providers.id,
  name: providers.name,
  url: providers.url,
  key: providers.key,
  providerVendorId: providers.providerVendorId,
  // ... remaining columns
  createdAt: providers.createdAt,
  updatedAt: providers.updatedAt,
  deletedAt: providers.deletedAt,
};

// Usage
const [provider] = await tx
  .insert(providers)
  .values({ ...dbData, providerVendorId })
  .returning(providerColumns);
```
…tion label

When ensureNextEndpointActive() returns "noop" (concurrent transaction already created an active next endpoint), the action was incorrectly labelled "kept-previous-and-created-next". Add a new "kept-previous-and-kept-next" action to ProviderEndpointSyncAction and use a three-way branch so callers and logs reflect the true outcome.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```ts
export async function syncProviderEndpointOnProviderEdit(
  input: SyncProviderEndpointOnProviderEditInput,
  options?: { tx?: QueryExecutor }
```
Previous URL not validated
syncProviderEndpointOnProviderEdit() validates nextUrl but not previousUrl. If previousUrl is empty/invalid (e.g., legacy bad data), loadEndpoint()/reference checks will query with an empty string and the function can silently fall back to creating/reviving the nextUrl endpoint without ever cleaning up the old endpoint row. This breaks the “prevent stale/orphaned endpoints” goal for providers that already contain invalid url values.
Consider validating previousUrl similarly (or explicitly treating invalid previous URLs as a distinct case and ensuring old endpoints can still be cleaned up).
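A minimal sketch of symmetric validation; the error prefix follows the one quoted in this review, and the helper name and normalization details are assumptions rather than the PR's code.

```ts
// Hypothetical helper: validate both URLs up front so legacy rows with blank or
// malformed previousUrl values fail loudly (or can be routed to an explicit cleanup
// path) instead of silently skipping old-endpoint cleanup.
function normalizeUrlOrThrow(raw: string | null | undefined, field: "previousUrl" | "nextUrl"): string {
  const trimmed = raw?.trim() ?? "";
  if (!trimmed) {
    throw new Error(`[ProviderEndpointSync] ${field} is required`);
  }
  try {
    // Throws on malformed URLs; the exact normalization used by the PR may differ.
    return new URL(trimmed).toString();
  } catch {
    throw new Error(`[ProviderEndpointSync] ${field} is not a valid URL: ${trimmed}`);
  }
}
```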
Path: src/repository/provider-endpoints.ts, lines 827-829

```ts
const updatePreviousEndpointInPlace = async (executor: QueryExecutor): Promise<void> => {
  await executor
    .update(providerEndpoints)
    .set({
      vendorId: input.vendorId,
      providerType: input.providerType,
      url: nextUrl,
      deletedAt: null,
      isEnabled: true,
```
No rowcount check
In updatePreviousEndpointInPlace(), the .update(...).where(id=...) result isn’t checked. If the row was concurrently hard-deleted or otherwise not updated, movedInPlace still becomes true and the function returns updated-previous-in-place, but the endpoint row may not actually have moved/been enabled. This can leave the provider pointing at nextUrl while the endpoint table wasn’t updated as assumed.
Consider using .returning({ id: ... }) (or checking affected rows) before setting movedInPlace = true.
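A sketch of the suggested guard, using `.returning()` to confirm a row was actually touched. Table and column names follow the diff context; the schema import path, the QueryExecutor export, and the fallback behavior are assumptions.

```ts
import { eq } from "drizzle-orm";
import { providerEndpoints } from "@/drizzle/schema"; // import path assumed
import type { QueryExecutor } from "@/repository/provider-endpoints"; // export assumed

async function movePreviousEndpointInPlace(
  executor: QueryExecutor,
  previousEndpointId: number,
  next: { vendorId: number; providerType: string; url: string }
): Promise<boolean> {
  const moved = await executor
    .update(providerEndpoints)
    .set({
      vendorId: next.vendorId,
      providerType: next.providerType,
      url: next.url,
      deletedAt: null,
      isEnabled: true,
    })
    .where(eq(providerEndpoints.id, previousEndpointId))
    .returning({ id: providerEndpoints.id });

  // Only report an in-place move if a row was actually updated; a concurrent hard
  // delete would otherwise be mislabelled as "updated-previous-in-place".
  return moved.length > 0;
}
```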
Path: src/repository/provider-endpoints.ts, lines 1028-1036

```ts
const updateResult = await db.transaction(async (tx) => {
  let previousVendorId: number | null = null;
  let previousUrl: string | null = null;
  let previousProviderType: Provider["providerType"] | null = null;
  let endpointCircuitResetId: number | null = null;

  if (shouldSyncEndpoint) {
    const [current] = await tx
      .select({
```
Stale pre-read in tx
updateProvider() reads the “current” provider row and later updates it, but the initial SELECT doesn’t lock the provider row (FOR UPDATE). If two updates race, both transactions can read the same previousUrl/previousVendorId and then each run endpoint sync based on stale previous state, causing extra soft-deletes or missed cleanup.
If correctness under concurrent edits is a goal here, the initial read should lock the provider row (or the update should use a single UPDATE ... RETURNING that also returns the previous values where possible) so only one edit computes the migration from a given previous state.
Path: src/repository/provider.ts, lines 491-499
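If serializing concurrent edits is desired, the pre-read could lock the provider row. A sketch using drizzle's `.for("update")` (SELECT ... FOR UPDATE), assuming the installed drizzle version exposes it on the select builder; the schema import path and the selected columns are taken from the diff context above.

```ts
import { eq } from "drizzle-orm";
import { db } from "@/drizzle/db";
import { providers } from "@/drizzle/schema"; // import path assumed

type TransactionExecutor = Parameters<Parameters<typeof db.transaction>[0]>[0];

// Sketch: a second concurrent updateProvider() blocks here until the first
// transaction commits, so each edit computes its migration from fresh state.
async function readProviderForUpdate(tx: TransactionExecutor, providerId: number) {
  const rows = await tx
    .select({
      url: providers.url,
      providerVendorId: providers.providerVendorId,
      providerType: providers.providerType,
    })
    .from(providers)
    .where(eq(providers.id, providerId))
    .limit(1)
    .for("update"); // SELECT ... FOR UPDATE

  return rows[0] ?? null;
}
```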
Additional Comments (1)

Silent date fallback (src/repository/provider-endpoints.ts, lines 56-60)

`toDate()` returns `new Date()` when it can't parse the DB value. That will silently turn invalid timestamps into "now", which can corrupt semantics for `createdAt/updatedAt/deletedAt` in callers (e.g., a malformed `deletedAt` becomes non-null "now" and the endpoint looks freshly deleted). Since this transformer is used broadly for provider endpoints/vendors, it can produce incorrect state rather than failing fast. It's safer to throw (or at least return a deterministic sentinel like `new Date(0)`) when the DB value is not parseable.
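A sketch of the stricter transformer the comment asks for; whether to throw or return a sentinel is a policy choice for the author, and the function name here is hypothetical.

```ts
// Fail fast instead of silently turning unparseable DB timestamps into "now".
function toDateStrict(value: unknown): Date {
  if (value instanceof Date) return value;
  if (typeof value === "string" || typeof value === "number") {
    const parsed = new Date(value);
    if (!Number.isNaN(parsed.getTime())) return parsed;
  }
  // A malformed deletedAt must not make an endpoint look freshly deleted.
  throw new Error(`[ProviderEndpointTransform] unparseable timestamp: ${String(value)}`);
}
```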
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/repository/provider-endpoints.ts`:
- Around line 1048-1068: The else branch calling
updatePreviousEndpointInPlace(tx) when tx lacks .transaction() can cause the
whole transaction to become aborted on a unique constraint violation; wrap that
call in a try/catch, catch the error and if isUniqueViolationError(error) throw
a new/relevant error (or rethrow with added context) so callers immediately see
the unique-violation abort risk instead of silently leaving the transaction in
an aborted state; reference executorWithSavepoint, tx,
updatePreviousEndpointInPlace, transaction, and isUniqueViolationError to locate
and update the logic.
- Around line 900-997: ensureNextEndpointActive currently revives soft-deleted
endpoints but does not clear stale probe metadata or reset the circuit breaker;
update the revive branches in ensureNextEndpointActive (the tx.update calls that
set deletedAt:null / isEnabled:true) to also clear all lastProbe* fields (e.g.,
lastProbeOk, lastProbeMs/latency, lastProbeAt) and any circuit-breaker state
fields, and ensure you schedule the same post-transaction circuit-breaker reset
used elsewhere (mirror the post-commit reset logic used in the in-place move
code path) so the revived endpoint starts with fresh probe state; modify the
tx.update calls that reference providerEndpoints and invoke the same
reset-after-commit helper used elsewhere.
🧹 Nitpick comments (4)
src/repository/provider-endpoints.ts (2)
27-54: isUniqueViolationError is robust; consider narrowing the string match.

The current implementation checks code, message, and the nested cause structure, covering the various shapes of PostgreSQL errors wrapped by Drizzle ORM, which is sound. One small suggestion: message.includes("duplicate key value") is a hard-coded match against PostgreSQL's default English error message. If the database locale is non-English, that match fails. In practice the code === "23505" check hits first, so the message match is only a fallback and the risk is low.
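For reference, a sketch that matches the behavior described above (SQLSTATE first, message match only as a locale-dependent fallback, and a bounded walk over `cause`); the real implementation may differ in details.

```ts
function isUniqueViolationErrorSketch(error: unknown, depth = 0): boolean {
  if (!error || typeof error !== "object" || depth > 3) return false;
  const candidate = error as { code?: unknown; message?: unknown; cause?: unknown };
  if (candidate.code === "23505") return true; // PostgreSQL unique_violation
  if (typeof candidate.message === "string" && candidate.message.includes("duplicate key value")) {
    return true; // fallback only: assumes the server's default English locale
  }
  return isUniqueViolationErrorSketch(candidate.cause, depth + 1); // drizzle may wrap the driver error
}
```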
827-850: The syncProviderEndpointOnProviderEdit function body is too long (~340 lines) and contains several nested closures.

The runInTx closure defines four inner functions (loadEndpoint, hasActiveReferencesOnPreviousUrl, ensureNextEndpointActive, updatePreviousEndpointInPlace) plus multi-branch logic, giving it high cognitive complexity. Consider extracting ensureNextEndpointActive and updatePreviousEndpointInPlace as file-level private functions (taking tx and the necessary parameters) to reduce nesting and make them independently testable.

tests/unit/repository/provider-endpoint-sync-helper.test.ts (2)
5-55: The mock design is based on an ordered queue, tightly coupled to the production code's query order.

createTxMock consumes select results in order via queue.shift(); any change to the internal query order of syncProviderEndpointOnProviderEdit (e.g. an extra select or a reordering) will silently produce wrong test results rather than an error. Two improvements to consider:

- Make selectLimitMock raise an explicit error when the queue is exhausted (e.g. throw new Error("unexpected select call")), so query-order changes are at least quick to locate.
- Longer term, dispatch results by matching the where arguments rather than relying on a fixed order.

Minimal improvement: error on queue exhaustion

```diff
-  const selectLimitMock = vi.fn(async () => queue.shift() ?? []);
+  const selectLimitMock = vi.fn(async () => {
+    if (queue.length === 0) {
+      throw new Error("selectLimitMock: unexpected select call - queue exhausted");
+    }
+    return queue.shift()!;
+  });
```
1-292: Consider adding test cases for the revive scenario (kept-previous-and-revived-next).

The current tests cover the noop, created-next (implicitly via the in-place move), kept-next, and soft-deleted-previous-and-kept-next actions, but there are no direct tests for the revived-next related actions (such as kept-previous-and-revived-next and soft-deleted-previous-and-revived-next). These paths restore soft-deleted endpoints and are part of the core business logic.
```ts
const ensureNextEndpointActive = async (options?: {
  reactivateDisabled?: boolean;
}): Promise<"created-next" | "revived-next" | "noop"> => {
  const reactivateDisabled = options?.reactivateDisabled ?? true;
  const nextEndpoint = await loadEndpoint({
    vendorId: input.vendorId,
    providerType: input.providerType,
    url: nextUrl,
  });

  if (!nextEndpoint) {
    const inserted = await tx
      .insert(providerEndpoints)
      .values({
        vendorId: input.vendorId,
        providerType: input.providerType,
        url: nextUrl,
        label: null,
        updatedAt: now,
      })
      .onConflictDoNothing({
        target: [
          providerEndpoints.vendorId,
          providerEndpoints.providerType,
          providerEndpoints.url,
        ],
      })
      .returning({ id: providerEndpoints.id });

    if (inserted[0]) {
      return "created-next";
    }

    const concurrentEndpoint = await loadEndpoint({
      vendorId: input.vendorId,
      providerType: input.providerType,
      url: nextUrl,
    });

    if (!concurrentEndpoint) {
      throw new Error("[ProviderEndpointSync] failed to load next endpoint after conflict");
    }

    if (concurrentEndpoint.deletedAt !== null) {
      await tx
        .update(providerEndpoints)
        .set({
          deletedAt: null,
          isEnabled: true,
          updatedAt: now,
        })
        .where(eq(providerEndpoints.id, concurrentEndpoint.id));

      return "revived-next";
    }

    if (reactivateDisabled && !concurrentEndpoint.isEnabled) {
      await tx
        .update(providerEndpoints)
        .set({
          isEnabled: true,
          updatedAt: now,
        })
        .where(eq(providerEndpoints.id, concurrentEndpoint.id));

      return "revived-next";
    }

    return "noop";
  }

  if (nextEndpoint.deletedAt !== null) {
    await tx
      .update(providerEndpoints)
      .set({
        deletedAt: null,
        isEnabled: true,
        updatedAt: now,
      })
      .where(eq(providerEndpoints.id, nextEndpoint.id));

    return "revived-next";
  }

  if (reactivateDisabled && !nextEndpoint.isEnabled) {
    await tx
      .update(providerEndpoints)
      .set({
        isEnabled: true,
        updatedAt: now,
      })
      .where(eq(providerEndpoints.id, nextEndpoint.id));

    return "revived-next";
  }

  return "noop";
};
```
ensureNextEndpointActive does not reset the probe snapshot fields or circuit-breaker state when reviving a soft-deleted endpoint.

Compare the in-place move logic (lines 1031-1043), which clears all lastProbe* fields and resets the circuit breaker after the transaction commits. In the revive paths (lines 943-954 and 971-982), ensureNextEndpointActive only sets deletedAt: null and isEnabled: true; it neither clears possibly stale probe data nor triggers a circuit-breaker reset.

This means a revived soft-deleted endpoint may carry old probe state (e.g. lastProbeOk: false) and a possibly open circuit breaker, causing the routing layer to treat it as unhealthy and skip it.

If this is intentional (preserving historical data for reference), add a comment saying so; otherwise reset the relevant state on revive as well.
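If the reset is wanted, the revive branches could mirror the in-place move path. A sketch; the lastProbe* column names, the schema import path, and the QueryExecutor export are assumptions based on this review's wording.

```ts
import { eq } from "drizzle-orm";
import { providerEndpoints } from "@/drizzle/schema"; // import path assumed
import type { QueryExecutor } from "@/repository/provider-endpoints"; // export assumed

async function reviveEndpointWithFreshState(tx: QueryExecutor, endpointId: number, now: Date) {
  await tx
    .update(providerEndpoints)
    .set({
      deletedAt: null,
      isEnabled: true,
      // Clear the probe snapshot so routing does not skip the revived endpoint
      // because of pre-deletion health data (column names assumed).
      lastProbeOk: null,
      lastProbeLatencyMs: null,
      lastProbeAt: null,
      updatedAt: now,
    })
    .where(eq(providerEndpoints.id, endpointId));
  // A circuit-breaker reset for this endpoint should then be scheduled after commit,
  // mirroring the deferred reset used by the in-place move path.
}
```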
```ts
let movedInPlace = false;
const executorWithSavepoint = tx as QueryExecutor & {
  transaction?: <T>(runInTx: (nestedTx: TransactionExecutor) => Promise<T>) => Promise<T>;
};

if (typeof executorWithSavepoint.transaction === "function") {
  try {
    await executorWithSavepoint.transaction(async (nestedTx) => {
      await updatePreviousEndpointInPlace(nestedTx);
    });
    movedInPlace = true;
  } catch (error) {
    if (!isUniqueViolationError(error)) {
      throw error;
    }
  }
} else {
  // No savepoint support means we cannot safely continue after unique violations.
  await updatePreviousEndpointInPlace(tx);
  movedInPlace = true;
}
```
In the savepoint fallback path, running the update directly when savepoints are unsupported leaves the transaction unusable after a unique constraint violation.

The else branch at lines 1064-1067 executes updatePreviousEndpointInPlace(tx) directly when tx does not support .transaction(). If a unique constraint violation occurs there, PostgreSQL marks the current transaction as aborted ("current transaction is aborted, commands ignored until end of transaction block"); every subsequent SQL statement fails and the whole transaction ultimately rolls back.

The comment acknowledges this ("No savepoint support means we cannot safely continue after unique violations"), but callers may not be aware of this implicit transaction-abort risk. In practice Drizzle ORM's transaction object usually supports .transaction() (i.e. savepoints), so this branch should rarely trigger.
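If the fallback branch stays, it could at least surface the abort explicitly, as the actionable comment above suggests. A sketch of a possible else-branch body, written as a drop-in for the quoted code (names as in the diff above; the exact error message is an assumption):

```ts
// Without savepoint support a unique violation has already poisoned the outer
// transaction, so convert it into an explicit, contextual error instead of letting
// later statements fail with "current transaction is aborted".
try {
  await updatePreviousEndpointInPlace(tx);
  movedInPlace = true;
} catch (error) {
  if (isUniqueViolationError(error)) {
    throw new Error(
      "[ProviderEndpointSync] unique violation during in-place move without savepoint support; retry the provider edit",
      { cause: error }
    );
  }
  throw error;
}
```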
* fix(proxy): add 'cannot be modified' error detection to thinking signature rectifier Extend the thinking signature rectifier to detect and handle the Anthropic API error when thinking/redacted_thinking blocks have been modified from their original response. This error occurs when clients inadvertently modify these blocks in multi-turn conversations. The rectifier will now remove these blocks and retry the request, similar to how it handles other thinking-related signature errors. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore(deps): bump jspdf in the npm_and_yarn group across 1 directory Bumps the npm_and_yarn group with 1 update in the / directory: [jspdf](https://github.com/parallax/jsPDF). Updates `jspdf` from 3.0.4 to 4.1.0 - [Release notes](https://github.com/parallax/jsPDF/releases) - [Changelog](https://github.com/parallax/jsPDF/blob/master/RELEASE.md) - [Commits](parallax/jsPDF@v3.0.4...v4.1.0) --- updated-dependencies: - dependency-name: jspdf dependency-version: 4.1.0 dependency-type: direct:production dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> * fix: Hot-reload cache invalidation for Request Filters and Sensitive Words (#710) * fix: hot-reload request filters via globalThis singleton pattern EventEmitter and RequestFilterEngine now use globalThis caching to ensure the same instance is shared across different Next.js worker contexts. This fixes the issue where filter changes required Docker restart. Added diagnostic logging for event subscription and propagation. * fix(redis): wait for subscriber connection ready before subscribe - ensureSubscriber now returns Promise<Redis>, waits for 'ready' event - subscribeCacheInvalidation returns null on failure instead of noop - RequestFilterEngine checks cleanup !== null before logging success - Fixes false "Subscribed" log when Redis connection fails * feat(sensitive-words): add hot-reload via Redis pub/sub Enable real-time cache invalidation for sensitive words detector, matching the pattern used by request-filter-engine and error-rule-detector. * fix(redis): harden cache invalidation subscriptions Ensure sensitive-words CRUD emits update events so hot-reload propagates across workers. Roll back failed pub/sub subscriptions, add retry/timeout coverage, and avoid sticky provider-cache subscription state. * fix(codex): bump default User-Agent fallback Update the hardcoded Codex UA used when requests lack an effective user-agent (e.g. filtered out). Keep unit tests in sync with the new default. * fix(redis): resubscribe cache invalidation after reconnect Clear cached subscription state on disconnect and resubscribe on ready so cross-worker cache invalidation survives transient Redis reconnects. Add unit coverage, avoid misleading publish logs, track detector cleanup handlers, and translate leftover Russian comments to English. * fix(sensitive-words): use globalThis singleton detector Align SensitiveWordDetector with existing __CCH_* singleton pattern to avoid duplicate instances across module reloads. Extend singleton unit tests to cover the detector. 
* chore: format code (req-fix-dda97fd) * fix: address PR review comments - pubsub.ts: use `once` instead of `on` for ready event to prevent duplicate resubscription handlers on reconnect - forwarder.ts: extract DEFAULT_CODEX_USER_AGENT constant - provider-cache.ts: wrap subscribeCacheInvalidation in try/catch - tests: use exported constant instead of hardcoded UA string * fix(redis): resubscribe across repeated reconnects Ensure pub/sub resubscribe runs on every reconnect, extend unit coverage, and keep emitRequestFiltersUpdated resilient when logger import fails. --------- Co-authored-by: John Doe <johndoe@example.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat(logs): make cost column toggleable with improved type safety (#715) close #713 * fix(proxy): add OpenAI chat completion format support in usage extraction (#705) (#716) The `extractUsageMetrics` function was missing support for OpenAI chat completion format fields (`prompt_tokens`/`completion_tokens`), causing token statistics to not be recorded for OpenAI-compatible providers. Changes: - Add `prompt_tokens` -> `input_tokens` mapping - Add `completion_tokens` -> `output_tokens` mapping - Preserve priority: Claude > Gemini > OpenAI format - Add 5 unit tests for OpenAI format handling Closes #705 Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix(currency): respect system currencyDisplay setting in UI (#717) Fixes #678 - Currency display unit configuration was not applied. Root cause: - `users-page-client.tsx` hardcoded `currencyCode="USD"` - `UserLimitBadge` and `LimitStatusIndicator` had hardcoded `unit="$"` default - `big-screen/page.tsx` used hardcoded "$" in multiple places Changes: - Add `getCurrencySymbol()` helper function to currency.ts - Fetch system settings in `users-page-client.tsx` and pass to table - Pass `currencySymbol` from `user-key-table-row.tsx` to limit badges - Remove hardcoded "$" defaults from badge components - Update big-screen page to fetch settings and use dynamic symbol - Add unit tests for `getCurrencySymbol` Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * feat(gemini): add Google Search web access preference for Gemini providers (#721) * feat(gemini): add Google Search web access preference for Gemini providers Add provider-level preference for Gemini API type providers to control Google Search (web access) tool injection: - inherit: Follow client request (default) - enabled: Force inject googleSearch tool into request - disabled: Force remove googleSearch tool from request Changes: - Add geminiGoogleSearchPreference field to provider schema - Add GeminiGoogleSearchPreference type and validation - Implement applyGeminiGoogleSearchOverride with audit trail - Add UI controls in provider form (Gemini Overrides section) - Add i18n translations for 5 languages (en, zh-CN, zh-TW, ja, ru) - Integrate override logic in proxy forwarder for Gemini requests - Add 22 unit tests for the override logic Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(gemini): address code review feedback - Use explicit else-if for disabled preference check (gemini-code-assist) - Use i18n for SelectValue placeholder instead of hardcoded string (coderabbitai) - Sync overridden body back to session.request.message for log consistency (coderabbitai) - Persist Gemini special settings immediately, matching Anthropic pattern (coderabbitai) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(gemini): use strict types for provider config and audit - 
Narrow preference type to "enabled" | "disabled" (exclude unreachable "inherit") - Use ProviderType and GeminiGoogleSearchPreference types instead of string Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix(api): 透传 /api/actions 认证会话以避免 getUsers 返回空数据 (#720) * fix(api): 透传 /api/actions 认证会话以避免 getUsers 返回空数据 * fix(auth): 让 scoped session 继承 allowReadOnlyAccess 语义并支持内部降权校验 * chore: format code (dev-93585fa) * fix: bound SessionTracker active_sessions zsets by env TTL (#718) * fix(session): bound active_sessions zsets by env ttl Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(rate-limit): pass session ttl to lua cleanup Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(session): validate SESSION_TTL env and prevent ZSET leak on invalid values - Add input validation for SESSION_TTL (reject NaN, 0, negative; default 300) - Guard against invalid TTL in Lua script to prevent clearing all sessions - Use dynamic EXPIRE based on SESSION_TTL instead of hardcoded 3600s - Add unit tests for TTL validation and dynamic expiry behavior --------- Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(provider): stop standard-path fallback to legacy provider url * feat(providers): expose vendor endpoint pools in settings UI (#719) * feat(providers): add endpoint status mapping * feat(providers): add endpoint pool hover * feat(providers): show vendor endpoints in list rows * feat(providers): extract vendor endpoint CRUD table * chore(i18n): add provider endpoint UI strings * fix(providers): integrate endpoint pool into provider form * fix(provider): wrap endpoint sync in transactions to prevent race conditions (#730) * fix(provider): wrap provider create/update endpoint sync in transactions Provider create and update operations now run vendor resolution and endpoint sync inside database transactions to prevent race conditions that could leave orphaned or inconsistent endpoint rows. Key changes: - createProvider: wrap vendor + insert + endpoint seed in a single tx - updateProvider: wrap vendor + update + endpoint sync in a single tx - Add syncProviderEndpointOnProviderEdit for atomic URL/type/vendor migration with in-place update, soft-delete, and conflict handling - Vendor cleanup failures degrade to warnings instead of propagating - Add comprehensive unit and integration tests for sync edge cases Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(provider): defer endpoint circuit reset until transaction commit Avoid running endpoint circuit reset side effects inside DB transactions to prevent rollback inconsistency. Run resets only after commit and add regression tests for deferred reset behavior in helper and provider update flows. * fix(provider): distinguish noop from created-next in endpoint sync action label When ensureNextEndpointActive() returns "noop" (concurrent transaction already created an active next endpoint), the action was incorrectly labelled "kept-previous-and-created-next". Add a new "kept-previous-and-kept-next" action to ProviderEndpointSyncAction and use a three-way branch so callers and logs reflect the true outcome. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments from PR #731 - fix(auth): prevent scoped session access widening via ?? -> && guard - fix(i18n): standardize zh-CN provider terminology to "服务商" - fix(i18n): use consistent Russian translations for circuit status - fix(i18n): replace raw formatDistanceToNow with locale-aware RelativeTime - fix(gemini): log warning for unknown google search preference values - fix(error-rules): check subscribeCacheInvalidation return value - fix(test): correct endpoint hover sort test to assert URLs not labels Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: export auth session storage and fix test mock types - Export authSessionStorage from auth-session-storage.node.ts to prevent undefined on named imports; remove duplicate declare global block - Fix mockEndpoints in provider-endpoint-hover test: remove nonexistent lastOk/lastLatencyMs fields, add missing lastProbe* fields, use Date objects for createdAt/updatedAt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: miraserver <20286838+miraserver@users.noreply.github.com> Co-authored-by: John Doe <johndoe@example.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: 泠音 <im@ling.plus> Co-authored-by: Longlone <41830147+WAY29@users.noreply.github.com> Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Summary
This PR wraps provider create/update operations in database transactions to prevent race conditions that could leave orphaned or inconsistent endpoint rows in the provider_endpoints table.

Problem
Related Issues:

The provider_endpoints table accumulates stale records when provider URLs are edited, never cleaning up old endpoints.

Root Cause:

When editing a provider's URL or type, the system previously executed these operations sequentially, without transaction protection. This created race conditions in which concurrent edits or mid-sequence failures could leave stale or orphaned endpoint rows behind.
Solution
Core Changes
1. Transaction-wrapped provider operations (src/repository/provider.ts)
   - createProvider() now wraps vendor resolution + provider insert + endpoint seeding in a single transaction
   - updateProvider() now wraps vendor resolution + provider update + endpoint sync in a single transaction
2. New atomic endpoint sync function (src/repository/provider-endpoints.ts)
   - syncProviderEndpointOnProviderEdit() for intelligent endpoint migration when provider URL/type changes
3. Transaction-aware helper functions
   - getOrCreateProviderVendorIdFromUrls() now accepts an optional { tx } parameter
   - ensureProviderEndpointExistsForUrl() now accepts an optional { tx } parameter
   - tryDeleteProviderVendorIfEmpty() now accepts an optional { tx } parameter
4. Graceful vendor cleanup degradation
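The optional-transaction pattern in item 3 can be illustrated as below. This is a simplified sketch: the real tryDeleteProviderVendorIfEmpty also checks endpoints and soft-deleted rows per this description, and the schema import path and type export are assumptions.

```ts
import { eq } from "drizzle-orm";
import { db } from "@/drizzle/db";
import { providers, providerVendors } from "@/drizzle/schema"; // import path assumed
import type { QueryExecutor } from "@/repository/provider-endpoints"; // export assumed

// Helpers accept { tx } and fall back to the shared db handle, so the same code
// runs standalone or inside an enclosing transaction.
async function tryDeleteProviderVendorIfEmptySketch(
  vendorId: number,
  options?: { tx?: QueryExecutor }
): Promise<boolean> {
  const executor = options?.tx ?? db;

  const stillReferenced = await executor
    .select({ id: providers.id })
    .from(providers)
    .where(eq(providers.providerVendorId, vendorId))
    .limit(1);

  if (stillReferenced.length > 0) return false;

  // Safe to delete in this simplified view: no provider references the vendor.
  await executor.delete(providerVendors).where(eq(providerVendors.id, vendorId));
  return true;
}
```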
Supporting Changes
Type safety improvements:
- QueryExecutor type for transaction-compatible database operations
- isUniqueViolationError() helper for detecting PostgreSQL unique constraint violations

Test coverage:
syncProviderEndpointOnProviderEditcovering all edge cases (noop, in-place update, soft-delete + create, revive, conflict)createProvidertransaction atomicityChanges by File
Core Logic
src/repository/provider.ts(+231/-189) - Wrap create/update in transactionssrc/repository/provider-endpoints.ts(+454/-62) - AddsyncProviderEndpointOnProviderEdit, transaction supportActions Layer
src/actions/providers.ts(+24/-6) - Graceful vendor cleanup error handlingsrc/actions/provider-endpoints.ts(+9/-1) - Graceful vendor cleanup error handlingTests
tests/integration/provider-endpoint-sync-race.test.ts(+147) - Race condition integration testtests/unit/repository/provider-create-transaction.test.ts(+187) - Transaction atomicity teststests/unit/repository/provider-endpoint-sync-helper.test.ts(+227) - Sync function edge casestests/unit/repository/provider-endpoint-sync-on-edit.test.ts(+219) - Edit scenariostests/unit/repository/provider-endpoints.test.ts(+49/-20) - Updated for transactionstests/unit/actions/providers.test.ts(+59) - Action layer endpoint sync teststests/unit/actions/providers-recluster.test.ts(+15/-7) - Recluster transaction supporttests/unit/lib/rate-limit/*.test.ts(+18) - Mock updates for timezone resolutionConfiguration
vitest.integration.config.ts(+1) - Integration test configBreaking Changes
None. All changes are backward compatible.
Test Plan
Automated Tests
syncProviderEndpointOnProviderEditcovering all edge cases (noop, in-place update, soft-delete + create, revive, conflict)createProvidertransaction atomicityManual Testing (if desired)
https://api-a.example.com/v1/messagesprovider_endpointstablehttps://api-b.example.com/v1/messageshttps://api-a.example.com/v1/messagesChecklist
Description enhanced by Claude AI
Greptile Overview
Greptile Summary
provider_vendorsand seeds/syncsprovider_endpoints, aiming to eliminate races and partial writes.syncProviderEndpointOnProviderEdit()to atomically migrate endpoints across URL/vendor/type changes (create/revive/update-in-place/soft-delete paths) with unique-conflict handling.{ tx }executors, and downgrades vendor cleanup failures to warnings in actions/repository.Confidence Score: 3/5
Important Files Changed
syncProviderEndpointOnProviderEdit()for endpoint migration; review findings include silent date fallback, missing previousUrl validation, and no rowcount check on in-place move update.{ tx }.Sequence Diagram
```mermaid
sequenceDiagram
  autonumber
  participant UI as Vendor Settings UI
  participant A as actions/providers.ts
  participant PR as repository/provider.ts
  participant DB as Postgres (drizzle)
  participant PE as repository/provider-endpoints.ts
  participant CB as endpoint-circuit-breaker
  UI->>A: editProvider(id, {url/type/websiteUrl})
  A->>PR: updateProvider(id, data)
  PR->>DB: transaction(begin)
  alt needs vendor refresh
    PR->>PE: getOrCreateProviderVendorIdFromUrls(..., {tx})
    PE->>DB: select/insert provider_vendors
  end
  PR->>DB: update providers set ... returning provider
  alt shouldSyncEndpoint
    PR->>PE: syncProviderEndpointOnProviderEdit(prev,next,..., {tx})
    PE->>DB: select previous/next endpoints
    alt in-place move possible
      PE->>DB: update provider_endpoints (move + clear probe snapshot)
      PE-->>PR: {action, resetCircuitEndpointId}
    else create/revive next
      PE->>DB: insert on conflict do nothing
      PE->>DB: update deletedAt/isEnabled if revive
      PE-->>PR: {action}
    end
  end
  PR->>DB: transaction(commit)
  opt resetCircuitEndpointId
    PR->>CB: resetEndpointCircuit(endpointId)
  end
  opt previousVendorIdToCleanup
    PR->>PE: tryDeleteProviderVendorIfEmpty(vendorId)
    PE->>DB: transaction(begin)
    PE->>DB: check active providers/endpoints
    PE->>DB: delete vendor if safe
    PE->>DB: transaction(commit)
  end
  PR-->>A: updated provider
  A-->>UI: success response
```