fix(providers): auto-backfill vendor aggregation for legacy providers #635

Merged
ding113 merged 1 commit into dev from fix/provider-vendor-backfill on Jan 21, 2026

ding113 (Owner) commented Jan 21, 2026

Summary

  • Fix bug where legacy providers were incorrectly grouped under "Unknown Vendor #0" after upgrading
  • Add automatic vendor backfill during startup to aggregate providers by domain
  • Add frontend protection for orphaned providers (vendorId=-1)

Changes

Backend

  • src/repository/provider-endpoints.ts: Add backfillProviderVendorsFromProviders() and deriveDisplayNameFromDomain()
  • src/instrumentation.ts: Integrate vendor backfill into startup flow (production & development)
  • src/drizzle/schema.ts: Remove .notNull() constraint to match actual migration
  • src/repository/_shared/transformers.ts: Change default from ?? 0 to ?? null
  • src/types/provider.ts: Update type to providerVendorId: number | null
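
The transformer change is what stops a missing vendor from turning into the real-looking ID 0, which the UI then showed as "Unknown Vendor #0". A minimal sketch of the difference, assuming a simplified row shape; `ProviderRow`, `legacyVendorId`, and `toProviderVendorId` are illustrative names, not the repository's actual code:

```typescript
// Hypothetical row shape; the real transformer in
// src/repository/_shared/transformers.ts handles many more fields.
interface ProviderRow {
  id: number;
  providerVendorId?: number | null;
}

// Old behavior: a missing vendor collapses into the sentinel value 0.
function legacyVendorId(row: ProviderRow): number {
  return row.providerVendorId ?? 0;
}

// New behavior: a missing vendor stays explicitly absent.
function toProviderVendorId(row: ProviderRow): number | null {
  return row.providerVendorId ?? null;
}

const legacyRow: ProviderRow = { id: 7 };
console.log(legacyVendorId(legacyRow)); // 0
console.log(toProviderVendorId(legacyRow)); // null
```

With `number | null` in the type, callers have to handle the missing case explicitly instead of treating 0 as a vendor.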

Frontend

  • provider-vendor-view.tsx: Add orphaned providers grouping logic, vendorId>0 protection
  • vendor-keys-compact-list.tsx: Add vendorId<=0 early return in urlResolver
  • i18n: Add orphanedProviders key in 5 languages (en, zh-CN, zh-TW, ja, ru)
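
The frontend grouping described above can be sketched as follows. This is an illustration under assumptions, not the component's code: `ProviderLike`, `groupByVendor`, and `canManageVendor` are hypothetical names, and only the -1 sentinel and the `vendorId > 0` guard come from the PR description.

```typescript
// Hypothetical minimal provider shape, for illustration only.
interface ProviderLike {
  id: number;
  providerVendorId: number | null;
}

// Sentinel key for providers without a usable vendor, as stated in the PR.
const ORPHANED_VENDOR_ID = -1;

// Group providers by vendor; null, 0, and negative IDs go to the orphaned bucket.
function groupByVendor(providers: ProviderLike[]): Map<number, ProviderLike[]> {
  const groups = new Map<number, ProviderLike[]>();
  for (const provider of providers) {
    const vendorId = provider.providerVendorId;
    const key = vendorId !== null && vendorId > 0 ? vendorId : ORPHANED_VENDOR_ID;
    const bucket = groups.get(key) ?? [];
    bucket.push(provider);
    groups.set(key, bucket);
  }
  return groups;
}

// Vendor-only UI (delete button, endpoint section) applies to real vendors only.
function canManageVendor(vendorId: number): boolean {
  return vendorId > 0;
}
```

Keeping the orphaned bucket under a key that can never be a database ID means the same `vendorId > 0` check protects every vendor-specific action.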

Test plan

  • TypeScript type check passes
  • Production build succeeds
  • Manual test: Verify legacy providers auto-aggregate on startup
  • Manual test: Verify orphaned providers display correctly in UI

Generated with Claude Code

Greptile Summary

This PR fixes a critical bug where legacy providers were incorrectly grouped under "Unknown Vendor #0" after upgrading. The solution implements automatic vendor backfill during startup to aggregate providers by domain, while adding frontend protections for orphaned providers.

Key Changes:

  • Added backfillProviderVendorsFromProviders() function that auto-aggregates legacy providers by domain, with intelligent display name derivation (e.g., api.openai.com → OpenAI)
  • Integrated vendor backfill into startup flow for both production and development environments (runs after migrations)
  • Updated type system: providerVendorId: number | null to support orphaned providers
  • Removed .notNull() constraint from schema to match actual migration
  • Frontend groups orphaned providers under special vendorId=-1 with "Unknown Vendor" label
  • Added protective null checks in proxy logic (forwarder, provider-selector, session) before accessing vendor-specific features
  • Frontend urlResolver prevents endpoint resolution for orphaned providers (vendorId <= 0)
  • i18n support added across 5 languages for orphaned provider grouping
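
The PR page does not show the body of deriveDisplayNameFromDomain(), so the following is only one plausible way to get from api.openai.com to OpenAI. The function name `deriveDisplayNameSketch`, the brand override table, and the list of generic prefixes are all assumptions made for this sketch.

```typescript
// Hypothetical overrides for brand capitalization; not taken from the PR.
const BRAND_OVERRIDES: Record<string, string> = {
  openai: "OpenAI",
};

// Subdomain labels that carry no vendor identity (an assumption for this sketch).
const GENERIC_PREFIXES = new Set(["api", "www"]);

function deriveDisplayNameSketch(domain: string): string {
  const labels = domain.toLowerCase().split(".").filter(Boolean);
  // Drop the top-level label ("com", "io", ...) when there is more than one label.
  const withoutTld = labels.length > 1 ? labels.slice(0, -1) : labels;
  const meaningful = withoutTld.filter((label) => !GENERIC_PREFIXES.has(label));
  // Use the label closest to the TLD, e.g. "openai" in "api.openai.com".
  const core =
    meaningful[meaningful.length - 1] ?? withoutTld[withoutTld.length - 1] ?? domain;
  // Fall back to simple first-letter capitalization when no override exists.
  return BRAND_OVERRIDES[core] ?? core.charAt(0).toUpperCase() + core.slice(1);
}
```

This sketch does not handle multi-part public suffixes such as co.uk; a real implementation would need a suffix list for that.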

Design Strengths:

  • Idempotent backfill logic uses pagination and tracks progress, safe to rerun
  • Non-blocking initialization with graceful error handling
  • Comprehensive logging for monitoring backfill success and failures
  • All vendor-specific operations properly guarded against null/invalid vendor IDs
  • Frontend and backend protections work in tandem to prevent invalid endpoint associations
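
To make the "idempotent, paginated" claim concrete, here is an in-memory sketch of the pattern. It is not the repository's implementation: arrays and a Map stand in for the providers and vendors tables, and `ProviderRecord`, `PAGE_SIZE`, `hostnameOf`, and `backfillSketch` are hypothetical names.

```typescript
// In-memory stand-ins for database rows (illustrative only).
interface ProviderRecord {
  id: number;
  url: string;
  providerVendorId: number | null;
}

const PAGE_SIZE = 2; // hypothetical; the PR does not state the real page size

function hostnameOf(rawUrl: string): string | null {
  try {
    return new URL(rawUrl).hostname || null;
  } catch {
    return null; // unparsable URL
  }
}

function backfillSketch(providers: ProviderRecord[], vendorsByDomain: Map<string, number>) {
  const stats = { processed: 0, providersUpdated: 0, skippedInvalidUrl: 0 };
  let lastId = 0;
  for (;;) {
    // Keyset pagination: only unassigned rows with id > lastId, in id order.
    const page = providers
      .filter((p) => p.id > lastId && (p.providerVendorId === null || p.providerVendorId === 0))
      .sort((a, b) => a.id - b.id)
      .slice(0, PAGE_SIZE);
    if (page.length === 0) break;
    for (const row of page) {
      stats.processed++;
      lastId = row.id; // always advance, even for skipped rows, to avoid looping forever
      const domain = hostnameOf(row.url);
      if (domain === null) {
        stats.skippedInvalidUrl++;
        continue;
      }
      // Find-or-create keeps repeated runs from creating duplicate vendors.
      let vendorId = vendorsByDomain.get(domain);
      if (vendorId === undefined) {
        vendorId = vendorsByDomain.size + 1;
        vendorsByDomain.set(domain, vendorId);
      }
      row.providerVendorId = vendorId;
      stats.providersUpdated++;
    }
  }
  return stats;
}
```

Because assigned rows no longer match the filter, a second run only revisits rows that were skipped, which is what makes rerunning on every startup safe.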

Confidence Score: 5/5

  • This PR is safe to merge with high confidence. All changes are well-protected with null checks, the backfill logic is idempotent, and the implementation properly handles both legacy and new provider scenarios.
  • Score of 5 reflects: (1) Comprehensive null/type safety improvements across type definitions and proxy logic; (2) Idempotent backfill design with pagination and error handling; (3) Proper frontend protections guarding vendor-specific operations; (4) Non-blocking startup integration with graceful degradation; (5) No breaking changes to existing APIs; (6) Thorough i18n coverage across 5 languages. The backfill function correctly handles edge cases (invalid URLs, network errors) and provides detailed logging. All vendor-dependent features (circuit breakers, endpoints, vendor deletion) are properly guarded.
  • No files require special attention. All changes demonstrate solid engineering with proper error handling and type safety.

Important Files Changed

Filename Overview
src/types/provider.ts Type definition updated to make `providerVendorId: number | null`.
src/repository/provider-endpoints.ts Adds two critical functions: backfillProviderVendorsFromProviders() for auto-aggregating legacy providers by domain, and deriveDisplayNameFromDomain() for intelligent vendor naming. Uses pagination and error handling with comprehensive logging. Idempotent design prevents duplicate processing.
src/instrumentation.ts Integrates vendor backfill into startup flow for both production and development environments. Backfill runs after migrations but before other initialization. Non-blocking with graceful error handling and detailed logging. Respects AUTO_MIGRATE flag.
src/app/[locale]/settings/providers/_components/provider-vendor-view.tsx Frontend grouping logic correctly handles orphaned providers by assigning vendorId=-1. Shows "Unknown Vendor" for orphaned group (line 195-196). Delete button properly guarded with vendorId > 0 check (line 230). VendorEndpointsSection also protected with vendorId > 0 (line 248).
src/app/v1/_lib/proxy/forwarder.ts Updated checks to handle nullable providerVendorId. Added null checks before accessing vendor circuit features on lines 231, 269, and 916. Prevents errors when orphaned providers (vendorId=null) reach vendor-specific logic.

Sequence Diagram

sequenceDiagram
    participant App as Application Startup
    participant Inst as Instrumentation Hook
    participant DB as Database
    participant BE as Backend (Backfill)
    participant FE as Frontend View
    participant Proxy as Proxy Layer

    App->>Inst: register()
    Inst->>DB: Check DB connection
    Inst->>DB: Run migrations
    Inst->>BE: backfillProviderVendorsFromProviders()
    BE->>DB: SELECT providers WHERE providerVendorId IS NULL OR = 0
    BE->>DB: Derive domain from URL
    BE->>DB: Create/find vendor by domain
    BE->>DB: UPDATE provider.providerVendorId
    BE-->>Inst: Return stats (processed, updated, created)
    Inst->>Inst: Log backfill results
    Inst->>FE: Initialize UI
    FE->>DB: Fetch providers & vendors
    FE->>FE: Group by providerVendorId
    FE->>FE: Create orphaned group (vendorId=-1)
    FE->>FE: Render vendor cards with proper grouping
    FE->>FE: Guard endpoint UI with vendorId > 0 check
    Proxy->>DB: Load provider for request
    Proxy->>Proxy: Check if providerVendorId exists
    alt Has Valid Vendor
        Proxy->>Proxy: Apply vendor circuit breaker
        Proxy->>Proxy: Select from endpoint pool
    else Orphaned (null or -1)
        Proxy->>Proxy: Skip vendor-specific logic
        Proxy->>Proxy: Use provider.url directly
    end

When upgrading from older versions, existing providers were incorrectly
grouped under "Unknown Vendor #0" instead of being auto-aggregated by
their website domain.

Changes:
- Add backfillProviderVendorsFromProviders() to auto-create vendors from
  provider URLs during startup
- Add deriveDisplayNameFromDomain() to generate display names from domains
- Integrate vendor backfill into instrumentation.ts startup flow
- Fix providerVendorId type to allow null (Schema, Type, Transformer)
- Add vendorId=-1 protection for orphaned providers in frontend
- Add i18n keys for orphanedProviders in 5 languages

The backfill runs idempotently on startup, processing providers with
null or 0 vendorId, extracting domains from websiteUrl (preferred) or
url, and creating/associating vendors accordingly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gemini-code-assist (Contributor)

Summary of Changes

Hello @ding113, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a comprehensive solution to address the incorrect grouping of legacy providers and enhance the system's handling of vendor associations. It automates the process of assigning vendors to existing providers based on their domain during startup, ensuring data consistency. Simultaneously, it refines the data model and updates both the user interface and core backend logic to gracefully manage and display providers that do not yet have a vendor, improving overall system robustness and user experience.

Highlights

  • Automatic Vendor Backfill: Introduced an automatic backfill mechanism during application startup to aggregate legacy providers by their domain, assigning them to appropriate vendors or creating new ones if necessary. This resolves issues where providers were incorrectly grouped under 'Unknown Vendor #0'.
  • Nullability Support for Provider Vendors: Updated the database schema to allow providerVendorId to be NULL, and adjusted TypeScript types (number | null) and data transformation logic to consistently handle providers without an assigned vendor ID. The default value for providerVendorId is now null instead of 0.
  • Frontend UI for Orphaned Providers: Enhanced the frontend to group and display providers with null or invalid vendor IDs under a new 'Unknown Vendor' category. This group (vendorId=-1) has restricted actions, preventing deletion or endpoint management, and includes new i18n keys for localization.
  • Robust Backend Handling of Null Vendor IDs: Modified various backend proxy and repository functions to include explicit null checks for providerVendorId before performing operations like endpoint selection, circuit breaker checks, or vendor deletion, ensuring stability and preventing errors for providers without an assigned vendor.
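
The null checks in the proxy layer can be pictured with a small type guard. This is a sketch under assumptions: `hasValidVendor`, `ProxyProvider`, and `resolveTargetUrls` are hypothetical names, and falling back to the provider URL when a vendor has an empty endpoint pool is an assumption of this example rather than something the PR states.

```typescript
// Type guard: true only for an ID that can refer to a real vendor row.
function hasValidVendor(vendorId: number | null | undefined): vendorId is number {
  return typeof vendorId === "number" && vendorId > 0;
}

// Hypothetical minimal provider shape for the proxy path.
interface ProxyProvider {
  url: string;
  providerVendorId: number | null;
}

// Hypothetical resolver: endpoint pools are keyed by vendor ID.
function resolveTargetUrls(
  provider: ProxyProvider,
  endpointPools: Map<number, string[]>,
): string[] {
  if (hasValidVendor(provider.providerVendorId)) {
    const pool = endpointPools.get(provider.providerVendorId);
    if (pool && pool.length > 0) return pool;
  }
  // Orphaned providers skip vendor-specific logic and use their own URL.
  return [provider.url];
}
```

Centralizing the check in one guard keeps null, 0, and -1 from being handled differently in different call sites.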

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

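Overview

This PR introduces support for orphaned providers by allowing a provider's vendorId to be null. It updates the localized strings, database schema, type definitions, UI components, and validation logic, and adds an automatic backfill to handle missing vendor IDs.

Changes

  • Localization files (messages/en/settings/providers/strings.json, messages/ja/settings/providers/strings.json, messages/ru/settings/providers/strings.json, messages/zh-CN/settings/providers/strings.json, messages/zh-TW/settings/providers/strings.json): Adds the orphanedProviders translation key in each language; updates vendorFallbackName, vendorTypeCircuitUpdated, and other translations.
  • Database schema (src/drizzle/schema.ts): Changes providerVendorId from NOT NULL to nullable, so the vendor ID may be empty.
  • Type definitions (src/types/provider.ts): Updates the type of providerVendorId in the Provider and ProviderDisplay interfaces to number | null.
  • UI component rendering (src/app/[locale]/settings/providers/_components/provider-vendor-view.tsx): Groups providers by vendorId, with orphaned items under a special key (-1); shows the orphanedProviders label based on vendorId; renders the delete dialog and endpoints section only for valid vendors.
  • URL resolver guard (src/app/[locale]/settings/providers/_components/vendor-keys-compact-list.tsx): Adds a guard that returns null when vendorId ≤ 0, preventing endpoint fetches for invalid vendors.
  • Request forwarding logic (src/app/v1/_lib/proxy/forwarder.ts): Adds vendorId validity checks (must be truthy and > 0) in three places: building endpoint candidates, skipping on circuit-breaker state, and evaluating timeout conditions.
  • Provider selection (src/app/v1/_lib/proxy/provider-selector.ts): Runs the circuit-breaker check only when vendorId > 0, avoiding unnecessary checks for invalid vendors.
  • Session management (src/app/v1/_lib/proxy/session.ts): Uses ?? undefined so that a null vendorId is stored as undefined.
  • Provider actions (src/actions/providers.ts): Runs the automatic cleanup step only when providerVendorId is present.
  • Transformers (src/repository/_shared/transformers.ts): Changes the default for a missing providerVendorId from 0 to null.
  • Provider repository (src/repository/provider.ts): Makes the vendor endpoint backfill conditional, so it runs only when providerVendorId is present or has changed.
  • Endpoint repository and backfill (src/repository/provider-endpoints.ts): Adds backfillProviderVendorsFromProviders(), which automatically assigns a vendor ID to providers whose providerVendorId is null or 0, and derives display names from domains.
  • Application initialization (src/instrumentation.ts): Adds the provider-vendors backfill step to the production, initial, and development startup paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Related PRs

  • PR #608: Modifies the same vendor and vendor-ID code paths, including the schema, repositories, UI grouping, and circuit-breaker guard logic.
  • PR #554: Modifies provider-related code paths, in particular src/actions/providers.ts and src/app/v1/_lib/proxy/session.ts.
  • PR #522: Modifies overlapping provider files such as src/actions/providers.ts and src/drizzle/schema.ts.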

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 33.33%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The title clearly and accurately summarizes the main change (adding an automatic vendor-aggregation backfill for legacy providers) and matches the core purpose of the changeset.
  • Description check: ✅ Passed. The description is relevant to the changeset, covers the backend and frontend parts of the main changes, and provides a clear change overview and test plan.

@github-actions (Contributor)

🧪 Test results

Test type Status
Code quality
Unit tests
Integration tests
API tests

Overall result: ✅ All tests passed

github-actions bot added the size/M Medium PR (< 500 lines) label Jan 21, 2026

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request effectively addresses the issue of legacy providers being incorrectly grouped by introducing an automatic vendor backfill mechanism. The changes are well-structured, with corresponding updates to the database schema, data transformers, and frontend components to support nullable providerVendorId and handle orphaned providers gracefully.

I've identified a couple of minor areas for improvement:

  • There's some code duplication in the startup script (instrumentation.ts) that could be refactored for better maintainability.
  • A JSDoc comment in provider-endpoints.ts is slightly inconsistent with its implementation, which might cause confusion.

Overall, this is a solid contribution that improves data integrity and the user experience for legacy provider management. The changes are thorough and consider both backend and frontend implications.

Comment on lines +209 to +225
// Backfill provider_vendors (auto-aggregate legacy providers by domain)
try {
  const { backfillProviderVendorsFromProviders } = await import(
    "@/repository/provider-endpoints"
  );
  const vendorResult = await backfillProviderVendorsFromProviders();
  logger.info("[Instrumentation] Provider vendors backfill completed", {
    processed: vendorResult.processed,
    providersUpdated: vendorResult.providersUpdated,
    vendorsCreatedCount: vendorResult.vendorsCreated.size,
    skippedInvalidUrl: vendorResult.skippedInvalidUrl,
  });
} catch (error) {
  logger.warn("[Instrumentation] Failed to backfill provider vendors", {
    error: error instanceof Error ? error.message : String(error),
  });
}
medium

This block of code for backfilling provider vendors is duplicated later in this file for the development environment (lines 309-325). To improve maintainability and reduce redundancy, consider extracting this logic into a separate helper function and calling it from both the production and development setup blocks.
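
The extraction suggested here could look roughly like the sketch below. The helper name `runProviderVendorBackfill` and the `VendorBackfillResult` and `LoggerLike` interfaces are hypothetical; the backfill function and logger are passed in as parameters so the example runs on its own, whereas the real code imports them from the project.

```typescript
// Shape of the stats logged in the instrumentation snippet above.
interface VendorBackfillResult {
  processed: number;
  providersUpdated: number;
  vendorsCreated: { size: number };
  skippedInvalidUrl: number;
}

// Minimal logger contract so the sketch does not depend on the project's logger.
interface LoggerLike {
  info(message: string, meta?: Record<string, unknown>): void;
  warn(message: string, meta?: Record<string, unknown>): void;
}

// Hypothetical helper that both the production and development branches could call.
async function runProviderVendorBackfill(
  backfill: () => Promise<VendorBackfillResult>,
  logger: LoggerLike,
): Promise<boolean> {
  try {
    const result = await backfill();
    logger.info("[Instrumentation] Provider vendors backfill completed", {
      processed: result.processed,
      providersUpdated: result.providersUpdated,
      vendorsCreatedCount: result.vendorsCreated.size,
      skippedInvalidUrl: result.skippedInvalidUrl,
    });
    return true;
  } catch (error) {
    // Non-blocking: startup continues even if the backfill fails.
    logger.warn("[Instrumentation] Failed to backfill provider vendors", {
      error: error instanceof Error ? error.message : String(error),
    });
    return false;
  }
}
```

Returning a boolean is an addition of this sketch; it lets callers or tests observe the outcome without changing the non-blocking behavior.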

ding113 merged commit c65cf16 into dev Jan 21, 2026
18 checks passed
github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Jan 21, 2026

for (const row of rows) {
  stats.processed++;

github-actions bot (Contributor), Jan 21, 2026

[HIGH] [LOGIC-BUG] Vendor backfill can skip valid providers when websiteUrl is present but unparsable (src/repository/provider-endpoints.ts:287)

Why this is a problem: backfillProviderVendorsFromProviders() currently prefers row.websiteUrl?.trim() and immediately continues to the next row when that URL can't be parsed, even if row.url is valid. This contradicts the intent stated in the code comment ("aggregate automatically by the domain of website_url (preferred) or url") and can leave legacy providers un-aggregated (provider_vendor_id still NULL or 0).

Suggested fix:

const websiteDomain = row.websiteUrl ? normalizeWebsiteDomainFromUrl(row.websiteUrl) : null;
const providerDomain = normalizeWebsiteDomainFromUrl(row.url);
const domain = websiteDomain ?? providerDomain;

if (!domain) {
  logger.warn("[backfillVendors] Invalid URL for provider", {
    providerId: row.id,
    url: row.url,
    websiteUrl: row.websiteUrl,
  });
  stats.skippedInvalidUrl++;
  lastId = Math.max(lastId, row.id);
  continue;
}

const displayName = deriveDisplayNameFromDomain(domain);
const vendorId = await getOrCreateProviderVendorIdFromUrls({
  providerUrl: row.url,
  websiteUrl: websiteDomain ? row.websiteUrl : null,
  faviconUrl: row.faviconUrl ?? null,
  displayName,
});
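
The fallback order in this suggestion can be tried in isolation with the sketch below. `normalizeDomainSketch` is a hypothetical stand-in for normalizeWebsiteDomainFromUrl(), whose real behavior is not shown on this page, and `resolveVendorDomain` is an illustrative name.

```typescript
// Hypothetical stand-in for normalizeWebsiteDomainFromUrl(); the real helper may differ.
function normalizeDomainSketch(rawUrl: string | null | undefined): string | null {
  const trimmed = rawUrl?.trim();
  if (!trimmed) return null;
  try {
    const hostname = new URL(trimmed).hostname.toLowerCase();
    return hostname.length > 0 ? hostname : null;
  } catch {
    return null; // unparsable URL
  }
}

// Prefer websiteUrl, but fall back to url when websiteUrl is missing or unparsable.
function resolveVendorDomain(row: { url: string; websiteUrl?: string | null }): string | null {
  return normalizeDomainSketch(row.websiteUrl) ?? normalizeDomainSketch(row.url);
}
```

Only when both URLs fail to parse does the row count as skipped, which matches the behavior the review comment asks for.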

/**
 * Create vendors for all providers whose provider_vendor_id is NULL or 0.
 * Providers are aggregated automatically by the domain of website_url (preferred) or url.
 */
github-actions bot (Contributor), Jan 21, 2026

[HIGH] [TEST-MISSING-CRITICAL] New vendor backfill has no unit tests (src/repository/provider-endpoints.ts:243)

Why this is a problem: This PR adds new startup behavior via backfillProviderVendorsFromProviders() but doesn’t add test coverage for the core branches (paging, invalid URL handling, update path). Project rule: "Test Coverage - All new features must have unit test coverage of at least 80%".

Suggested fix (extend existing repository tests):

// tests/unit/repository/provider-endpoints.test.ts

test("backfillProviderVendorsFromProviders: falls back to provider.url when websiteUrl is invalid", async () => {
  vi.resetModules();

  const selectPages = [
    [
      {
        id: 1,
        name: "p1",
        url: "https://api.openai.com/v1",
        websiteUrl: "http://",
        faviconUrl: null,
        providerVendorId: 0,
      },
    ],
    [],
  ];

  const selectMock = vi
    .fn()
    .mockImplementationOnce(() => createThenableQuery(selectPages.shift() ?? []))
    .mockImplementationOnce(() => createThenableQuery([]))
    .mockImplementationOnce(() => createThenableQuery([]));

  const returning = vi.fn(async () => [{ id: 42 }]);
  const insertMock = vi.fn(() => ({
    values: vi.fn(() => ({ onConflictDoNothing: vi.fn(() => ({ returning })) })),
  }));

  const updateWhere = vi.fn(async () => undefined);
  const updateMock = vi.fn(() => ({ set: vi.fn(() => ({ where: updateWhere })) }));

  vi.doMock("@/drizzle/db", () => ({
    db: { select: selectMock, insert: insertMock, update: updateMock },
  }));

  const { backfillProviderVendorsFromProviders } = await import("@/repository/provider-endpoints");
  const result = await backfillProviderVendorsFromProviders();

  expect(result.providersUpdated).toBe(1);
});

github-actions bot (Contributor) left a comment

Code Review Summary

This PR changes providerVendorId semantics (nullable) and adds an automatic vendor backfill on startup. The backfill logic has an edge case where a malformed websiteUrl causes the provider to be skipped even if url is valid, which can leave the upgrade bug partially unfixed. New startup behavior was also introduced without the required unit test coverage.

PR Size: M

  • Lines changed: 276
  • Files changed: 17

Issues Found

Category Critical High Medium Low
Logic/Bugs 0 1 0 0
Security 0 0 0 0
Error Handling 0 0 0 0
Types 0 0 0 0
Comments/Docs 0 0 0 0
Tests 0 1 0 0
Simplification 0 0 0 0

Critical Issues (Must Fix)

  • None

High Priority Issues (Should Fix)

  • [LOGIC-BUG] backfillProviderVendorsFromProviders skips providers when websiteUrl is invalid instead of falling back to url (src/repository/provider-endpoints.ts:287) (Confidence: 95)
  • [TEST-MISSING-CRITICAL] No unit tests added for vendor backfill + orphaned-provider path (src/repository/provider-endpoints.ts:243) (Confidence: 85)

Review Coverage

  • Logic and correctness
  • Security (OWASP Top 10)
  • Error handling
  • Type safety
  • Documentation accuracy
  • Test coverage
  • Code clarity

Automated review by Codex AI

github-actions bot (Contributor) left a comment

  • PR #635 classified as M (276 lines, 17 files) and labeled size/M.
  • Inline review comments posted (diff-only):
    • [HIGH] [LOGIC-BUG] src/repository/provider-endpoints.ts:287 — backfill skips providers when websiteUrl is unparsable instead of falling back to url (suggested concrete fix included).
    • [HIGH] [TEST-MISSING-CRITICAL] src/repository/provider-endpoints.ts:243 — new vendor backfill behavior added without the required unit tests (suggested test scaffold included).
  • Summary review submitted via gh pr review with the above findings and coverage checklist.

Labels

area:i18n, area:provider, area:UI, bug (Something isn't working), size/M (Medium PR, < 500 lines)

Projects

Status: Done
