fix(billing): fix IMAGE modality token billing for Gemini image generation models by sususu98 · Pull Request #664 · ding113/claude-code-hub

sususu98 · 2026-01-28T04:14:21Z

Summary

Fix billing calculation for Gemini image generation models (e.g., gemini-3-pro-image-preview). IMAGE modality tokens cost $0.00012/token, which is 10x more expensive than TEXT tokens at $0.000012/token. The previous implementation did not parse modality-specific token details, resulting in approximately 7.6x undercharging.

Problem

The system was treating all output tokens uniformly without distinguishing between IMAGE and TEXT modalities in Gemini's candidatesTokensDetails and promptTokensDetails response fields.

Related Issues:

Fixes #663 - Image generation models (gemini-3-pro-image-preview) billing calculation is incorrect

Solution

Extended type definitions (src/types/model-price.ts)
- Added output_cost_per_image_token and input_cost_per_image_token fields
Updated usage extraction (src/app/v1/_lib/proxy/response-handler.ts)
- Parse candidatesTokensDetails to extract output_image_tokens and output_tokens (TEXT)
- Parse promptTokensDetails to extract input_image_tokens and input_tokens (TEXT)
- Case-insensitive modality matching via toUpperCase()
- Calculate unaccounted tokens from candidatesTokenCount difference as TEXT
Updated cost calculation (src/lib/utils/cost-calculation.ts)
- output_image_tokens uses output_cost_per_image_token when available
- Falls back to output_cost_per_token for backward compatibility
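The fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual `calculateRequestCost`: the `PriceSketch` and `UsageSketch` types and the `imageAwareCost` function are hypothetical names, and plain numbers stand in for whatever decimal type the real code uses.

```typescript
// Hypothetical, simplified shapes; the field names follow the PR description.
interface PriceSketch {
  input_cost_per_token?: number;
  output_cost_per_token?: number;
  input_cost_per_image_token?: number;
  output_cost_per_image_token?: number;
}

interface UsageSketch {
  input_tokens?: number;
  output_tokens?: number;
  input_image_tokens?: number;
  output_image_tokens?: number;
}

// Image tokens use the image-specific price when it is configured;
// otherwise they fall back to the regular per-token price.
function imageAwareCost(usage: UsageSketch, price: PriceSketch, multiplier = 1): number {
  const inText = price.input_cost_per_token ?? 0;
  const outText = price.output_cost_per_token ?? 0;
  const inImage = price.input_cost_per_image_token ?? inText;
  const outImage = price.output_cost_per_image_token ?? outText;

  const total =
    (usage.input_tokens ?? 0) * inText +
    (usage.output_tokens ?? 0) * outText +
    (usage.input_image_tokens ?? 0) * inImage +
    (usage.output_image_tokens ?? 0) * outImage;

  // The multiplier applies to every segment, image tokens included.
  return total * multiplier;
}
```

With `output_cost_per_image_token` left unset, 2000 image tokens are priced at the TEXT rate, which is the backward-compatible behavior the checklist refers to.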

Changes

Core Changes

File	Changes
`src/types/model-price.ts`	Added 4 new image token price fields
`src/app/v1/_lib/proxy/response-handler.ts`	Added modality parsing logic (+68 lines)
`src/lib/utils/cost-calculation.ts`	Added image token cost calculation (+18 lines)

Test Coverage

File	Tests
`tests/unit/lib/cost-calculation-image-tokens.test.ts`	10 new tests
`tests/unit/proxy/extract-usage-metrics.test.ts`	12 new Gemini image tests

Billing Example

Item	Tokens	Unit Price	Cost
Input TEXT	326	$0.000002	$0.000652
Output TEXT	340+337	$0.000012	$0.008124
Output IMAGE	2000	$0.00012	$0.240000
Total	-	-	$0.248776

Before fix: $0.244696 (undercharged by $0.00408, which is the 340 unclassified TEXT tokens × $0.000012)
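The arithmetic in the billing example can be reproduced with a short script. It is only a sketch: prices are rewritten as USD per 1M tokens so the sums stay in integer micro-dollars, and the "all output billed as TEXT" baseline is an assumption about what the 7.6x figure in the summary refers to, not something the PR states explicitly.

```typescript
// Prices in USD per 1M tokens, so tokens * price yields whole micro-dollars.
const USD_PER_MILLION = { inputText: 2, outputText: 12, outputImage: 120 };

// Token counts from the billing example.
const promptTokens = 326;
const candidatesTokens = 2340; // total output, of which 2000 are IMAGE
const imageTokens = 2000;
const thoughtsTokens = 337;

// Tokens in candidatesTokenCount not covered by a modality detail are billed as TEXT.
const unclassifiedText = candidatesTokens - imageTokens; // 340

const inputCost = promptTokens * USD_PER_MILLION.inputText;
const imageCost = imageTokens * USD_PER_MILLION.outputImage;

// With the fix: IMAGE at the image rate, unclassified + thoughts at the TEXT rate.
const fixedMicroUsd =
  inputCost + (unclassifiedText + thoughtsTokens) * USD_PER_MILLION.outputText + imageCost;

// Same calculation, but ignoring the 340 unclassified tokens.
const withoutUnclassifiedMicroUsd =
  inputCost + thoughtsTokens * USD_PER_MILLION.outputText + imageCost;

// Assumed baseline: every output token billed at the TEXT rate.
const allTextMicroUsd =
  inputCost + (candidatesTokens + thoughtsTokens) * USD_PER_MILLION.outputText;

const toUsd = (microUsd: number): string => (microUsd / 1e6).toFixed(6);

console.log(toUsd(fixedMicroUsd), toUsd(withoutUnclassifiedMicroUsd));
console.log((fixedMicroUsd / allTextMicroUsd).toFixed(1));
```

Under these assumptions the fixed total is 248776 micro-dollars, the total without the unclassified tokens is 244696, and the ratio against the all-TEXT baseline rounds to 7.6.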

Testing

Automated Tests

Unit tests added for image token cost calculation (10 tests)
Unit tests added for usage metrics extraction (12 tests)
All tests pass locally

Manual Testing

Configure a provider with gemini-3-pro-image-preview model
Set output_cost_per_image_token: 0.00012 in model pricing
Send an image generation request
Verify that IMAGE modality tokens are billed at the higher rate

Checklist

Code follows project conventions
Self-review completed
Tests pass locally (bun run test)
Backward compatible (falls back to standard token pricing if image pricing not configured)

Description enhanced by Claude AI

Greptile Overview

Greptile Summary

This PR fixes a critical billing issue for Gemini image generation models by properly extracting and billing IMAGE modality tokens at 10x the rate of TEXT tokens ($0.00012 vs $0.000012).

Key Changes:

Extended type system with output_cost_per_image_token and input_cost_per_image_token fields
Enhanced usage extraction to parse candidatesTokensDetails and promptTokensDetails from Gemini responses
Implemented case-insensitive modality matching and proper handling of unaccounted tokens
Updated cost calculation with graceful fallback to regular token pricing for backward compatibility
Added comprehensive test coverage (22 new tests across billing and extraction)

Potential Issue:

When promptTokensDetails is present along with cachedContentTokenCount, the cache deduction logic may not work correctly. The code at line 1354 overwrites input_tokens with raw textTokens from details without deducting cached tokens that were previously subtracted at line 1281. This scenario lacks test coverage.

Confidence Score: 3/5

Safe to merge with one logical issue that needs verification
The implementation correctly extracts modality tokens and calculates costs for the primary use case. However, there's a potential cache + image token interaction bug that could cause incorrect billing when both features are used together. The issue is uncommon but should be addressed.
src/app/v1/_lib/proxy/response-handler.ts requires attention for the cache + image token interaction at lines 1278-1357

Important Files Changed

Filename	Overview
src/types/model-price.ts	Added optional fields for image token pricing with clear comments
src/lib/utils/cost-calculation.ts	Properly calculates image token costs with fallback to regular token pricing
src/app/v1/_lib/proxy/response-handler.ts	Extracts modality-specific tokens but may have cache interaction issue

Sequence Diagram

sequenceDiagram
    participant Client
    participant ResponseHandler
    participant UsageExtractor
    participant CostCalculator
    participant PriceData

    Client->>ResponseHandler: Gemini API Response
    ResponseHandler->>UsageExtractor: extractUsageMetrics(response)
    
    UsageExtractor->>UsageExtractor: Parse usageMetadata
    
    alt Has candidatesTokensDetails
        UsageExtractor->>UsageExtractor: Iterate through candidatesTokensDetails
        UsageExtractor->>UsageExtractor: Filter IMAGE modality (case-insensitive)
        UsageExtractor->>UsageExtractor: Sum output_image_tokens
        UsageExtractor->>UsageExtractor: Calculate unaccounted TEXT tokens
        Note over UsageExtractor: output_tokens = textTokens + (candidatesTotal - detailsSum)
    end
    
    alt Has promptTokensDetails
        UsageExtractor->>UsageExtractor: Iterate through promptTokensDetails
        UsageExtractor->>UsageExtractor: Filter IMAGE modality (case-insensitive)
        UsageExtractor->>UsageExtractor: Sum input_image_tokens
        UsageExtractor->>UsageExtractor: Extract input_tokens (TEXT only)
    end
    
    UsageExtractor->>UsageExtractor: Add thoughtsTokenCount to output_tokens
    UsageExtractor-->>ResponseHandler: Return UsageMetrics
    
    ResponseHandler->>CostCalculator: calculateRequestCost(metrics, priceData)
    
    CostCalculator->>PriceData: Get output_cost_per_image_token
    alt Has output_cost_per_image_token
        CostCalculator->>CostCalculator: Calculate image token cost at $0.00012/token
    else Fallback
        CostCalculator->>PriceData: Use output_cost_per_token
        CostCalculator->>CostCalculator: Calculate at TEXT token rate
    end
    
    CostCalculator->>PriceData: Get input_cost_per_image_token
    alt Has input_cost_per_image_token
        CostCalculator->>CostCalculator: Calculate input image cost
    else Fallback
        CostCalculator->>PriceData: Use input_cost_per_token
    end
    
    CostCalculator->>CostCalculator: Sum all segments + apply multiplier
    CostCalculator-->>ResponseHandler: Return total cost
    ResponseHandler-->>Client: Billing Record
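The output-side branch of the diagram can be sketched as a single function. This is an illustration under assumptions, not the code in `response-handler.ts`: `GeminiUsageSketch` and `splitOutputTokens` are hypothetical names, and the sketch always returns a number for `output_image_tokens`, whereas the tests in this PR expect the field to stay undefined when no IMAGE tokens are found.

```typescript
interface ModalityDetail {
  modality?: string;
  tokenCount?: number | null;
}

// Hypothetical subset of Gemini's usageMetadata used by this sketch.
interface GeminiUsageSketch {
  candidatesTokenCount?: number;
  thoughtsTokenCount?: number;
  candidatesTokensDetails?: ModalityDetail[];
}

function splitOutputTokens(usage: GeminiUsageSketch): {
  output_tokens: number;
  output_image_tokens: number;
} {
  let imageTokens = 0;
  let textTokens = 0;
  for (const detail of usage.candidatesTokensDetails ?? []) {
    const count = detail.tokenCount;
    // Skip missing, null, zero, or negative counts.
    if (typeof count !== "number" || count <= 0) continue;
    // Case-insensitive modality match; anything that is not IMAGE counts as TEXT.
    if (detail.modality?.toUpperCase() === "IMAGE") imageTokens += count;
    else textTokens += count;
  }

  // output_tokens = textTokens + (candidatesTotal - detailsSum), then thoughts are added.
  const candidatesTotal = usage.candidatesTokenCount ?? 0;
  const unaccounted = Math.max(candidatesTotal - (imageTokens + textTokens), 0);
  const thoughts = usage.thoughtsTokenCount ?? 0;

  return {
    output_tokens: textTokens + unaccounted + thoughts,
    output_image_tokens: imageTokens,
  };
}
```

For the billing example (candidatesTokenCount=2340, IMAGE=2000, thoughtsTokenCount=337) this yields 677 TEXT-priced tokens and 2000 IMAGE tokens.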

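Background:
- Image generation models such as gemini-3-pro-image-preview return candidatesTokensDetails in their usage data
- This field breaks tokens down by modality (IMAGE/TEXT)
- IMAGE modality tokens cost $0.00012/token, 10x the price of regular TEXT tokens
- The previous system did not parse this field, so IMAGE tokens were billed at the TEXT price, undercharging by roughly 7.6x

Type extensions (src/types/model-price.ts):
- Added output_cost_per_image_token: unit price for output image tokens (billed per token)
- Added input_cost_per_image_token: unit price for input image tokens (billed per token)
- Kept input_cost_per_image: fixed price per input image (billed per image, $0.0011/image)
- Kept output_cost_per_image: fixed price per output image (billed per image)

Usage extraction (src/app/v1/_lib/proxy/response-handler.ts):
- Parse candidatesTokensDetails to extract output_image_tokens and output_tokens (TEXT)
- Parse promptTokensDetails to extract input_image_tokens and input_tokens (TEXT)
- Use toUpperCase() for case-insensitive matching (IMAGE/image/Image)
- Added a hasValidToken guard so that original values are overwritten only when valid tokens were parsed
- Fixed incomplete promptTokensDetails parsing that caused input IMAGE tokens to be billed twice
- Treat the difference between candidatesTokenCount and the sum of details as unclassified TEXT tokens (internal overhead of image generation, billed at the TEXT price)

Cost calculation (src/lib/utils/cost-calculation.ts):
- output_image_tokens are billed with output_cost_per_image_token when available
- input_image_tokens are billed with input_cost_per_image_token when available
- If no image token price is configured, fall back to the regular token price (backward compatible)
- The multiplier also applies to image token costs

Test coverage:
- Added cost-calculation-image-tokens.test.ts (10 tests)
- Extended extract-usage-metrics.test.ts (12 Gemini image tests)
- Scenarios covered: IMAGE only, mixed IMAGE+TEXT, invalid data, case variants, backward compatibility, mixed input and output, candidatesTokenCount difference calculation

Billing example (complete image generation request):
- promptTokenCount=326, candidatesTokenCount=2340, thoughtsTokenCount=337
- candidatesTokensDetails: IMAGE=2000 (the difference of 340 is unclassified TEXT)
- Input TEXT: 326 × $0.000002 = $0.000652
- Output TEXT: (340+337) × $0.000012 = $0.008124
- Output IMAGE: 2000 × $0.00012 = $0.240000
- Total: $0.248776 ($0.244696 before the fix, undercharged by $0.00408)

Fixes ding113#663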

coderabbitai · 2026-01-28T04:14:43Z

📝 Walkthrough

Walkthrough

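This PR extends multimodal token tracking by adding image token fields to UsageMetrics, extracting modality-specific token counts when parsing Gemini/OpenAI responses, updating the pricing model to support image-specific costs, and adding comprehensive test coverage.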

Changes

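Content / Files	Change Summary
Type system updates `src/types/model-price.ts`, `src/app/v1/_lib/proxy/response-handler.ts`, `src/lib/utils/cost-calculation.ts`	Added output_cost_per_image, output_cost_per_image_token, input_cost_per_image, and input_cost_per_image_token fields to ModelPriceData; added optional input_image_tokens and output_image_tokens fields to UsageMetrics
Response parsing - modality token extraction `src/app/v1/_lib/proxy/response-handler.ts`	Implemented modality-aware token extraction that distinguishes IMAGE and TEXT modalities in candidatesTokensDetails and promptTokensDetails, then computes and sets the corresponding input/output image token counts
Cost calculation - image token handling `src/lib/utils/cost-calculation.ts`	Added image token cost calculation to calculateRequestCost using the output_cost_per_image_token and input_cost_per_image_token fields, with fallback to the legacy per-token prices
Test coverage - image token costs `tests/unit/lib/cost-calculation-image-tokens.test.ts`	Added 152 lines of unit tests verifying image token pricing, cost fallback, multiplier application, and mixed scenarios
Test coverage - usage metrics extraction `tests/unit/proxy/extract-usage-metrics.test.ts`	Added 164 lines of Gemini-specific test cases verifying the full control flow for extracting IMAGE modality tokens from candidatesTokensDetails and promptTokensDetails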

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
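Title check	✅ Passed	The title clearly and accurately summarizes the main change: fixing IMAGE modality token billing for Gemini image generation models, reflecting the core purpose of the PR.
Description check	✅ Passed	The PR description details the fix, the files involved, the billing logic changes, and the test coverage, and is closely related to the code changes.
Linked Issues check	✅ Passed	The PR fully implements all requirements of `#663`: extending the UsageMetrics type to include input/output_image_tokens [`#663`], parsing candidatesTokensDetails and promptTokensDetails in response-handler.ts to extract IMAGE tokens [`#663`], and applying the image token unit price in cost-calculation.ts [`#663`], with comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All code changes are closely tied to the billing fix required by `#663`; no out-of-scope modifications were found.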


gemini-code-assist · 2026-01-28T04:14:44Z

Summary of Changes

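Hello @sususu98, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the failure to correctly identify and apply image modality token prices when billing Gemini image generation models. By introducing dedicated image token billing fields and updating the parsing and calculation logic, it ensures that tokens for image input and output are billed accurately, correcting the earlier miscalculation caused by the price difference and improving billing accuracy and fairness.

Highlights

Billing fix: Fixed IMAGE modality token billing for Gemini image generation models (such as gemini-3-pro-image-preview), which were previously undercharged by roughly 7.6x because the tokens were not parsed correctly.
Price model extension: Added output_cost_per_image_token and input_cost_per_image_token fields in src/types/model-price.ts to define the unit price of image tokens.
Usage extraction update: Modified the extractUsageMetrics function in src/app/v1/_lib/proxy/response-handler.ts to parse candidatesTokensDetails and promptTokensDetails, extracting output_image_tokens and input_image_tokens, with case-insensitive modality matching and calculation of unclassified TEXT tokens.
Billing logic adjustment: Updated the billing logic in src/lib/utils/cost-calculation.ts so that image tokens are billed at their dedicated unit price first, falling back to the regular token unit price if none is configured.
Expanded test coverage: Added tests/unit/lib/cost-calculation-image-tokens.test.ts with 10 test cases and extended tests/unit/proxy/extract-usage-metrics.test.ts with 12 Gemini image tests.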


greptile-apps

3 files reviewed, 2 comments

greptile-apps · 2026-01-28T04:17:53Z

tests/unit/proxy/extract-usage-metrics.test.ts

    });
+
+    it("should extract IMAGE modality tokens from candidatesTokensDetails", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 326,
+          candidatesTokenCount: 2340,
+          candidatesTokensDetails: [
+            { modality: "IMAGE", tokenCount: 2000 },
+            { modality: "TEXT", tokenCount: 340 },
+          ],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.output_image_tokens).toBe(2000);
+      expect(result.usageMetrics?.output_tokens).toBe(340);
+    });
+
+    it("should extract IMAGE modality tokens from promptTokensDetails", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 886,
+          candidatesTokenCount: 500,
+          promptTokensDetails: [
+            { modality: "TEXT", tokenCount: 326 },
+            { modality: "IMAGE", tokenCount: 560 },
+          ],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.input_image_tokens).toBe(560);
+      expect(result.usageMetrics?.input_tokens).toBe(326);
+    });
+
+    it("should correctly parse a complete usage with mixed input and output", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 357,
+          candidatesTokenCount: 2100,
+          totalTokenCount: 2580,
+          promptTokensDetails: [
+            { modality: "TEXT", tokenCount: 99 },
+            { modality: "IMAGE", tokenCount: 258 },
+          ],
+          candidatesTokensDetails: [{ modality: "IMAGE", tokenCount: 2000 }],
+          thoughtsTokenCount: 123,
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.input_tokens).toBe(99);
+      expect(result.usageMetrics?.input_image_tokens).toBe(258);
+      // output_tokens = (candidatesTokenCount - IMAGE details) + thoughtsTokenCount
+      // = (2100 - 2000) + 123 = 223
+      expect(result.usageMetrics?.output_tokens).toBe(223);
+      expect(result.usageMetrics?.output_image_tokens).toBe(2000);
+    });
+
+    it("should handle candidatesTokensDetails with only the IMAGE modality", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 100,
+          candidatesTokenCount: 2000,
+          candidatesTokensDetails: [{ modality: "IMAGE", tokenCount: 2000 }],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.output_image_tokens).toBe(2000);
+      // candidatesTokenCount = 2000, IMAGE = 2000, unclassified = 0
+      expect(result.usageMetrics?.output_tokens).toBe(0);
+    });
+
+    it("should treat the difference between candidatesTokenCount and details as unclassified TEXT", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 326,
+          candidatesTokenCount: 2340,
+          candidatesTokensDetails: [{ modality: "IMAGE", tokenCount: 2000 }],
+          thoughtsTokenCount: 337,
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      // unclassified = 2340 - 2000 = 340
+      // output_tokens = 340 + 337 (thoughts) = 677
+      expect(result.usageMetrics?.output_tokens).toBe(677);
+      expect(result.usageMetrics?.output_image_tokens).toBe(2000);
+    });
+
+    it("should handle missing candidatesTokensDetails (backward compatible)", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 1000,
+          candidatesTokenCount: 500,
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.output_tokens).toBe(500);
+      expect(result.usageMetrics?.output_image_tokens).toBeUndefined();
+      expect(result.usageMetrics?.input_image_tokens).toBeUndefined();
+    });
+
+    it("should handle an empty candidatesTokensDetails array", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 1000,
+          candidatesTokenCount: 500,
+          candidatesTokensDetails: [],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.output_tokens).toBe(500);
+      expect(result.usageMetrics?.output_image_tokens).toBeUndefined();
+    });
+
+    it("should handle invalid tokenCount values in candidatesTokensDetails", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 1000,
+          candidatesTokenCount: 500,
+          candidatesTokensDetails: [
+            { modality: "TEXT" },
+            { modality: "IMAGE", tokenCount: null },
+            { modality: "TEXT", tokenCount: -1 },
+          ],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      // invalid data should not overwrite the original candidatesTokenCount
+      expect(result.usageMetrics?.output_tokens).toBe(500);
+      expect(result.usageMetrics?.output_image_tokens).toBeUndefined();
+    });
+
+    it("should handle modality case variants", () => {
+      const response = JSON.stringify({
+        usageMetadata: {
+          promptTokenCount: 100,
+          candidatesTokenCount: 2340,
+          candidatesTokensDetails: [
+            { modality: "image", tokenCount: 2000 },
+            { modality: "Image", tokenCount: 100 },
+            { modality: "TEXT", tokenCount: 240 },
+          ],
+        },
+      });
+
+      const result = parseUsageFromResponseText(response, "gemini");
+
+      expect(result.usageMetrics?.output_image_tokens).toBe(2100);
+      expect(result.usageMetrics?.output_tokens).toBe(240);
+    });
  });



Missing test: add coverage for cachedContentTokenCount + promptTokensDetails combination to verify cached tokens are properly deducted from text tokens


greptile-apps · 2026-01-28T04:17:55Z

Additional Comments (1)

src/app/v1/_lib/proxy/response-handler.ts
input_tokens overwriting issue when both promptTokenCount and promptTokensDetails exist

When promptTokensDetails is present (line 1354), it overwrites the input_tokens calculated from promptTokenCount - cachedContentTokenCount (line 1281). This breaks the cache deduction logic.

For responses with both cached tokens and image modality details, input_tokens should equal textTokens from details, but cached tokens aren't deducted from the image tokens.

Consider adjusting line 1354 to handle cached tokens:

      // Deduct cached tokens from text tokens if needed
      const cachedTokens =
        typeof usage.cachedContentTokenCount === "number" ? usage.cachedContentTokenCount : 0;
      result.input_tokens = Math.max(textTokens - cachedTokens, 0);
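To make the interaction concrete, the suggestion can be placed in a standalone sketch. Everything here is illustrative: `splitInputTokens` is a hypothetical helper, the numbers are made up rather than taken from a real response, and the sketch assumes (as the suggestion does) that cached tokens should be deducted from the TEXT share only, which the review itself flags as needing verification.

```typescript
interface ModalityDetail {
  modality?: string;
  tokenCount?: number;
}

// Hypothetical helper: split prompt tokens by modality, then deduct cached
// tokens from the TEXT share only (assumption: cached content is text).
function splitInputTokens(
  promptDetails: ModalityDetail[],
  cachedContentTokenCount?: number
): { input_tokens: number; input_image_tokens: number } {
  let imageTokens = 0;
  let textTokens = 0;
  for (const detail of promptDetails) {
    const count = detail.tokenCount;
    if (typeof count !== "number" || count <= 0) continue;
    if (detail.modality?.toUpperCase() === "IMAGE") imageTokens += count;
    else textTokens += count;
  }
  const cachedTokens =
    typeof cachedContentTokenCount === "number" ? cachedContentTokenCount : 0;
  return {
    // Clamp at zero in case more tokens are cached than were classified as TEXT.
    input_tokens: Math.max(textTokens - cachedTokens, 0),
    input_image_tokens: imageTokens,
  };
}

// Illustrative numbers only.
const cachedSplit = splitInputTokens(
  [
    { modality: "TEXT", tokenCount: 326 },
    { modality: "IMAGE", tokenCount: 560 },
  ],
  200
);
console.log(cachedSplit);
```

With 326 TEXT tokens, 560 IMAGE tokens, and 200 cached tokens, the sketch reports 126 TEXT input tokens and leaves the 560 IMAGE tokens untouched.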


github-actions

Code Review Summary

This PR correctly implements billing support for Gemini image generation models by extracting IMAGE modality tokens from candidatesTokensDetails and promptTokensDetails. The implementation is well-structured with proper fallback mechanisms and comprehensive test coverage.

PR Size: M

Lines changed: 408 additions, 0 deletions
Files changed: 5

Issues Found

Category	Critical	High	Medium	Low
Logic/Bugs	0	0	0	0
Security	0	0	0	0
Error Handling	0	0	0	0
Types	0	0	0	0
Comments/Docs	0	0	0	0
Tests	0	0	0	0
Simplification	0	0	0	0

No significant issues identified in this PR.

Review Coverage

Logic and correctness - Clean: Extraction logic correctly parses candidatesTokensDetails and promptTokensDetails, handles case-insensitive modality matching, and properly calculates unaccounted tokens
Security (OWASP Top 10) - Clean: No user input handling, type coercion is safe
Error handling - Clean: Invalid token counts (null, negative, missing) are properly handled with typeof ... === "number" && ... > 0 guards
Type safety - Clean: Both UsageMetrics type definitions updated consistently, new ModelPriceData fields added for image token pricing
Documentation accuracy - Clean: Comments accurately describe the billing calculation and fallback behavior
Test coverage - Excellent: 10 new cost calculation tests + 12 new usage extraction tests covering all major scenarios including edge cases (empty arrays, invalid tokenCount, case variations)
Code clarity - Good: Logic flow is clear, fallback behavior is well-documented

Key Observations

Backward Compatibility: When output_cost_per_image_token or input_cost_per_image_token is not configured, the code correctly falls back to regular token prices
Ordering Logic: The new extraction code is correctly placed before the existing output_tokens check, allowing Gemini-specific handling while preserving backward compatibility for other providers
Test Quality: Tests cover both happy paths and edge cases including invalid data handling and case-insensitive modality matching

Automated review by Claude AI

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/app/v1/_lib/proxy/response-handler.ts (1)
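1294-1362: Prevent output_tokens from overwriting the details split and causing double billing

When candidatesTokensDetails has already been split and usage.output_tokens is still present, the overwrite at Line 1359 replaces the TEXT-only output_tokens with the total, which is then billed on top of output_image_tokens. Consider using usage.output_tokens only when output_tokens was not derived from details.
Suggested change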
@@
-  const candidatesDetails = usage.candidatesTokensDetails as
+  const candidatesDetails = usage.candidatesTokensDetails as
     | Array<{ modality?: string; tokenCount?: number }>
     | undefined;
+  let hasCandidatesDetails = false;
   if (Array.isArray(candidatesDetails) && candidatesDetails.length > 0) {
@@
     if (hasValidToken) {
+      hasCandidatesDetails = true;
       // Compute unclassified TEXT tokens: candidatesTokenCount - sum of details
       // These may be internal overhead of image generation, billed at the TEXT price
       const detailsSum = imageTokens + textTokens;
@@
   }
@@
-  if (typeof usage.output_tokens === "number") {
+  if (typeof usage.output_tokens === "number" && !hasCandidatesDetails) {
     result.output_tokens = usage.output_tokens;
     hasAny = true;
   }

gemini-code-assist

Code Review

This pull request introduces support for handling image tokens in usage metrics and cost calculation, primarily for Gemini models. The changes involve adding input_image_tokens and output_image_tokens to the UsageMetrics and ModelPriceData types. The extractUsageMetrics function was updated to parse image and text tokens from candidatesTokensDetails and promptTokensDetails in Gemini responses, including logic to account for unclassified tokens. The calculateRequestCost function was modified to incorporate these new image token types, prioritizing specific image token prices (output_cost_per_image_token, input_cost_per_image_token) and falling back to general token prices if not specified. New unit tests were added to verify the correct extraction and cost calculation of image tokens. Review comments identified a bug in extractUsageMetrics where calculated output_tokens could be overwritten, suggested refactoring duplicated logic for processing candidatesTokensDetails and promptTokensDetails into a helper function, and pointed out a redundant test case in cost-calculation-image-tokens.test.ts that should be removed.

gemini-code-assist · 2026-01-28T04:19:42Z

src/app/v1/_lib/proxy/response-handler.ts

+  if (Array.isArray(candidatesDetails) && candidatesDetails.length > 0) {
+    let imageTokens = 0;
+    let textTokens = 0;
+    let hasValidToken = false;
+    for (const detail of candidatesDetails) {
+      if (typeof detail.tokenCount === "number" && detail.tokenCount > 0) {
+        hasValidToken = true;
+        const modalityUpper = detail.modality?.toUpperCase();
+        if (modalityUpper === "IMAGE") {
+          imageTokens += detail.tokenCount;
+        } else {
+          textTokens += detail.tokenCount;
+        }
+      }
+    }
+    if (imageTokens > 0) {
+      result.output_image_tokens = imageTokens;
+      hasAny = true;
+    }
+    if (hasValidToken) {
+      // Compute unclassified TEXT tokens: candidatesTokenCount - sum of details
+      // These may be internal overhead of image generation, billed at the TEXT price
+      const detailsSum = imageTokens + textTokens;
+      const candidatesTotal =
+        typeof usage.candidatesTokenCount === "number" ? usage.candidatesTokenCount : 0;
+      const unaccountedTokens = Math.max(candidatesTotal - detailsSum, 0);
+      result.output_tokens = textTokens + unaccountedTokens;
+      hasAny = true;
+    }
+  }


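The output_tokens computed here from candidatesTokensDetails (line 1325) is correct. However, if the usage object also contains an output_tokens field, the existing logic at line 1359 unconditionally overwrites the value computed here. This is a bug that leads to incorrect billing. The logic at line 1359 should run only when candidatesTokensDetails is absent or invalid. Please adjust the code to fix this precedence issue.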

gemini-code-assist · 2026-01-28T04:19:42Z

src/app/v1/_lib/proxy/response-handler.ts

+  if (Array.isArray(candidatesDetails) && candidatesDetails.length > 0) {
+    let imageTokens = 0;
+    let textTokens = 0;
+    let hasValidToken = false;
+    for (const detail of candidatesDetails) {
+      if (typeof detail.tokenCount === "number" && detail.tokenCount > 0) {
+        hasValidToken = true;
+        const modalityUpper = detail.modality?.toUpperCase();
+        if (modalityUpper === "IMAGE") {
+          imageTokens += detail.tokenCount;
+        } else {
+          textTokens += detail.tokenCount;
+        }
+      }
+    }
+    if (imageTokens > 0) {
+      result.output_image_tokens = imageTokens;
+      hasAny = true;
+    }
+    if (hasValidToken) {
+      // Compute unclassified TEXT tokens: candidatesTokenCount - sum of details
+      // These may be internal overhead of image generation, billed at the TEXT price
+      const detailsSum = imageTokens + textTokens;
+      const candidatesTotal =
+        typeof usage.candidatesTokenCount === "number" ? usage.candidatesTokenCount : 0;
+      const unaccountedTokens = Math.max(candidatesTotal - detailsSum, 0);
+      result.output_tokens = textTokens + unaccountedTokens;
+      hasAny = true;
+    }
+  }
+
+  // promptTokensDetails: input tokens classified by modality
+  const promptDetails = usage.promptTokensDetails as
+    | Array<{ modality?: string; tokenCount?: number }>
+    | undefined;
+  if (Array.isArray(promptDetails) && promptDetails.length > 0) {
+    let imageTokens = 0;
+    let textTokens = 0;
+    let hasValidToken = false;
+    for (const detail of promptDetails) {
+      if (typeof detail.tokenCount === "number" && detail.tokenCount > 0) {
+        hasValidToken = true;
+        const modalityUpper = detail.modality?.toUpperCase();
+        if (modalityUpper === "IMAGE") {
+          imageTokens += detail.tokenCount;
+        } else {
+          textTokens += detail.tokenCount;
+        }
+      }
+    }
+    if (imageTokens > 0) {
+      result.input_image_tokens = imageTokens;
+      hasAny = true;
+    }
+    if (hasValidToken) {
+      result.input_tokens = textTokens;
+      hasAny = true;
+    }
+  }


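The logic handling candidatesTokensDetails (lines 1299-1328) and promptTokensDetails (lines 1334-1357) is very similar and duplicates code, which increases future maintenance cost. Consider extracting the shared logic that iterates over the details array and accumulates imageTokens and textTokens into a standalone helper function.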
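One possible shape for such a helper is sketched below. The name `sumTokensByModality` and the `ModalityTotals` type are hypothetical and do not appear in the PR; the loop body mirrors the two duplicated blocks quoted above.

```typescript
interface ModalityDetail {
  modality?: string;
  tokenCount?: number | null;
}

interface ModalityTotals {
  imageTokens: number;
  textTokens: number;
  hasValidToken: boolean;
}

// Hypothetical shared helper for candidatesTokensDetails and promptTokensDetails.
function sumTokensByModality(details: unknown): ModalityTotals {
  const totals: ModalityTotals = { imageTokens: 0, textTokens: 0, hasValidToken: false };
  if (!Array.isArray(details)) return totals;

  for (const detail of details as ModalityDetail[]) {
    const count = detail.tokenCount;
    // Ignore missing, null, zero, or negative counts, as the original loops do.
    if (typeof count !== "number" || count <= 0) continue;
    totals.hasValidToken = true;
    if (detail.modality?.toUpperCase() === "IMAGE") totals.imageTokens += count;
    else totals.textTokens += count;
  }
  return totals;
}

// Both call sites then reduce to one line each.
const sampleUsage = {
  candidatesTokensDetails: [
    { modality: "image", tokenCount: 2000 },
    { modality: "Image", tokenCount: 100 },
    { modality: "TEXT", tokenCount: 240 },
  ],
  promptTokensDetails: [{ modality: "TEXT", tokenCount: null }],
};
const outputTotals = sumTokensByModality(sampleUsage.candidatesTokensDetails);
const inputTotals = sumTokensByModality(sampleUsage.promptTokensDetails);
console.log(outputTotals, inputTotals);
```

Returning `hasValidToken` alongside the sums keeps the existing guard available to callers, so they can still decide whether to overwrite `input_tokens` or `output_tokens`.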

gemini-code-assist · 2026-01-28T04:19:42Z

tests/unit/lib/cost-calculation-image-tokens.test.ts

+  test("complete Gemini image response billing example", () => {
+    const cost = calculateRequestCost(
+      {
+        input_tokens: 326,
+        output_tokens: 340,
+        output_image_tokens: 2000,
+      },
+      {
+        input_cost_per_token: 0.000002,
+        output_cost_per_token: 0.000012,
+        output_cost_per_image_token: 0.00012,
+      }
+    );
+
+    // Verified against Google's official pricing
+    // input: 326 * $0.000002 = $0.000652
+    // output text: 340 * $0.000012 = $0.00408
+    // output image: 2000 * $0.00012 = $0.24 (4K image = 2000 tokens)
+    // total: $0.244732
+    expect(cost.toNumber()).toBeCloseTo(0.244732, 6);
+  });


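This test case, "complete Gemini image response billing example", is identical to the earlier "mixed response: text + image tokens should be billed separately" (lines 55-74). It is unnecessary duplication; please remove the redundant test case.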

github-project-automation bot added this to Claude Code Hub Roadmap Jan 28, 2026

github-project-automation bot moved this to Backlog in Claude Code Hub Roadmap Jan 28, 2026

github-actions bot added bug Something isn't working area:Google Gemini area:core size/M Medium PR (< 500 lines) labels Jan 28, 2026

greptile-apps bot reviewed Jan 28, 2026

View reviewed changes

github-actions bot reviewed Jan 28, 2026

View reviewed changes

coderabbitai bot reviewed Jan 28, 2026

View reviewed changes

coderabbitai bot approved these changes Jan 28, 2026

View reviewed changes

gemini-code-assist bot reviewed Jan 28, 2026

View reviewed changes

ding113 merged commit 704d00a into ding113:dev Jan 28, 2026
17 of 19 checks passed

github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Jan 28, 2026

This was referenced Jan 28, 2026

release v0.5.2 #672

Closed

release v0.5.2 #675

Merged

fix(billing): use last-wins for Gemini SSE usageMetadata extraction #691

Merged
