fix(billing): Fix double-billing of Gemini cached tokens #338
Conversation
Gemini API's promptTokenCount includes cachedContentTokenCount,
so the cached portion must be subtracted when computing the input cost to avoid double billing.

Before the fix:
input = promptTokenCount × input_price (cached tokens included)
cache = cachedContentTokenCount × cache_price
Result: cached tokens are billed twice

After the fix:
input = (promptTokenCount - cachedContentTokenCount) × input_price
cache = cachedContentTokenCount × cache_price
Result: billed correctly
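As a worked example of the two formulas, here is a small TypeScript sketch using the token counts cited later in this PR and placeholder prices (the prices are invented for illustration and are not real Gemini rates):

```typescript
// Token counts are taken from the PR description below; prices are placeholders.
const promptTokenCount = 696219;
const cachedContentTokenCount = 696190;
const inputPrice = 1.0e-6;  // $ per input token (placeholder)
const cachePrice = 0.25e-6; // $ per cached token (placeholder)

// Before the fix: the full promptTokenCount is charged at the input price,
// so cached tokens are paid for twice (once here, once at the cache price).
const inputCostBefore = promptTokenCount * inputPrice;

// After the fix: only the non-cached portion is charged at the input price.
const inputCostAfter = (promptTokenCount - cachedContentTokenCount) * inputPrice;

// The cache charge is the same in both cases.
const cacheCost = cachedContentTokenCount * cachePrice;

console.log({ inputCostBefore, inputCostAfter, cacheCost });
```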
Summary of Changes

Hello @sususu98, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request corrects a billing error in Gemini API usage accounting: previously, the system incorrectly counted cached tokens toward the billed input.
Code Review Summary
No significant issues identified in this PR.
PR Size: XS
- Lines changed: 7
- Files changed: 1
Analysis
The fix correctly addresses the double-billing issue for Gemini cached tokens. The PR description accurately explains the problem:
- Problem: Gemini's `promptTokenCount` includes `cachedContentTokenCount`, causing cached tokens to be billed twice:
  - Once as input tokens (at full input price)
  - Again as cache read tokens (at cache price)
- Solution: The fix subtracts `cachedContentTokenCount` from `promptTokenCount` before assigning to `input_tokens`:
  `result.input_tokens = Math.max(usage.promptTokenCount - cachedTokens, 0);`
- Mathematical verification (from PR description):
  - `promptTokenCount = 696219`
  - `cachedContentTokenCount = 696190`
  - Fixed `input_tokens = 696219 - 696190 = 29` (only non-cached tokens)
  - `cache_read_input_tokens = 696190` (cached tokens, billed separately)
Verified

- ✅ The fix is in the correct location (the `extractUsageMetrics` function, which feeds into `calculateRequestCost`); see the sketch below
- ✅ The `Math.max(..., 0)` guard prevents negative values if data is inconsistent
- ✅ Type checking with `typeof usage.cachedContentTokenCount === "number"` handles undefined/null gracefully
- ✅ The comment explains the billing formula clearly
- ✅ The GeminiAdapter transformation (used for client responses, not billing) is separate and unaffected
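Taken together, these points suggest parsing logic roughly like the following sketch (function and type names are illustrative assumptions based on the review, not the repository's actual `extractUsageMetrics`):

```typescript
// Illustrative shapes; the real response-handler.ts types may differ.
interface GeminiUsageMetadata {
  promptTokenCount: number;
  cachedContentTokenCount?: number;
  candidatesTokenCount?: number;
}

interface UsageMetrics {
  input_tokens: number;
  cache_read_input_tokens: number;
  output_tokens: number;
}

function extractGeminiUsage(usage: GeminiUsageMetadata): UsageMetrics {
  // Gemini reports cached tokens as part of promptTokenCount, so they must be
  // split out before billing; the typeof check tolerates a missing field.
  const cachedTokens =
    typeof usage.cachedContentTokenCount === "number"
      ? usage.cachedContentTokenCount
      : 0;

  return {
    // Only the non-cached portion is billed at the input price; Math.max
    // guards against negative values if the reported counts are inconsistent.
    input_tokens: Math.max(usage.promptTokenCount - cachedTokens, 0),
    cache_read_input_tokens: cachedTokens,
    output_tokens: usage.candidatesTokenCount ?? 0,
  };
}
```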
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - No unit tests for this function (pre-existing gap)
- Code clarity - Good
Automated review by Claude AI
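The coverage notes flag the missing unit tests for this function as a pre-existing gap. A minimal sketch of such a test, assuming a Vitest-style runner and the illustrative `extractGeminiUsage` helper from the sketch above (the import path is hypothetical):

```typescript
import { describe, expect, it } from "vitest";
// Hypothetical module path for the helper sketched above.
import { extractGeminiUsage } from "./gemini-usage";

describe("Gemini cached-token billing", () => {
  it("subtracts cached tokens from promptTokenCount", () => {
    const metrics = extractGeminiUsage({
      promptTokenCount: 696219,
      cachedContentTokenCount: 696190,
      candidatesTokenCount: 214,
    });
    expect(metrics.input_tokens).toBe(29);
    expect(metrics.cache_read_input_tokens).toBe(696190);
  });

  it("never produces negative input_tokens on inconsistent counts", () => {
    const metrics = extractGeminiUsage({
      promptTokenCount: 10,
      cachedContentTokenCount: 50,
    });
    expect(metrics.input_tokens).toBe(0);
  });
});
```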
Problem Description

Gemini API's `promptTokenCount` includes `cachedContentTokenCount`, but the original code used `promptTokenCount` directly as `input_tokens`, so cache-hit tokens were billed twice.

Related Issues:
Fix

When parsing Gemini usage, subtract `cachedContentTokenCount` directly from `promptTokenCount`, as in the line quoted in the review above: `result.input_tokens = Math.max(usage.promptTokenCount - cachedTokens, 0);`
Data Verification

According to the Gemini official API response example:
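An illustrative `usageMetadata` fragment consistent with the numbers used in the arithmetic check below (field names follow the Gemini API; the exact example response is assumed):

```typescript
// Assumed example values; they match the arithmetic check that follows.
const usageMetadata = {
  promptTokenCount: 696219,        // includes the cached tokens
  cachedContentTokenCount: 696190, // tokens served from the context cache
  candidatesTokenCount: 214,       // output tokens
  totalTokenCount: 696433,         // 696219 + 214
};
```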
Arithmetic check:

- `696219 + 214 = 696433` ✅ (promptTokenCount + candidatesTokenCount = totalTokenCount, so cached tokens are not counted separately)

This proves that `promptTokenCount` indeed includes `cachedContentTokenCount`. After the fix:

- `input_tokens = 696219 - 696190 = 29`
- `cache_read_input_tokens = 696190`

Changes
Core Changes
- `src/app/v1/_lib/proxy/response-handler.ts`: subtract `cachedContentTokenCount` in the Gemini usage parsing to avoid double billing

Tests

- `Math.max(..., 0)` ensures the result is non-negative

Checklist
Description enhanced by Claude AI