Fix/gemini thoughts token support #327

Merged
ding113 merged 4 commits into ding113:dev from sususu98:fix/gemini-thoughts-token-support
Dec 12, 2025

sususu98 (Contributor) commented Dec 12, 2025

Summary

Add support for Gemini thinking/reasoning token billing by extracting thoughtsTokenCount from Gemini API responses and including it in cost calculations.

Problem

Gemini thinking models (e.g., gemini-2.5-flash, gemini-2.5-pro) return a thoughtsTokenCount field in their responses representing reasoning tokens consumed during inference. These tokens were previously ignored, resulting in incomplete billing for thinking model usage.

Since thinking token pricing is the same as output token pricing (as defined by LiteLLM's output_cost_per_reasoning_token), the simplest approach is to accumulate thinking tokens into output_tokens for unified billing.
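
To make the billing gap concrete, here is a minimal sketch of the arithmetic, assuming reasoning tokens are charged at the output rate as stated above. The `TokenPrices` and `GeminiCounts` shapes, both function names, and the integer prices are hypothetical and chosen only to keep the numbers exact; they are not taken from the project or from a real price list.

```typescript
// Hypothetical price shape; the field names are illustrative, not the project's schema.
interface TokenPrices {
  inputPerToken: number;
  outputPerToken: number; // also applied to reasoning tokens, per the PR's premise
}

// Token counts as reported in a Gemini usageMetadata object.
interface GeminiCounts {
  promptTokenCount: number;
  candidatesTokenCount: number;
  thoughtsTokenCount?: number; // absent on non-thinking responses
}

// Cost when thinking tokens are ignored (the incomplete billing described above).
function costIgnoringThoughts(c: GeminiCounts, p: TokenPrices): number {
  return c.promptTokenCount * p.inputPerToken + c.candidatesTokenCount * p.outputPerToken;
}

// Cost when thinking tokens are folded into the billed output tokens.
function costWithThoughts(c: GeminiCounts, p: TokenPrices): number {
  const billedOutput = c.candidatesTokenCount + (c.thoughtsTokenCount ?? 0);
  return c.promptTokenCount * p.inputPerToken + billedOutput * p.outputPerToken;
}

// Made-up integer prices so the arithmetic is exact.
const prices: TokenPrices = { inputPerToken: 1, outputPerToken: 4 };
const counts: GeminiCounts = {
  promptTokenCount: 100,
  candidatesTokenCount: 50,
  thoughtsTokenCount: 200,
};

console.log(costIgnoringThoughts(counts, prices)); // 100*1 + 50*4 = 300
console.log(costWithThoughts(counts, prices)); // 100*1 + (50 + 200)*4 = 1100
```

In this made-up case the request would be under-billed by the full cost of the 200 reasoning tokens.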

Related Issues:

Related PRs:

Solution

  1. Extended GeminiUsageMetadata type (src/app/v1/_lib/gemini/types.ts):

    • Added thoughtsTokenCount field for reasoning tokens
    • Added GeminiTokenDetail type for modality-based token breakdown
    • Added promptTokensDetails, cacheTokensDetails, candidatesTokensDetails fields
  2. Updated extractUsageMetrics function (src/app/v1/_lib/proxy/response-handler.ts):

    • Detect thoughtsTokenCount in Gemini API responses
    • Accumulate thinking tokens into output_tokens (same pricing tier)
    • Placed after output_tokens assignment to avoid overwriting
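
The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `extractGeminiUsage` and `UsageMetrics` are hypothetical names, and the `modality`/`tokenCount` fields of `GeminiTokenDetail` are assumed from the public Gemini API response shape rather than copied from the diff.

```typescript
// Per-modality token breakdown entry (field names assumed from the Gemini API shape).
interface GeminiTokenDetail {
  modality: string; // e.g. "TEXT", "IMAGE", "AUDIO", "VIDEO"
  tokenCount: number;
}

// Usage metadata with the fields this PR describes adding.
interface GeminiUsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  totalTokenCount?: number;
  thoughtsTokenCount?: number; // reasoning tokens from thinking models
  promptTokensDetails?: GeminiTokenDetail[];
  cacheTokensDetails?: GeminiTokenDetail[];
  candidatesTokensDetails?: GeminiTokenDetail[];
}

// Hypothetical normalized usage record used for billing.
interface UsageMetrics {
  input_tokens?: number;
  output_tokens?: number;
}

// Hypothetical stand-in for the Gemini branch of extractUsageMetrics.
function extractGeminiUsage(meta: GeminiUsageMetadata): UsageMetrics {
  const result: UsageMetrics = {};

  if (typeof meta.promptTokenCount === "number") {
    result.input_tokens = meta.promptTokenCount;
  }
  if (typeof meta.candidatesTokenCount === "number") {
    result.output_tokens = meta.candidatesTokenCount;
  }

  // Runs after the output_tokens assignment so the sum is not overwritten.
  if (typeof meta.thoughtsTokenCount === "number" && meta.thoughtsTokenCount > 0) {
    result.output_tokens = (result.output_tokens ?? 0) + meta.thoughtsTokenCount;
  }

  return result;
}

const thinking = extractGeminiUsage({
  promptTokenCount: 120,
  candidatesTokenCount: 40,
  thoughtsTokenCount: 300,
});
console.log(thinking); // { input_tokens: 120, output_tokens: 340 }
```

Ordering is the point of the sketch: if the accumulation ran before the `candidatesTokenCount` assignment, the plain assignment would discard the thinking tokens.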

Changes

Core Changes

  • src/app/v1/_lib/gemini/types.ts: Extended type definitions for complete Gemini usage metadata support
  • src/app/v1/_lib/proxy/response-handler.ts: Added thinking token extraction and accumulation logic

Supporting Changes

  • CHANGELOG.md: Updated for v0.3.28 release notes

Testing

Manual Testing

  • Verify Gemini non-thinking model requests (no thoughtsTokenCount) bill correctly
  • Verify Gemini thinking model requests (gemini-2.5-flash, gemini-2.5-pro) correctly accumulate thoughtsTokenCount into output tokens
  • Verify cost calculations reflect thinking token usage

Checklist

  • Code follows project conventions
  • Self-review completed
  • Tests pass locally
  • Documentation updated (CHANGELOG.md)

Description enhanced by Claude AI

github-actions bot and others added 4 commits December 10, 2025 12:03
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add the thoughtsTokenCount field to the GeminiUsageMetadata type
- Add the GeminiTokenDetail type for per-modality token details
- Accumulate thoughtsTokenCount into output_tokens for billing
  (Gemini thinking tokens are priced the same as output tokens)
- Add support for the promptTokensDetails, cacheTokensDetails, and
  candidatesTokensDetails fields

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @sususu98, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the system's capability to handle and report token usage for Gemini models. It introduces new types to support detailed token breakdowns, including specific counts for internal reasoning ("thoughts") and different modalities. The changes ensure that all tokens consumed by Gemini are accurately reflected in the overall usage metrics, providing a more precise understanding of model costs and performance.

Highlights

  • Gemini Token Tracking Enhancement: Introduced new data structures (GeminiTokenDetail) and extended GeminiUsageMetadata to capture more granular token usage from Gemini models, including thoughtsTokenCount and detailed breakdowns by modality (text, image, audio, video).
  • Accurate Usage Metrics for Gemini: Updated the extractUsageMetrics function to correctly incorporate Gemini's thoughtsTokenCount into the output_tokens calculation, ensuring comprehensive and accurate reporting of token consumption for billing and analytics.
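
As an illustration of how a per-modality breakdown could be consumed, the sketch below totals token counts by modality. The PR only adds the type definitions, so `sumByModality` is a hypothetical helper, and the `modality`/`tokenCount` field names and sample values are assumptions based on the public Gemini API response shape.

```typescript
// Per-modality token breakdown entry (assumed shape, see the note above).
interface GeminiTokenDetail {
  modality: string; // e.g. "TEXT", "IMAGE", "AUDIO", "VIDEO"
  tokenCount: number;
}

// Hypothetical helper: total token counts per modality.
// A missing details array is treated as an empty breakdown.
function sumByModality(details: GeminiTokenDetail[] | undefined): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const detail of details ?? []) {
    totals[detail.modality] = (totals[detail.modality] ?? 0) + detail.tokenCount;
  }
  return totals;
}

// Made-up sample values for illustration only.
const promptBreakdown = sumByModality([
  { modality: "TEXT", tokenCount: 10 },
  { modality: "IMAGE", tokenCount: 258 },
  { modality: "TEXT", tokenCount: 5 },
]);
console.log(promptBreakdown); // { TEXT: 15, IMAGE: 258 }
```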

gemini-code-assist bot (Contributor) left a comment

Code Review

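This PR mainly adds support for Gemini's thoughtsTokenCount, counting it toward the total output tokens so that costs are calculated correctly. The related TypeScript type definitions have also been updated to include this field and the other new metadata fields. The implementation in response-handler.ts is clear and accumulates the tokens correctly.

One minor suggestion concerns the CHANGELOG.md update, which appears to be missing an entry for this feature change. See the file comment for details.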

github-actions bot added the labels bug (Something isn't working) and size/XS (Extra Small PR, < 50 lines) on Dec 12, 2025
github-actions bot (Contributor) left a comment

Code Review Summary

This PR adds support for Gemini's thoughtsTokenCount field (thinking/reasoning tokens) to the usage metrics extraction, along with type updates for Gemini usage metadata. The changes are straightforward and correctly implemented.

PR Size: XS

  • Lines changed: 39 (38 additions, 1 deletion)
  • Files changed: 3

Issues Found

Category         Critical   High   Medium   Low
Logic/Bugs       0          0      0        0
Security         0          0      0        0
Error Handling   0          0      0        0
Types            0          0      0        0
Comments/Docs    0          0      0        0
Tests            0          0      0        0
Simplification   0          0      0        0

Analysis Notes

Types (src/app/v1/_lib/gemini/types.ts):

  • Added GeminiTokenDetail interface for modality-specific token details
  • Extended GeminiUsageMetadata with thoughtsTokenCount and detailed token breakdown arrays
  • Note: GeminiTokenDetail interface is currently unused in the codebase (only the type definition exists). This appears to be forward-looking infrastructure for future Gemini features. While not actively used, it doesn't cause issues and documents the expected API shape.

Usage Extraction (src/app/v1/_lib/proxy/response-handler.ts:1215-1223):

  • Correctly extracts thoughtsTokenCount and adds it to output_tokens
  • Placement after the output_tokens assignment ensures proper accumulation
  • The conditional > 0 check is appropriate to avoid unnecessary operations
  • Comments accurately describe the behavior and rationale

CHANGELOG.md:

  • Contains v0.3.28 release notes which appear unrelated to this PR's specific feature (Gemini thoughts token support). The CHANGELOG entry for this feature may need to be added separately.

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean
  • Error handling - Clean
  • Type safety - Clean
  • Documentation accuracy - Clean
  • Test coverage - No tests added (acceptable for this simple change)
  • Code clarity - Good

Automated review by Claude AI

Repository owner deleted a comment from gemini-code-assist bot Dec 12, 2025
@ding113 ding113 merged commit aa0c55d into ding113:dev Dec 12, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Dec 12, 2025