Fix/gemini thoughts token support #327
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add the thoughtsTokenCount field to the GeminiUsageMetadata type
- Add a GeminiTokenDetail type to support token details broken down by modality
- Accumulate thoughtsTokenCount into output_tokens for billing (Gemini thinking tokens are priced the same as output tokens)
- Add support for the promptTokensDetails, cacheTokensDetails, and candidatesTokensDetails fields

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary of Changes

Hello @sususu98, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the system's capability to handle and report token usage for Gemini models. It introduces new types to support detailed token breakdowns, including specific counts for internal reasoning ("thoughts") and different modalities. The changes ensure that all tokens consumed by Gemini are accurately reflected in the overall usage metrics, providing a more precise understanding of model costs and performance.
Code Review Summary
This PR adds support for Gemini's thoughtsTokenCount field (thinking/reasoning tokens) to the usage metrics extraction, along with type updates for Gemini usage metadata. The changes are straightforward and correctly implemented.
PR Size: XS
- Lines changed: 39 (38 additions, 1 deletion)
- Files changed: 3
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 0 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Analysis Notes
Types (src/app/v1/_lib/gemini/types.ts):
- Added `GeminiTokenDetail` interface for modality-specific token details
- Extended `GeminiUsageMetadata` with `thoughtsTokenCount` and detailed token breakdown arrays
- Note: `GeminiTokenDetail` interface is currently unused in the codebase (only the type definition exists). This appears to be forward-looking infrastructure for future Gemini features. While not actively used, it doesn't cause issues and documents the expected API shape.
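For orientation, the type additions described above might look roughly like the sketch below. This is not the PR's actual code: the `modality` and `tokenCount` property names, the non-PR fields (`promptTokenCount`, `candidatesTokenCount`, `totalTokenCount`), and the `sumByModality` helper are assumptions added for illustration.

```typescript
// Hypothetical sketch of the extended types; not copied from the PR diff.
// Property names inside GeminiTokenDetail are assumed for illustration.
interface GeminiTokenDetail {
  modality: string; // assumed: a modality label such as "TEXT" or "IMAGE"
  tokenCount: number; // assumed: tokens attributed to that modality
}

interface GeminiUsageMetadata {
  promptTokenCount?: number; // assumed pre-existing field
  candidatesTokenCount?: number; // assumed pre-existing field
  totalTokenCount?: number; // assumed pre-existing field
  thoughtsTokenCount?: number; // reasoning tokens (named in this PR)
  promptTokensDetails?: GeminiTokenDetail[]; // named in this PR
  cacheTokensDetails?: GeminiTokenDetail[]; // named in this PR
  candidatesTokensDetails?: GeminiTokenDetail[]; // named in this PR
}

// Hypothetical helper: total the token counts for each modality.
function sumByModality(details: GeminiTokenDetail[] = []): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const detail of details) {
    totals[detail.modality] = (totals[detail.modality] ?? 0) + detail.tokenCount;
  }
  return totals;
}
```

A helper along these lines is one way the detail arrays could eventually be consumed; the PR itself only adds the type definitions.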
Usage Extraction (src/app/v1/_lib/proxy/response-handler.ts:1215-1223):
- Correctly extracts `thoughtsTokenCount` and adds it to `output_tokens`
- Placement after the `output_tokens` assignment ensures proper accumulation
- The conditional `> 0` check is appropriate to avoid unnecessary operations
- Comments accurately describe the behavior and rationale
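The accumulation behavior noted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `response-handler.ts`: the function name `extractGeminiUsageSketch`, both interface names, and the mapping of `promptTokenCount` and `candidatesTokenCount` to `input_tokens` and `output_tokens` are hypothetical.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface UsageMetadataSketch {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  thoughtsTokenCount?: number;
}

interface UsageMetricsSketch {
  input_tokens?: number;
  output_tokens?: number;
}

function extractGeminiUsageSketch(meta: UsageMetadataSketch): UsageMetricsSketch {
  const result: UsageMetricsSketch = {};

  if (typeof meta.promptTokenCount === "number") {
    result.input_tokens = meta.promptTokenCount;
  }
  if (typeof meta.candidatesTokenCount === "number") {
    result.output_tokens = meta.candidatesTokenCount;
  }

  // Runs after the output_tokens assignment so the sum is not overwritten.
  // Thinking tokens are billed at the output rate, so they are folded in here.
  if (typeof meta.thoughtsTokenCount === "number" && meta.thoughtsTokenCount > 0) {
    result.output_tokens = (result.output_tokens ?? 0) + meta.thoughtsTokenCount;
  }

  return result;
}

// Example: 40 candidate tokens plus 60 thinking tokens are billed as 100 output tokens.
const billed = extractGeminiUsageSketch({
  promptTokenCount: 100,
  candidatesTokenCount: 40,
  thoughtsTokenCount: 60,
});
console.log(billed.output_tokens); // 100
```

Folding the count into `output_tokens` keeps downstream cost calculation unchanged, at the price of no longer distinguishing thinking tokens from visible output in the stored metrics.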
CHANGELOG.md:
- Contains v0.3.28 release notes which appear unrelated to this PR's specific feature (Gemini thoughts token support). The CHANGELOG entry for this feature may need to be added separately.
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - No tests added (acceptable for this simple change)
- Code clarity - Good
Automated review by Claude AI
Summary
Add support for Gemini thinking/reasoning token billing by extracting `thoughtsTokenCount` from Gemini API responses and including it in cost calculations.

Problem

Gemini thinking models (e.g., gemini-2.5-flash, gemini-2.5-pro) return a `thoughtsTokenCount` field in their responses representing reasoning tokens consumed during inference. These tokens were previously ignored, resulting in incomplete billing for thinking model usage.

Since thinking token pricing is the same as output token pricing (as defined by LiteLLM's `output_cost_per_reasoning_token`), the simplest approach is to accumulate thinking tokens into `output_tokens` for unified billing.

Related Issues:
Related PRs:
Solution
Extended `GeminiUsageMetadata` type (`src/app/v1/_lib/gemini/types.ts`):
- `thoughtsTokenCount` field for reasoning tokens
- `GeminiTokenDetail` type for modality-based token breakdown
- `promptTokensDetails`, `cacheTokensDetails`, `candidatesTokensDetails` fields

Updated `extractUsageMetrics` function (`src/app/v1/_lib/proxy/response-handler.ts`):
- Reads `thoughtsTokenCount` in Gemini API responses
- Adds it to `output_tokens` (same pricing tier)
- Runs after the `output_tokens` assignment to avoid overwriting

Changes
Core Changes
- `src/app/v1/_lib/gemini/types.ts`: Extended type definitions for complete Gemini usage metadata support
- `src/app/v1/_lib/proxy/response-handler.ts`: Added thinking token extraction and accumulation logic

Supporting Changes
- `CHANGELOG.md`: Updated for v0.3.28 release notes

Testing
Manual Testing
- … `thoughtsTokenCount`) bill correctly
- … `thoughtsTokenCount` into output tokens

Checklist
Description enhanced by Claude AI