fix(proxy): add OpenAI chat completion format support in usage extraction#716
fix(proxy): add OpenAI chat completion format support in usage extraction#716
Conversation
…tion (#705) The `extractUsageMetrics` function was missing support for OpenAI chat completion format fields (`prompt_tokens`/`completion_tokens`), causing token statistics to not be recorded for OpenAI-compatible providers. Changes: - Add `prompt_tokens` -> `input_tokens` mapping - Add `completion_tokens` -> `output_tokens` mapping - Preserve priority: Claude > Gemini > OpenAI format - Add 5 unit tests for OpenAI format handling Closes #705 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary of ChangesHello @ding113, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug where usage metrics from OpenAI chat completion responses were not being properly recorded. It enhances the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
📝 WalkthroughWalkthrough本次变更为响应处理器添加了兼容性映射,支持OpenAI/Claude风格的token字段(prompt_tokens/completion_tokens映射到input_tokens/output_tokens),并增加了相应的测试覆盖以验证跨provider的token使用统计功能。 Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request correctly adds support for extracting usage metrics from OpenAI-compatible chat completion responses, including prompt_tokens and completion_tokens. The changes follow the existing structure and the new unit tests are comprehensive, covering various scenarios like format priority and streaming responses.
I've identified one related issue regarding priority handling for Gemini tokens that you may want to address for consistency. Overall, this is a good fix that improves provider compatibility.
🧪 测试结果
总体结果: ✅ 所有测试通过 |
Additional Comments (2)
Prompt To Fix With AIThis is a comment left during a code review.
Path: src/app/v1/_lib/proxy/response-handler.ts
Line: 1186:1191
Comment:
Pre-existing issue (not introduced by this PR): Gemini's `promptTokenCount` doesn't check if `result.input_tokens` is already defined, which could override Claude's `input_tokens` if both formats are present. OpenAI format correctly checks `result.input_tokens === undefined` on line 1199. Consider adding the same check here for consistency:
```suggestion
if (result.input_tokens === undefined && typeof usage.promptTokenCount === "number") {
const cachedTokens =
typeof usage.cachedContentTokenCount === "number" ? usage.cachedContentTokenCount : 0;
result.input_tokens = Math.max(usage.promptTokenCount - cachedTokens, 0);
hasAny = true;
}
```
How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix With AIThis is a comment left during a code review.
Path: src/app/v1/_lib/proxy/response-handler.ts
Line: 1192:1195
Comment:
Pre-existing issue (not introduced by this PR): `candidatesTokenCount` should check if `result.output_tokens` is already defined to maintain stated priority. OpenAI format correctly implements this on line 1290.
```suggestion
if (result.output_tokens === undefined && typeof usage.candidatesTokenCount === "number") {
result.output_tokens = usage.candidatesTokenCount;
hasAny = true;
}
```
How can I resolve this? If you propose a fix, please make it concise. |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/app/v1/_lib/proxy/response-handler.ts (1)
1186-1202:⚠️ Potential issue | 🟠 MajorGemini 的 promptTokenCount 可能覆盖 Claude 的 input_tokens,优先级未被真正执行。
当前逻辑在存在
usage.input_tokens时仍会被promptTokenCount覆盖,和注释中“Claude > Gemini > OpenAI”的优先级不一致,可能导致混合/转发场景下计费偏差。建议仅在input_tokens未设置时才使用 Gemini 的计数。Proposed fix
- if (typeof usage.promptTokenCount === "number") { + if (result.input_tokens === undefined && typeof usage.promptTokenCount === "number") { const cachedTokens = typeof usage.cachedContentTokenCount === "number" ? usage.cachedContentTokenCount : 0; result.input_tokens = Math.max(usage.promptTokenCount - cachedTokens, 0); hasAny = true; }
There was a problem hiding this comment.
Code Review Summary
No significant issues identified in this PR. Review performed via static inspection of the diff; automated test/build execution was not run in this runner.
PR Size: S
- Lines changed: 96
- Files changed: 2
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - Adequate
- Code clarity - Good
Automated review by Codex AI
There was a problem hiding this comment.
- PR
#716: applied labelsize/S(96 changed lines across 2 files). - Reviewed diff in
src/app/v1/_lib/proxy/response-handler.tsandtests/unit/proxy/extract-usage-metrics.test.ts; no issues in modified/added lines met the reporting threshold, so no inline comments were posted. - Submitted a summary review to the PR via
gh pr review --comment. - Note: automated test/build execution wasn’t run in this runner (
bunnot present andnode_modulesmissing); if you want a verification run, usebun install && bun run test.
* fix(proxy): add 'cannot be modified' error detection to thinking signature rectifier Extend the thinking signature rectifier to detect and handle the Anthropic API error when thinking/redacted_thinking blocks have been modified from their original response. This error occurs when clients inadvertently modify these blocks in multi-turn conversations. The rectifier will now remove these blocks and retry the request, similar to how it handles other thinking-related signature errors. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore(deps): bump jspdf in the npm_and_yarn group across 1 directory Bumps the npm_and_yarn group with 1 update in the / directory: [jspdf](https://github.com/parallax/jsPDF). Updates `jspdf` from 3.0.4 to 4.1.0 - [Release notes](https://github.com/parallax/jsPDF/releases) - [Changelog](https://github.com/parallax/jsPDF/blob/master/RELEASE.md) - [Commits](parallax/jsPDF@v3.0.4...v4.1.0) --- updated-dependencies: - dependency-name: jspdf dependency-version: 4.1.0 dependency-type: direct:production dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> * fix: Hot-reload cache invalidation for Request Filters and Sensitive Words (#710) * fix: hot-reload request filters via globalThis singleton pattern EventEmitter and RequestFilterEngine now use globalThis caching to ensure the same instance is shared across different Next.js worker contexts. This fixes the issue where filter changes required Docker restart. Added diagnostic logging for event subscription and propagation. * fix(redis): wait for subscriber connection ready before subscribe - ensureSubscriber now returns Promise<Redis>, waits for 'ready' event - subscribeCacheInvalidation returns null on failure instead of noop - RequestFilterEngine checks cleanup !== null before logging success - Fixes false "Subscribed" log when Redis connection fails * feat(sensitive-words): add hot-reload via Redis pub/sub Enable real-time cache invalidation for sensitive words detector, matching the pattern used by request-filter-engine and error-rule-detector. * fix(redis): harden cache invalidation subscriptions Ensure sensitive-words CRUD emits update events so hot-reload propagates across workers. Roll back failed pub/sub subscriptions, add retry/timeout coverage, and avoid sticky provider-cache subscription state. * fix(codex): bump default User-Agent fallback Update the hardcoded Codex UA used when requests lack an effective user-agent (e.g. filtered out). Keep unit tests in sync with the new default. * fix(redis): resubscribe cache invalidation after reconnect Clear cached subscription state on disconnect and resubscribe on ready so cross-worker cache invalidation survives transient Redis reconnects. Add unit coverage, avoid misleading publish logs, track detector cleanup handlers, and translate leftover Russian comments to English. * fix(sensitive-words): use globalThis singleton detector Align SensitiveWordDetector with existing __CCH_* singleton pattern to avoid duplicate instances across module reloads. Extend singleton unit tests to cover the detector. * chore: format code (req-fix-dda97fd) * fix: address PR review comments - pubsub.ts: use `once` instead of `on` for ready event to prevent duplicate resubscription handlers on reconnect - forwarder.ts: extract DEFAULT_CODEX_USER_AGENT constant - provider-cache.ts: wrap subscribeCacheInvalidation in try/catch - tests: use exported constant instead of hardcoded UA string * fix(redis): resubscribe across repeated reconnects Ensure pub/sub resubscribe runs on every reconnect, extend unit coverage, and keep emitRequestFiltersUpdated resilient when logger import fails. --------- Co-authored-by: John Doe <johndoe@example.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat(logs): make cost column toggleable with improved type safety (#715) close #713 * fix(proxy): add OpenAI chat completion format support in usage extraction (#705) (#716) The `extractUsageMetrics` function was missing support for OpenAI chat completion format fields (`prompt_tokens`/`completion_tokens`), causing token statistics to not be recorded for OpenAI-compatible providers. Changes: - Add `prompt_tokens` -> `input_tokens` mapping - Add `completion_tokens` -> `output_tokens` mapping - Preserve priority: Claude > Gemini > OpenAI format - Add 5 unit tests for OpenAI format handling Closes #705 Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix(currency): respect system currencyDisplay setting in UI (#717) Fixes #678 - Currency display unit configuration was not applied. Root cause: - `users-page-client.tsx` hardcoded `currencyCode="USD"` - `UserLimitBadge` and `LimitStatusIndicator` had hardcoded `unit="$"` default - `big-screen/page.tsx` used hardcoded "$" in multiple places Changes: - Add `getCurrencySymbol()` helper function to currency.ts - Fetch system settings in `users-page-client.tsx` and pass to table - Pass `currencySymbol` from `user-key-table-row.tsx` to limit badges - Remove hardcoded "$" defaults from badge components - Update big-screen page to fetch settings and use dynamic symbol - Add unit tests for `getCurrencySymbol` Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * feat(gemini): add Google Search web access preference for Gemini providers (#721) * feat(gemini): add Google Search web access preference for Gemini providers Add provider-level preference for Gemini API type providers to control Google Search (web access) tool injection: - inherit: Follow client request (default) - enabled: Force inject googleSearch tool into request - disabled: Force remove googleSearch tool from request Changes: - Add geminiGoogleSearchPreference field to provider schema - Add GeminiGoogleSearchPreference type and validation - Implement applyGeminiGoogleSearchOverride with audit trail - Add UI controls in provider form (Gemini Overrides section) - Add i18n translations for 5 languages (en, zh-CN, zh-TW, ja, ru) - Integrate override logic in proxy forwarder for Gemini requests - Add 22 unit tests for the override logic Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(gemini): address code review feedback - Use explicit else-if for disabled preference check (gemini-code-assist) - Use i18n for SelectValue placeholder instead of hardcoded string (coderabbitai) - Sync overridden body back to session.request.message for log consistency (coderabbitai) - Persist Gemini special settings immediately, matching Anthropic pattern (coderabbitai) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix(gemini): use strict types for provider config and audit - Narrow preference type to "enabled" | "disabled" (exclude unreachable "inherit") - Use ProviderType and GeminiGoogleSearchPreference types instead of string Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> * fix(api): 透传 /api/actions 认证会话以避免 getUsers 返回空数据 (#720) * fix(api): 透传 /api/actions 认证会话以避免 getUsers 返回空数据 * fix(auth): 让 scoped session 继承 allowReadOnlyAccess 语义并支持内部降权校验 * chore: format code (dev-93585fa) * fix: bound SessionTracker active_sessions zsets by env TTL (#718) * fix(session): bound active_sessions zsets by env ttl Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(rate-limit): pass session ttl to lua cleanup Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(session): validate SESSION_TTL env and prevent ZSET leak on invalid values - Add input validation for SESSION_TTL (reject NaN, 0, negative; default 300) - Guard against invalid TTL in Lua script to prevent clearing all sessions - Use dynamic EXPIRE based on SESSION_TTL instead of hardcoded 3600s - Add unit tests for TTL validation and dynamic expiry behavior --------- Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * fix(provider): stop standard-path fallback to legacy provider url * feat(providers): expose vendor endpoint pools in settings UI (#719) * feat(providers): add endpoint status mapping * feat(providers): add endpoint pool hover * feat(providers): show vendor endpoints in list rows * feat(providers): extract vendor endpoint CRUD table * chore(i18n): add provider endpoint UI strings * fix(providers): integrate endpoint pool into provider form * fix(provider): wrap endpoint sync in transactions to prevent race conditions (#730) * fix(provider): wrap provider create/update endpoint sync in transactions Provider create and update operations now run vendor resolution and endpoint sync inside database transactions to prevent race conditions that could leave orphaned or inconsistent endpoint rows. Key changes: - createProvider: wrap vendor + insert + endpoint seed in a single tx - updateProvider: wrap vendor + update + endpoint sync in a single tx - Add syncProviderEndpointOnProviderEdit for atomic URL/type/vendor migration with in-place update, soft-delete, and conflict handling - Vendor cleanup failures degrade to warnings instead of propagating - Add comprehensive unit and integration tests for sync edge cases Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(provider): defer endpoint circuit reset until transaction commit Avoid running endpoint circuit reset side effects inside DB transactions to prevent rollback inconsistency. Run resets only after commit and add regression tests for deferred reset behavior in helper and provider update flows. * fix(provider): distinguish noop from created-next in endpoint sync action label When ensureNextEndpointActive() returns "noop" (concurrent transaction already created an active next endpoint), the action was incorrectly labelled "kept-previous-and-created-next". Add a new "kept-previous-and-kept-next" action to ProviderEndpointSyncAction and use a three-way branch so callers and logs reflect the true outcome. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments from PR #731 - fix(auth): prevent scoped session access widening via ?? -> && guard - fix(i18n): standardize zh-CN provider terminology to "服务商" - fix(i18n): use consistent Russian translations for circuit status - fix(i18n): replace raw formatDistanceToNow with locale-aware RelativeTime - fix(gemini): log warning for unknown google search preference values - fix(error-rules): check subscribeCacheInvalidation return value - fix(test): correct endpoint hover sort test to assert URLs not labels Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: export auth session storage and fix test mock types - Export authSessionStorage from auth-session-storage.node.ts to prevent undefined on named imports; remove duplicate declare global block - Fix mockEndpoints in provider-endpoint-hover test: remove nonexistent lastOk/lastLatencyMs fields, add missing lastProbe* fields, use Date objects for createdAt/updatedAt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: miraserver <20286838+miraserver@users.noreply.github.com> Co-authored-by: John Doe <johndoe@example.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: 泠音 <im@ling.plus> Co-authored-by: Longlone <41830147+WAY29@users.noreply.github.com> Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Summary
Fixes OpenAI chat completion format token usage extraction bug where
prompt_tokens/completion_tokenswere not being extracted, resulting in 0 cost/token display.Related Issues:
Problem
When using the
/v1/chat/completionsendpoint with OpenAI-compatible providers, the system failed to extract token usage from responses, causing:Root Cause:
The
extractUsageMetricsfunction (response-handler.ts:1193-1468) only handled:input_tokens,output_tokenspromptTokenCount,candidatesTokenCountBut OpenAI chat completion format uses
prompt_tokens/completion_tokens, which were never extracted. This affected both JSON responses and SSE streaming (where usage appears in the final chunk).Solution
Enhanced
extractUsageMetricsto support OpenAI format with priority-based extraction:prompt_tokens→input_tokens(ifinput_tokensundefined)completion_tokens→output_tokens(ifoutput_tokensundefined)This ensures Claude and Gemini formats take precedence when present, while OpenAI format serves as fallback.
Changes
Core Changes
src/app/v1/_lib/proxy/response-handler.ts(+14 lines)prompt_tokensextraction at line 1199-1202completion_tokensextraction at line 1288-1291Testing
tests/unit/proxy/extract-usage-metrics.test.ts(+82 lines)prompt_tokens/completion_tokensextractioncompletion_tokens_detailshandlingTest Results
Verification
Tested with actual OpenAI-compatible provider response (from Issue #705):
{ "usage": { "completion_tokens": 57, "total_tokens": 123, "prompt_tokens": 66, "completion_tokens_details": { "reasoning_tokens": 0 } } }✅ Now correctly extracts
input_tokens: 66,output_tokens: 57Generated with Claude Code
Description enhanced by Claude AI PR Agent