
feat: add provider cache hit rate ranking to dashboard and leaderboard #398

Merged
ding113 merged 10 commits into dev from fix/message-pricing-fix
Dec 21, 2025

ding113 (Owner) commented Dec 21, 2025

Summary

This PR adds a new provider cache hit rate ranking feature to the dashboard leaderboard and enhances Anthropic SSE usage parsing to better handle relay services and edge cases.

Problem

Feature Request:

  • Users need visibility into how well their providers are utilizing prompt caching, a key cost optimization feature in the Claude API

Bug/Enhancement:

  • Some relay services use non-standard field names for cache token metrics (claude_cache_creation_5_m_tokens / claude_cache_creation_1_h_tokens)
  • The existing SSE parsing logic could incorrectly prioritize message_start usage over message_delta in some scenarios

Related Issues:

Solution

1. Provider Cache Hit Rate Leaderboard

Added a new admin-only leaderboard scope that ranks providers by their cache hit rate:

  • Cache hit rate = cache_read_input_tokens / (total tokens)
  • Only considers requests that actually used caching (where cache_creation or cache_read tokens > 0)
  • Supports all time periods (daily, weekly, monthly, all-time, custom range)
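
As a rough illustration of the formula above (not the SQL in src/repository/leaderboard.ts), the aggregation could be sketched like this. The `UsageRow` shape and the `cacheHitRate` helper are hypothetical, and the definition of "total tokens" as input + output + cache creation + cache read is an assumption, since the description does not spell it out.

```typescript
// Hypothetical row shape; the real query works on database columns.
interface UsageRow {
  inputTokens: number;
  outputTokens: number;
  cacheCreationInputTokens: number;
  cacheReadInputTokens: number;
}

// Returns a ratio in [0, 1], or null when no request in the set used caching.
function cacheHitRate(rows: UsageRow[]): number | null {
  // Only requests that actually touched the cache count toward the ranking.
  const cached = rows.filter(
    (r) => r.cacheCreationInputTokens > 0 || r.cacheReadInputTokens > 0
  );

  let read = 0;
  let total = 0;
  for (const r of cached) {
    read += r.cacheReadInputTokens;
    // ASSUMPTION: "total tokens" = input + output + cache creation + cache read.
    total +=
      r.inputTokens +
      r.outputTokens +
      r.cacheCreationInputTokens +
      r.cacheReadInputTokens;
  }
  return total > 0 ? read / total : null;
}
```

Returning null rather than 0 keeps providers with no cached traffic distinguishable from providers whose cache was written but never read.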

2. Enhanced Anthropic SSE Usage Parsing

Improved the parseUsageFromResponseText function to:

  • Prefer message_delta usage metrics over message_start (delta is more complete at stream end)
  • Fall back to message_start for fields missing from delta (e.g., cache_creation_1h_input_tokens)
  • Support relay-specific field naming (claude_cache_creation_5_m_tokens, claude_cache_creation_1_h_tokens)
  • Use a merge strategy that combines metrics from both events intelligently
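
The merge strategy described above can be sketched roughly as follows. This is not the code in response-handler.ts: `parseUsageFromSse`, `normalizeUsage`, and `mergeUsage` are hypothetical names, and the mapping of the relay fields onto `cache_creation_5m_input_tokens` / `cache_creation_1h_input_tokens` is an assumption. The sketch relies only on the Anthropic streaming layout, where `message_start` carries usage under `message.usage` and `message_delta` carries it under a top-level `usage`.

```typescript
type Usage = Record<string, unknown>;
type SseEvent = { type?: string; usage?: Usage; message?: { usage?: Usage } } | null;

// ASSUMPTION: relay-specific names map onto these canonical field names.
const RELAY_ALIASES: Record<string, string> = {
  claude_cache_creation_5_m_tokens: "cache_creation_5m_input_tokens",
  claude_cache_creation_1_h_tokens: "cache_creation_1h_input_tokens",
};

// Keep numeric fields only; rename relay keys unless the canonical key is present.
function normalizeUsage(raw: Usage | undefined): Record<string, number> {
  const out: Record<string, number> = {};
  for (const [key, value] of Object.entries(raw ?? {})) {
    if (typeof value !== "number" || !Number.isFinite(value)) continue;
    const alias = RELAY_ALIASES[key];
    if (alias === undefined) out[key] = value;
    else if (!(alias in out)) out[alias] = value;
  }
  return out;
}

// Spread order gives message_delta precedence; message_start only fills gaps.
function mergeUsage(start?: Usage, delta?: Usage): Record<string, number> {
  return { ...normalizeUsage(start), ...normalizeUsage(delta) };
}

function parseUsageFromSse(text: string): Record<string, number> {
  let start: Usage | undefined;
  let delta: Usage | undefined;
  for (const line of text.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue;
    let event: SseEvent;
    try {
      event = JSON.parse(line.slice(5).trim()) as SseEvent;
    } catch {
      continue; // skip non-JSON payloads such as "[DONE]"
    }
    if (event?.type === "message_start") start = event.message?.usage;
    else if (event?.type === "message_delta") delta = event.usage;
  }
  return mergeUsage(start, delta);
}
```

With this shape, a field that appears only in `message_start` (for example a relay's 1-hour cache creation count) survives the merge, while any field repeated in `message_delta` takes the later value.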

Changes

Core Changes

  • src/repository/leaderboard.ts - Added ProviderCacheHitRateLeaderboardEntry type and query functions
  • src/lib/redis/leaderboard-cache.ts - Extended cache layer to support new scope
  • src/app/api/leaderboard/route.ts - Added providerCacheHitRate scope to API validation
  • src/app/v1/_lib/proxy/response-handler.ts - Enhanced SSE parsing with merge strategy and relay field support

UI Changes

  • src/app/[locale]/dashboard/leaderboard/_components/leaderboard-view.tsx - Added cache hit rate tab and columns

i18n

  • Updated translation files for all 5 locales (en, ja, ru, zh-CN, zh-TW)

Tests

  • tests/unit/proxy/anthropic-usage-parsing.test.ts - Added comprehensive unit tests for SSE usage parsing

Breaking Changes

None. The API now accepts an additional scope=providerCacheHitRate parameter but maintains backward compatibility.
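
For illustration only, scope validation along these lines would keep existing callers working while admitting the new value. The `resolveScope` helper, the fallback to "user" when the parameter is omitted, and the 400/403 status codes are assumptions made for this sketch, not taken from src/app/api/leaderboard/route.ts.

```typescript
const SCOPES = ["user", "provider", "providerCacheHitRate", "model"] as const;
type LeaderboardScope = (typeof SCOPES)[number];

// Scopes that require an admin session (the new scope is admin-only).
const ADMIN_ONLY: ReadonlySet<LeaderboardScope> = new Set<LeaderboardScope>([
  "provider",
  "providerCacheHitRate",
  "model",
]);

type ScopeResult =
  | { ok: true; scope: LeaderboardScope }
  | { ok: false; status: 400 | 403 };

function resolveScope(param: string | null, isAdmin: boolean): ScopeResult {
  // ASSUMPTION: an omitted scope falls back to "user" so older clients keep working.
  const value = param ?? "user";
  const scope = SCOPES.find((s) => s === value);
  if (scope === undefined) return { ok: false, status: 400 }; // unknown scope
  if (ADMIN_ONLY.has(scope) && !isAdmin) return { ok: false, status: 403 };
  return { ok: true, scope };
}
```

Because the accepted set only grows, requests that passed validation before this PR resolve to the same scope afterwards.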

Testing

Automated Tests

  • Unit tests added for Anthropic SSE usage parsing (anthropic-usage-parsing.test.ts)
  • TypeScript compilation passes
  • Biome lint passes

Manual Testing

  1. Navigate to Dashboard → Leaderboard as admin
  2. Click on "Provider Cache Hit Rate" tab
  3. Verify cache hit rates are displayed as percentages
  4. Switch between time periods and verify data updates
  5. Test with providers that use prompt caching

Checklist

  • Code follows project conventions
  • Self-review completed
  • Tests pass locally
  • i18n translations added for all 5 locales

Description enhanced by Claude AI

ding113 and others added 10 commits December 21, 2025 22:42
- Introduced a new daily limit field in multiple languages for the dashboard.
- Updated user management interfaces and components to support daily quota updates.
- Enhanced batch edit dialog to include daily limit functionality.

This addition improves user control over daily spending limits, aligning with user management needs.
- Added createdAt timestamp to the message context in ProxyMessageService for better tracking.
- Updated response handler to include requestId and createdAtMs for cost tracking.
- Introduced new methods to retrieve cost entries with createdAt for users, providers, and keys, facilitating accurate rolling window calculations.
- Enhanced Lua scripts to support optional request_id for better member tracking in Redis.

These changes improve the accuracy and traceability of cost-related data in the system.
…racking in RateLimitService

- Added useEffect to sync initialData in UsageLogsSection for better state management.
- Updated trackUserDailyCost method to accept optional parameters for requestId and createdAtMs, improving cost tracking accuracy.
- Refactored cost calculation logic in RateLimitService to utilize detailed cost entries for rolling window calculations.

These changes enhance data consistency and tracking capabilities across the application.
- Introduced functions to read local version files and fetch the latest release from GitHub, improving version retrieval.
- Added error handling for GitHub API requests and fallback mechanisms to ensure version information is consistently available.
- Updated the version comparison logic to handle semantic versioning, including support for pre-release identifiers.

These changes enhance the application's ability to manage and display version information effectively.
- Added new interfaces for GitHub commits and enhanced version management functions.
- Implemented logic to check for development builds and fetch the latest commit information from GitHub.
- Improved the response structure to include commit details and comparison URLs for development versions.

These changes improve the application's ability to manage versioning and provide users with up-to-date commit information.
…components

- Added userProviderGroup prop to KeyRowItem for improved group management.
- Implemented splitGroups function to handle provider groups more effectively.
- Updated UserKeyTableRow and KeyEditSection to utilize userProviderGroup for better data consistency.
- Enhanced UI components to display provider groups with tooltips and badges for clarity.
- Refactored UsageLogsSection to improve filter management and state handling.

These changes enhance the user experience by providing clearer group information and improving data handling across components.
- Introduced new translations for provider cache hit rate in English, Japanese, Russian, Chinese (Simplified and Traditional).
- Updated leaderboard component to support provider cache hit rate as a new scope.
- Enhanced API endpoints to include provider cache hit rate in leaderboard queries.
- Added new database functions to calculate and retrieve provider cache hit rate rankings.

These changes improve the dashboard's functionality by allowing users to view and analyze provider cache hit rates, enhancing overall data insights.
gemini-code-assist (Contributor) commented

Summary of Changes

Hello @ding113, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the dashboard's analytical capabilities by integrating a 'Provider Cache Hit Rate' ranking. This new feature provides deeper insights into provider performance, supported by comprehensive backend changes to data retrieval and API handling, as well as improved accuracy in processing usage metrics from streaming API responses.

Highlights

  • New Leaderboard Ranking: Introduced a new 'Provider Cache Hit Rate' ranking to the dashboard and leaderboard, allowing users to track and analyze provider performance based on cache efficiency.
  • Internationalization Support: Added new translation keys for 'Provider Cache Hit Rate Ranking' and 'Cache Hit Rate' across English, Japanese, Russian, Simplified Chinese, and Traditional Chinese.
  • API and Database Integration: Enhanced API endpoints to support queries for the new provider cache hit rate scope and implemented new database functions to calculate and retrieve these rankings efficiently.
  • Improved Claude SSE Usage Parsing: Refactored the parsing logic for Anthropic (Claude) SSE responses to more robustly extract and merge usage metrics, including detailed cache creation and read tokens, handling various field locations and ensuring backward compatibility.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a valuable new feature for monitoring provider cache hit rates on the dashboard and leaderboard. The changes are comprehensive, spanning the database, API, frontend, and internationalization files. The refactoring of the Anthropic/Claude SSE parsing logic is a notable improvement in robustness and is well-supported by new unit tests. I've included a few suggestions to enhance code maintainability and readability in the frontend component, primarily by refactoring duplicated logic and simplifying complex conditionals. Additionally, I've pointed out some inconsistent JSON formatting that should be addressed for better code hygiene. Overall, this is a solid contribution.

Comment on lines +242 to +248
"providerRanking": "供应商排行",
"providerCacheHitRateRanking": "供应商缓存命中率排行",
"modelRanking": "模型排行",
"dailyRanking": "今日",
"weeklyRanking": "本周",
"monthlyRanking": "本月",
"allTimeRanking": "全部"

medium

The indentation in this block has changed from 6 spaces to 4 spaces. While this isn't a functional issue, it creates noise in the diff and suggests inconsistent formatting practices. Please ensure consistent formatting is applied across the file and project to improve readability and maintainability. It seems other JSON files in this PR also have similar indentation changes. It would be best to either format all files consistently or revert the unrelated formatting changes.

Comment on lines 41 to +45
  const initialScope: LeaderboardScope =
-   (urlScope === "provider" || urlScope === "model") && isAdmin ? urlScope : "user";
+   (urlScope === "provider" || urlScope === "providerCacheHitRate" || urlScope === "model") &&
+   isAdmin
+     ? urlScope
+     : "user";

medium

The logic for normalizing the scope from URL parameters is duplicated here and again on lines 61-67 inside the useEffect hook. To improve maintainability and reduce code duplication, consider extracting this logic into a reusable function.

For example, you could define a function like this within your component:

const getNormalizedScope = useCallback(
  (scopeParam: string | null): LeaderboardScope => {
    const validAdminScopes: LeaderboardScope[] = ["provider", "providerCacheHitRate", "model"];
    if (isAdmin && scopeParam && validAdminScopes.includes(scopeParam as LeaderboardScope)) {
      return scopeParam as LeaderboardScope;
    }
    return "user";
  },
  [isAdmin]
);

Then you can use it like const initialScope: LeaderboardScope = getNormalizedScope(urlScope); and similarly inside the useEffect.

Comment on lines 126 to +135
  const skeletonColumns =
-   scope === "user" ? 5 : scope === "provider" ? 7 : scope === "model" ? 6 : 5;
+   scope === "user"
+     ? 5
+     : scope === "provider"
+       ? 7
+       : scope === "providerCacheHitRate"
+         ? 5
+         : scope === "model"
+           ? 6
+           : 5;

medium

This nested ternary operator is difficult to read and maintain. Using a map object to define the number of columns for each scope would make the code cleaner and easier to extend in the future.

  const SKELETON_COLUMNS_MAP: Record<LeaderboardScope, number> = {
    user: 5,
    provider: 7,
    providerCacheHitRate: 5,
    model: 6,
  };
  const skeletonColumns = SKELETON_COLUMNS_MAP[scope] ?? 5;

github-actions bot (Contributor) left a comment

Code Review Summary

No significant issues identified in this PR. Changes add provider cache hit rate ranking (UI/API/DB) and refine Claude SSE usage parsing/merging with unit coverage.

PR Size: L

  • Lines changed: 518
  • Files changed: 11
  • Split suggestions: Consider splitting into (1) leaderboard cache hit rate scope (DB query + API + UI + i18n) and (2) proxy usage parsing compatibility + tests.

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean
  • Error handling - Clean
  • Type safety - Clean
  • Documentation accuracy - Clean
  • Test coverage - Adequate
  • Code clarity - Good

Automated review by Codex AI

github-actions bot (Contributor) left a comment

  • Identified PR #398 and pulled full diff + file list via gh.
  • Applied size label: size/L (518 lines changed, 11 files).
  • Completed multi-perspective review on changed lines; no high-confidence issues to file as inline comments.
  • Posted summary review comment to PR #398 via gh pr review --comment.

@ding113 ding113 merged commit 7af2b22 into dev Dec 21, 2025
14 of 16 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in Claude Code Hub Roadmap Dec 21, 2025
@ding113 ding113 deleted the fix/message-pricing-fix branch December 23, 2025 17:45