Add prompt processing metrics #250
Conversation
- capture prompt processing metrics
- display prompt processing metrics on the UI Activity page
Walkthrough

Adds prompt_per_second to the backend TokenMetrics struct, parses it from timings.prompt_per_second in the proxy middleware, exposes it via JSON, updates the UI API types, and displays a new “Prompt Processing” column on the Activity page using formatSpeed.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant LlamaServer as llama-server
    participant Proxy as Proxy (metrics_middleware)
    participant UIAPI as APIProvider
    participant Activity as Activity Page
    LlamaServer-->>Proxy: Response with timings.prompt_per_second
    Proxy->>Proxy: parseAndRecordMetrics() extracts prompt_per_second
    Proxy-->>UIAPI: TokenMetrics JSON (prompt_per_second)
    UIAPI-->>Activity: Metrics.prompt_per_second
    Activity->>Activity: formatSpeed(prompt_per_second)
    Activity-->>User: Render "Prompt Processing" column
```
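The extraction step in the diagram above (parseAndRecordMetrics pulling prompt_per_second out of the response's timings object) can be sketched as follows. This is a minimal illustration, not the actual middleware: the extractPromptSpeed helper name and the fallback of treating a missing/zero field as unknown are assumptions, though the timings.prompt_per_second key matches what llama-server reports.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractPromptSpeed (hypothetical helper) decodes only the timings block
// of a llama-server response and returns the prompt processing speed.
// On parse failure or a missing field, it returns the -1.0 "unknown"
// sentinel discussed in the review below (an assumption of this sketch).
func extractPromptSpeed(body []byte) float64 {
	var resp struct {
		Timings struct {
			PromptPerSecond float64 `json:"prompt_per_second"`
		} `json:"timings"`
	}
	if err := json.Unmarshal(body, &resp); err != nil || resp.Timings.PromptPerSecond == 0 {
		return -1.0
	}
	return resp.Timings.PromptPerSecond
}

func main() {
	body := []byte(`{"timings":{"prompt_per_second":812.5,"predicted_per_second":42.1}}`)
	fmt.Println(extractPromptSpeed(body)) // 812.5
}
```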
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes

Assessment against linked issues
Out-of-scope changes: none found.
Actionable comments posted: 1
🧹 Nitpick comments (2)
proxy/metrics_monitor.go (1)
12-21: Document the sentinel semantics for PromptPerSecond

We’re relying on -1.0 to mean “unknown” downstream. Make this explicit on the struct field to prevent misuse by future contributors.
```diff
 type TokenMetrics struct {
 	ID              int       `json:"id"`
 	Timestamp       time.Time `json:"timestamp"`
 	Model           string    `json:"model"`
 	InputTokens     int       `json:"input_tokens"`
 	OutputTokens    int       `json:"output_tokens"`
-	PromptPerSecond float64   `json:"prompt_per_second"`
+	// PromptPerSecond represents average prompt token processing speed (tokens/sec).
+	// A value of -1.0 indicates the speed is unknown/unavailable.
+	PromptPerSecond float64   `json:"prompt_per_second"`
 	TokensPerSecond float64   `json:"tokens_per_second"`
 	DurationMs      int       `json:"duration_ms"`
 }
```

ui/src/pages/Activity.tsx (1)
55-55: Nit: Header wording for clarity

Consider renaming the column header to better mirror “Generation Speed” and explicitly convey units.
```diff
-<th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Prompt Processing</th>
+<th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Prompt Speed (t/s)</th>
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (4)
- proxy/metrics_middleware.go (2 hunks)
- proxy/metrics_monitor.go (1 hunks)
- ui/src/contexts/APIProvider.tsx (1 hunks)
- ui/src/pages/Activity.tsx (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
ui/src/contexts/APIProvider.tsx (1)
ui/src/pages/Models.tsx (3)
totalRequests (136-144), StatsPanel (133-167), sum (142-142)
ui/src/pages/Activity.tsx (1)
ui/src/pages/Models.tsx (1)
StatsPanel (133-167)
🔇 Additional comments (3)
ui/src/contexts/APIProvider.tsx (1)
31-31: LGTM: API surface aligned with backend

Adding prompt_per_second keeps the UI type in sync with the backend JSON (snake_case keys). The “unknown” handling on the Activity page covers the negative sentinel.
ui/src/pages/Activity.tsx (1)
67-67: LGTM: Correct rendering of prompt speed

Using formatSpeed(metric.prompt_per_second) ensures negative values map to “unknown” and all others are displayed as t/s with two decimals.
proxy/metrics_middleware.go (1)
92-100: LGTM: Emitting new PromptPerSecond field

Wires promptPerSecond through to TokenMetrics with the correct JSON field name for the UI.
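The rendering rule the reviewers describe (negative sentinel maps to “unknown”, everything else to two-decimal t/s) can be illustrated with a short Go sketch. Note this is a hypothetical Go reimplementation for illustration; the real formatSpeed is a TypeScript helper in the UI.

```go
package main

import "fmt"

// formatSpeed mirrors the UI helper's rendering rule described above:
// negative values are the "unknown" sentinel; everything else renders
// as tokens/sec with two decimal places.
func formatSpeed(tps float64) string {
	if tps < 0 {
		return "unknown"
	}
	return fmt.Sprintf("%.2f t/s", tps)
}

func main() {
	fmt.Println(formatSpeed(812.5)) // 812.50 t/s
	fmt.Println(formatSpeed(-1.0))  // unknown
}
```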
Fixes #249