feat(sdk): Track and display auxiliary LLM costs #115
Merged
Route interrupt message through Ink rendering pipeline (TTY) and logPlain (non-TTY) instead of raw stderr writes that corrupt Ink's cursor tracking. The abort signal triggers the message in both paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Warden makes two auxiliary LLM calls (direct Anthropic API) whose costs were invisible: extraction repair and semantic dedup. Both now capture usage data and surface it in all output formats.
- Add AuxiliaryUsageMap type (record of agent name to UsageStats) and optional auxiliaryUsage field on SkillReport.
- Create pricing module for calculating costs from raw API token counts.
- Thread auxiliary usage from extraction repair through hunk -> file -> skill report in both runSkill and runSkillTask code paths.
- Capture semantic dedup usage and merge it into reports in the review poster.
- Update formatStatsCompact to show total cost (primary + auxiliary) with per-agent breakdown suffix.
- Update GitHub check summaries, PR comment renderer, and JSONL output to include auxiliary costs.
Refs #108
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
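The cost calculation described in this commit can be sketched as follows. This is a minimal illustration, not the PR's pricing module: the field names on UsageStats, the usageToStats helper, and the rates in EXAMPLE_RATES are all assumptions made for the example (the rates are placeholders, not real prices). Only the input_tokens / output_tokens shape of the API usage object and the "record of agent name to UsageStats" idea come from the text above.

```typescript
// Hypothetical shapes: the PR names UsageStats and AuxiliaryUsageMap
// but this page does not show their fields.
interface UsageStats {
  inputTokens: number;
  outputTokens: number;
  costUSD: number;
}

// Record of agent name (e.g. "extraction", "dedup") to its usage.
type AuxiliaryUsageMap = Record<string, UsageStats>;

// Placeholder rates in USD per million tokens. Illustrative only.
interface ModelRates {
  inputPerMTok: number;
  outputPerMTok: number;
}

const EXAMPLE_RATES: Record<string, ModelRates> = {
  "example-model": { inputPerMTok: 1, outputPerMTok: 5 },
};

// Convert raw API token counts into UsageStats with a computed cost.
// Unknown models get a cost of 0 rather than throwing.
function usageToStats(
  model: string,
  usage: { input_tokens: number; output_tokens: number },
): UsageStats {
  const rates: ModelRates | undefined = EXAMPLE_RATES[model];
  const costUSD = rates
    ? (usage.input_tokens * rates.inputPerMTok +
        usage.output_tokens * rates.outputPerMTok) /
      1_000_000
    : 0;
  return {
    inputTokens: usage.input_tokens,
    outputTokens: usage.output_tokens,
    costUSD,
  };
}

// Usage: record the cost of one auxiliary call under its agent name.
const auxiliaryUsage: AuxiliaryUsageMap = {
  extraction: usageToStats("example-model", {
    input_tokens: 1000,
    output_tokens: 200,
  }),
};
```

Keying the map by agent name is what lets later auxiliary agents be added without changing the report shape.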
…elpers Hoist usage variable outside try block in findSemanticDuplicates so API usage is preserved even when response parsing fails. Also extract shared auxiliary cost formatting helpers from github-checks.ts to formatters.ts to eliminate duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
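A minimal sketch of the hoisting pattern this commit describes: declaring the usage variable before the try block so it survives a parse failure. The function name findDuplicatesSketch, the callModel parameter, and the response and result shapes are hypothetical stand-ins, not the actual findSemanticDuplicates code.

```typescript
// Hypothetical shapes for illustration only.
interface ApiUsage {
  input_tokens: number;
  output_tokens: number;
}
interface ApiResponse {
  text: string;
  usage: ApiUsage;
}
interface DedupResult {
  duplicates: number[];
  usage?: ApiUsage;
}

async function findDuplicatesSketch(
  callModel: () => Promise<ApiResponse>,
): Promise<DedupResult> {
  // Declared outside the try block so the catch branch can still see it.
  let usage: ApiUsage | undefined;
  try {
    const response = await callModel();
    // Capture usage before parsing: the tokens were spent either way.
    usage = response.usage;
    const parsed: unknown = JSON.parse(response.text);
    if (!Array.isArray(parsed)) {
      throw new Error("expected a JSON array");
    }
    return { duplicates: parsed as number[], usage };
  } catch (_err) {
    // Parsing (or the call itself) failed. Usage is still reported
    // if the call got far enough to return it.
    return { duplicates: [], usage };
  }
}
```

If usage were declared inside the try block, the catch branch could not return it and the cost of a call with an unparseable response would be dropped.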
Replace hardcoded MODEL_PRICING table with JSON generated from the open-source pydantic/genai-prices repository. Adds scripts/update-pricing.ts to fetch and normalize Anthropic pricing data (handling tiered pricing), and commits the generated model-pricing.json so the library works without running the script. Also includes per-file report tracking (FileReport), ink-runner file completion display, and JSONL per-file records. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use Promise.all return values to collect results in input order instead of pushing to shared arrays from concurrent async functions. The previous side-effect approach produced non-deterministic ordering for report.files and findings when parallel=true. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
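The ordering fix can be illustrated with a small sketch. Promise.all resolves to an array in the same order as its input, regardless of which promise settles first, so collecting its return value is deterministic. The processFile and runInInputOrder names and the delay-based simulation are hypothetical.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Stand-in for per-file work that takes a variable amount of time.
async function processFile(name: string, delayMs: number): Promise<string> {
  await sleep(delayMs);
  return `report:${name}`;
}

// Results come back in input order because they are taken from the
// Promise.all return value, not pushed to a shared array on completion.
async function runInInputOrder(
  files: { name: string; delayMs: number }[],
): Promise<string[]> {
  return Promise.all(files.map((f) => processFile(f.name, f.delayMs)));
}

// For contrast: pushing from inside each task records completion order,
// which depends on timing.
async function runWithSideEffects(
  files: { name: string; delayMs: number }[],
): Promise<string[]> {
  const results: string[] = [];
  await Promise.all(
    files.map(async (f) => {
      results.push(await processFile(f.name, f.delayMs));
    }),
  );
  return results;
}
```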
The action was calling runSkill() without callbacks, producing zero output during skill execution. In CI this meant minutes of silence after "Running trigger". Wire up onFileStart, onHunkStart, and onFileComplete so file and hunk progress appears in the Actions log. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the ad-hoc console.log callbacks in the trigger executor with the same runSkillTask + createDefaultCallbacks log-mode reporter the CLI uses. This gives CI output consistent formatting: timestamped lines with file progress, hunk ranges, duration, cost, and finding counts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add null/type check before iterating anthropic.models in update-pricing.ts (external data source may omit the field)
- Fix JSONL spec example: findings array now has 2 entries matching the "2 issues (1 high, 1 medium)" summary
- Code simplifier: extract AuxiliaryUsageEntry type (was repeated 7x), FileProcessResult interface, simplify aux collection with flatMap
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The JSONL summary was only aggregating main SDK usage, omitting auxiliary costs (extraction LLM calls). Consumers reading only the summary line would undercount total costs compared to GitHub checks. Now aggregates auxiliaryUsage across all skill reports using mergeAuxiliaryUsage, matching the GitHub check summary behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
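A rough sketch of aggregating auxiliary usage across skill reports for a summary line. The real mergeAuxiliaryUsage signature is not shown on this page, so mergeAuxUsageSketch, the UsageStats fields, and the report shape below are assumptions.

```typescript
// Hypothetical shapes for illustration only.
interface UsageStats {
  inputTokens: number;
  outputTokens: number;
  costUSD: number;
}
type AuxiliaryUsageMap = Record<string, UsageStats>;

// Sum usage per agent across any number of maps. Reports without
// auxiliary usage contribute nothing. Returns undefined when no
// auxiliary usage was recorded at all.
function mergeAuxUsageSketch(
  maps: (AuxiliaryUsageMap | undefined)[],
): AuxiliaryUsageMap | undefined {
  const merged: AuxiliaryUsageMap = {};
  for (const map of maps) {
    if (!map) continue;
    for (const [agent, stats] of Object.entries(map)) {
      const prev: UsageStats | undefined = merged[agent];
      merged[agent] = prev
        ? {
            inputTokens: prev.inputTokens + stats.inputTokens,
            outputTokens: prev.outputTokens + stats.outputTokens,
            costUSD: prev.costUSD + stats.costUSD,
          }
        : { ...stats };
    }
  }
  return Object.keys(merged).length > 0 ? merged : undefined;
}

// Usage: build the summary field from every report's optional map.
const reports: { auxiliaryUsage?: AuxiliaryUsageMap }[] = [
  { auxiliaryUsage: { extraction: { inputTokens: 100, outputTokens: 10, costUSD: 0.001 } } },
  {},
  {
    auxiliaryUsage: {
      extraction: { inputTokens: 50, outputTokens: 5, costUSD: 0.0005 },
      dedup: { inputTokens: 20, outputTokens: 2, costUSD: 0.0002 },
    },
  },
];
const totalAuxiliaryUsage = mergeAuxUsageSketch(reports.map((r) => r.auxiliaryUsage));
```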
Add null check on model.prices in update-pricing script to skip models without pricing data. Add batchDelayMs rate-limiting between file batches in runSkillTask, matching the existing behavior in runSkill. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
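The rate-limiting behavior can be sketched as a batch loop that pauses between batches. Only the batchDelayMs name comes from the commit message. The runInBatches helper, its parameters, and the choice to skip the delay before the first batch are assumptions for illustration.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Process items in fixed-size batches, running each batch concurrently
// and waiting batchDelayMs between batches (not before the first one).
async function runInBatches<T, R>(
  items: T[],
  batchSize: number,
  batchDelayMs: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    if (i > 0 && batchDelayMs > 0) {
      await sleep(batchDelayMs);
    }
    const batch = items.slice(i, i + batchSize);
    // Promise.all keeps each batch's results in input order.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```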
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    findingsBySeverity: Record<string, number>;
    totalDurationMs?: number;
    totalUsage?: UsageStats;
  totalAuxiliaryUsage?: AuxiliaryUsageMap;
Inconsistent indentation for new totalAuxiliaryUsage properties
Low Severity
The totalAuxiliaryUsage property is indented with 2 spaces in both the return type declaration (line 67) and the return object literal (line 93), while all sibling properties use 4 spaces (type) and 6 spaces (object literal) respectively. This misalignment breaks the visual structure of the code block.


Track and display costs from auxiliary LLM calls that were previously invisible.
Warden makes two direct Anthropic API calls (not through Claude Code SDK) whose costs were untracked:
- Extraction repair (extractFindingsWithLLM): Uses claude-haiku-4-5 when regex extraction fails
- Semantic dedup (findSemanticDuplicates): Uses claude-haiku-4-5 to detect duplicate findings

Both now capture response.usage, convert it to UsageStats via a new pricing module, and surface it in all output formats. The data model uses AuxiliaryUsageMap (a record of agent name to UsageStats) on SkillReport, designed to accommodate future auxiliary agents.

What changed:
- AuxiliaryUsageMapSchema type and optional auxiliaryUsage field on SkillReport
- src/sdk/pricing.ts with model pricing constants and apiUsageToStats()
- aggregateAuxiliaryUsage() and mergeAuxiliaryUsage() helpers in usage.ts
- extract.ts now returns usage; threaded through hunk -> file -> report in both runSkill() and runSkillTask() code paths
- dedup.ts now returns usage; merged into report in poster.ts
- formatStatsCompact() shows total cost with per-agent breakdown: $0.0060 (+extraction: $0.0012)

Depends on #114.
Refs #108
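
The compact cost format in the description, total cost followed by a per-agent breakdown, can be sketched as below. This is not the actual formatStatsCompact: the formatCostSketch name, its parameters, the four-decimal rounding, and the comma separator for multiple agents are assumptions inferred from the single example string above.

```typescript
// Format a total (primary + auxiliary) cost with a per-agent suffix.
// auxiliary maps a hypothetical agent name to its cost in USD.
function formatCostSketch(
  primaryCostUSD: number,
  auxiliary: Record<string, number> = {},
): string {
  // Skip agents that recorded no cost so the suffix stays short.
  const entries = Object.entries(auxiliary).filter(([, cost]) => cost > 0);
  const total = entries.reduce((sum, [, cost]) => sum + cost, primaryCostUSD);
  const base = `$${total.toFixed(4)}`;
  if (entries.length === 0) {
    return base;
  }
  const suffix = entries
    .map(([agent, cost]) => `+${agent}: $${cost.toFixed(4)}`)
    .join(", ");
  return `${base} (${suffix})`;
}

// A primary cost of $0.0048 plus $0.0012 of extraction repair
// reproduces the example string from the description.
console.log(formatCostSketch(0.0048, { extraction: 0.0012 }));
// prints "$0.0060 (+extraction: $0.0012)"
```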