feat(sdk): Track and display auxiliary LLM costs by dcramer · Pull Request #115 · getsentry/warden

dcramer · 2026-02-07T00:19:56Z

Track and display costs from auxiliary LLM calls that were previously invisible.

Warden makes two direct Anthropic API calls (not through the Claude Code SDK) whose costs were untracked:

Extraction repair (extractFindingsWithLLM): Uses claude-haiku-4-5 when regex extraction fails
Semantic dedup (findSemanticDuplicates): Uses claude-haiku-4-5 to detect duplicate findings

Both now capture response.usage, convert it to UsageStats via a new pricing module, and surface it in all output formats. The data model uses AuxiliaryUsageMap (a record of agent name to UsageStats) on SkillReport, designed to accommodate future auxiliary agents.
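The flow described above (raw token counts to UsageStats, then a per-agent map) can be sketched roughly as follows. This is only an illustration: the UsageStats fields, the rate values, and the function bodies are assumptions and are not taken from src/sdk/pricing.ts or usage.ts. Only the `input_tokens` and `output_tokens` fields of the Anthropic response `usage` object are relied on.

```typescript
// Hypothetical shape; the real UsageStats in Warden may carry more fields.
interface UsageStats {
  inputTokens: number;
  outputTokens: number;
  costUSD: number;
}

// Record of agent name (e.g. "extraction", "dedup") to its usage.
type AuxiliaryUsageMap = Record<string, UsageStats>;

// Subset of the `usage` object returned by the Anthropic Messages API.
interface ApiUsage {
  input_tokens: number;
  output_tokens: number;
}

// Placeholder rates in USD per million tokens. These are NOT real prices.
const EXAMPLE_RATES = { inputPerMTok: 1, outputPerMTok: 5 };

// Convert raw API token counts into a UsageStats record with a cost.
function apiUsageToStats(usage: ApiUsage, rates = EXAMPLE_RATES): UsageStats {
  const costUSD =
    (usage.input_tokens * rates.inputPerMTok +
      usage.output_tokens * rates.outputPerMTok) /
    1_000_000;
  return {
    inputTokens: usage.input_tokens,
    outputTokens: usage.output_tokens,
    costUSD,
  };
}

// Merge two per-agent maps, summing the stats of agents present in both.
function mergeAuxiliaryUsage(
  a: AuxiliaryUsageMap | undefined,
  b: AuxiliaryUsageMap | undefined,
): AuxiliaryUsageMap | undefined {
  if (!a && !b) return undefined;
  const merged: AuxiliaryUsageMap = {};
  for (const source of [a, b]) {
    if (!source) continue;
    for (const [agent, stats] of Object.entries(source)) {
      const prev = merged[agent];
      merged[agent] = prev
        ? {
            inputTokens: prev.inputTokens + stats.inputTokens,
            outputTokens: prev.outputTokens + stats.outputTokens,
            costUSD: prev.costUSD + stats.costUSD,
          }
        : { ...stats };
    }
  }
  return merged;
}
```

Keying the map by agent name means a future auxiliary agent only needs a new key, not a new field on SkillReport.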

What changed:

New AuxiliaryUsageMapSchema type and optional auxiliaryUsage field on SkillReport
New src/sdk/pricing.ts with model pricing constants and apiUsageToStats()
New aggregateAuxiliaryUsage() and mergeAuxiliaryUsage() helpers in usage.ts
Extraction repair in extract.ts now returns usage; threaded through hunk -> file -> report in both runSkill() and runSkillTask() code paths
Semantic dedup in dedup.ts now returns usage; merged into report in poster.ts
formatStatsCompact() shows total cost with per-agent breakdown: $0.0060 (+extraction: $0.0012)
GitHub check summaries, PR comment renderer, and JSONL output all include auxiliary costs
All new fields are optional for backward compatibility
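The compact cost format in the list above (`$0.0060 (+extraction: $0.0012)`) could be produced along these lines. The function name `formatCostWithAuxiliary`, the four-decimal rounding, and the comma separator for multiple agents are assumptions for illustration; the actual formatStatsCompact() may differ.

```typescript
// Format a USD amount with four decimal places, e.g. 0.006 -> "$0.0060".
function formatCost(usd: number): string {
  return `$${usd.toFixed(4)}`;
}

// Hypothetical helper: shows the total (primary + auxiliary) followed by a
// per-agent breakdown suffix. Agents with zero cost are omitted.
function formatCostWithAuxiliary(
  primaryUSD: number,
  auxiliary: Record<string, number> = {},
): string {
  const entries = Object.entries(auxiliary).filter(([, usd]) => usd > 0);
  const total = entries.reduce((sum, [, usd]) => sum + usd, primaryUSD);
  if (entries.length === 0) return formatCost(total);
  const breakdown = entries
    .map(([agent, usd]) => `+${agent}: ${formatCost(usd)}`)
    .join(", ");
  return `${formatCost(total)} (${breakdown})`;
}

console.log(formatCostWithAuxiliary(0.0048, { extraction: 0.0012 }));
// prints "$0.0060 (+extraction: $0.0012)"
```

Because the auxiliary argument is optional, reports produced before this change format exactly as they did previously.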

Depends on #114.

Refs #108

Route interrupt message through Ink rendering pipeline (TTY) and logPlain (non-TTY) instead of raw stderr writes that corrupt Ink's cursor tracking. The abort signal triggers the message in both paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Warden makes two auxiliary LLM calls (direct Anthropic API) whose costs were invisible: extraction repair and semantic dedup. Both now capture usage data and surface it in all output formats.
Add AuxiliaryUsageMap type (record of agent name to UsageStats) and optional auxiliaryUsage field on SkillReport. Create pricing module for calculating costs from raw API token counts.
Thread auxiliary usage from extraction repair through hunk -> file -> skill report in both runSkill and runSkillTask code paths. Capture semantic dedup usage and merge it into reports in the review poster.
Update formatStatsCompact to show total cost (primary + auxiliary) with per-agent breakdown suffix. Update GitHub check summaries, PR comment renderer, and JSONL output to include auxiliary costs.
Refs #108
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel · 2026-02-07T00:20:01Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
warden	Ready	Preview, Comment	Feb 7, 2026 3:58am

src/output/dedup.ts

src/output/github-checks.ts

…elpers Hoist usage variable outside try block in findSemanticDuplicates so API usage is preserved even when response parsing fails. Also extract shared auxiliary cost formatting helpers from github-checks.ts to formatters.ts to eliminate duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
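The hoisting fix described in this commit can be illustrated with a small sketch. The names `findDuplicatesSketch`, `ApiResponse`, and `DedupResult` are hypothetical stand-ins, and the injected `callModel` function replaces the real Anthropic client so the example is self-contained; it is not the code in dedup.ts.

```typescript
interface ApiUsage {
  input_tokens: number;
  output_tokens: number;
}

// Hypothetical, simplified response: raw text plus token usage.
interface ApiResponse {
  text: string;
  usage: ApiUsage;
}

interface DedupResult {
  duplicateIds: string[];
  usage?: ApiUsage;
}

async function findDuplicatesSketch(
  callModel: () => Promise<ApiResponse>,
): Promise<DedupResult> {
  // Declared outside the try block so the catch path can still return it.
  let usage: ApiUsage | undefined;
  try {
    const response = await callModel();
    usage = response.usage; // captured before any parsing can throw
    const parsed: unknown = JSON.parse(response.text);
    if (!Array.isArray(parsed)) throw new Error("expected a JSON array");
    return { duplicateIds: parsed.map(String), usage };
  } catch {
    // The call or the parsing failed: report no duplicates, but keep
    // whatever usage was recorded so the tokens are still accounted for.
    return { duplicateIds: [], usage };
  }
}
```

If `usage` were declared inside the try block, the catch branch could not see it, and a malformed model response would make an already-billed call look free.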

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/sdk/pricing.ts

Replace hardcoded MODEL_PRICING table with JSON generated from the open-source pydantic/genai-prices repository. Adds scripts/update-pricing.ts to fetch and normalize Anthropic pricing data (handling tiered pricing), and commits the generated model-pricing.json so the library works without running the script. Also includes per-file report tracking (FileReport), ink-runner file completion display, and JSONL per-file records. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/sdk/analyze.ts

Use Promise.all return values to collect results in input order instead of pushing to shared arrays from concurrent async functions. The previous side-effect approach produced non-deterministic ordering for report.files and findings when parallel=true. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
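The ordering difference this commit describes can be reproduced in isolation. The sketch below uses a made-up `analyzeFile` with artificial delays rather than anything from analyze.ts; it only contrasts pushing to a shared array with using the array that `Promise.all` resolves to.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Stand-in for per-file analysis; the delay simulates uneven run times.
async function analyzeFile(name: string, delayMs: number): Promise<string> {
  await sleep(delayMs);
  return `report:${name}`;
}

// Side-effect approach: result order follows completion order.
async function collectByPush(files: [string, number][]): Promise<string[]> {
  const out: string[] = [];
  await Promise.all(
    files.map(async ([name, delay]) => {
      out.push(await analyzeFile(name, delay));
    }),
  );
  return out;
}

// Return-value approach: Promise.all resolves to results in input order,
// regardless of which promise settles first.
async function collectByReturn(files: [string, number][]): Promise<string[]> {
  return Promise.all(files.map(([name, delay]) => analyzeFile(name, delay)));
}
```

With inputs such as `[["a.ts", 30], ["b.ts", 5]]`, `collectByReturn` yields the `a.ts` report first, while `collectByPush` typically yields `b.ts` first because it finishes sooner.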

The action was calling runSkill() without callbacks, producing zero output during skill execution. In CI this meant minutes of silence after "Running trigger". Wire up onFileStart, onHunkStart, and onFileComplete so file and hunk progress appears in the Actions log. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the ad-hoc console.log callbacks in the trigger executor with the same runSkillTask + createDefaultCallbacks log-mode reporter the CLI uses. This gives CI output consistent formatting: timestamped lines with file progress, hunk ranges, duration, cost, and finding counts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/action/review/poster.ts

src/sdk/extract.ts

- Add null/type check before iterating anthropic.models in update-pricing.ts (external data source may omit the field)
- Fix JSONL spec example: findings array now has 2 entries matching the "2 issues (1 high, 1 medium)" summary
- Code simplifier: extract AuxiliaryUsageEntry type (was repeated 7x), FileProcessResult interface, simplify aux collection with flatMap
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/cli/output/jsonl.ts

scripts/update-pricing.ts

The JSONL summary was only aggregating main SDK usage, omitting auxiliary costs (extraction LLM calls). Consumers reading only the summary line would undercount total costs compared to GitHub checks. Now aggregates auxiliaryUsage across all skill reports using mergeAuxiliaryUsage, matching the GitHub check summary behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

scripts/update-pricing.ts

src/action/triggers/executor.ts

Add null check on model.prices in update-pricing script to skip models without pricing data. Add batchDelayMs rate-limiting between file batches in runSkillTask, matching the existing behavior in runSkill. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
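A delay between file batches, as mentioned for runSkillTask, might look like the following. The helper name `runInBatches` and its signature are hypothetical; only the `batchDelayMs` parameter name comes from the commit message, and the real implementation may be structured differently.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: process items in fixed-size batches, pausing
// batchDelayMs between batches to stay under API rate limits.
async function runInBatches<T, R>(
  items: T[],
  batchSize: number,
  batchDelayMs: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (batchSize < 1) throw new RangeError("batchSize must be at least 1");
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    // No delay before the first batch, only between batches.
    if (i > 0 && batchDelayMs > 0) await sleep(batchDelayMs);
    const batch = items.slice(i, i + batchSize);
    // Items within a batch run concurrently; results keep input order.
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```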

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-02-07T04:05:14Z

src/action/checks/manager.ts

  findingsBySeverity: Record<string, number>;
  totalDurationMs?: number;
  totalUsage?: UsageStats;
+  totalAuxiliaryUsage?: AuxiliaryUsageMap;


Inconsistent indentation for new totalAuxiliaryUsage properties

Low Severity

The totalAuxiliaryUsage property is indented with 2 spaces in both the return type declaration (line 67) and the return object literal (line 93), while all sibling properties use 4 spaces (type) and 6 spaces (object literal) respectively. This misalignment breaks the visual structure of the code block.

Additional Locations (1)

src/action/checks/manager.ts#L92-L93

dcramer and others added 2 commits February 6, 2026 14:19

cursor bot reviewed Feb 7, 2026

src/output/dedup.ts (outdated, resolved)

src/output/github-checks.ts (outdated, resolved)

dcramer and others added 2 commits February 6, 2026 16:33

build: Rebuild dist

ec7adbf

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel bot deployed to Preview February 7, 2026 00:34 View deployment

cursor bot reviewed Feb 7, 2026

src/sdk/pricing.ts (outdated, resolved)

vercel bot deployed to Preview February 7, 2026 01:27 View deployment

Missed reporters spec in commit

1fc3e41

vercel bot deployed to Preview February 7, 2026 01:31 View deployment

cursor bot reviewed Feb 7, 2026

src/sdk/analyze.ts (resolved)

vercel bot deployed to Preview February 7, 2026 01:53 View deployment

vercel bot deployed to Preview February 7, 2026 02:05 View deployment

vercel bot deployed to Preview February 7, 2026 02:11 View deployment

cursor bot reviewed Feb 7, 2026

src/action/review/poster.ts (resolved)

src/sdk/extract.ts (resolved)

vercel bot deployed to Preview February 7, 2026 02:41 View deployment

cursor bot reviewed Feb 7, 2026

src/cli/output/jsonl.ts (resolved)

sentry-warden bot reviewed Feb 7, 2026

scripts/update-pricing.ts (resolved)

vercel bot deployed to Preview February 7, 2026 03:27 View deployment

sentry-warden bot reviewed Feb 7, 2026

scripts/update-pricing.ts (resolved)

cursor bot reviewed Feb 7, 2026

src/action/triggers/executor.ts (resolved)

vercel bot deployed to Preview February 7, 2026 03:58 View deployment

cursor bot reviewed Feb 7, 2026


dcramer marked this pull request as ready for review February 7, 2026 05:13

dcramer merged commit bf8bc00 into main Feb 7, 2026
12 checks passed

dcramer deleted the feat/auxiliary-usage-tracking branch February 7, 2026 05:13

dcramer merged 12 commits into main from feat/auxiliary-usage-tracking
