Skip to content

Add compact command for verifying and consolidating reviews#261

Closed
anshbansal wants to merge 14 commits intoroborev-dev:mainfrom
anshbansal:feature/compact-command
Closed

Add compact command for verifying and consolidating reviews#261
anshbansal wants to merge 14 commits intoroborev-dev:mainfrom
anshbansal:feature/compact-command

Conversation

@anshbansal
Copy link
Contributor

@anshbansal anshbansal commented Feb 15, 2026

PR for #257

I did

make install
roborev compact

and I believe this is working

anshbansal and others added 12 commits February 15, 2026 16:27
Implements `roborev compact` command that discovers unaddressed review jobs,
sends them to an agent for verification against the current codebase, consolidates
related findings, and creates a new consolidated review job. Original jobs are
marked as addressed after consolidation.

This adds a quality layer between `review` and `fix` to reduce false positives
and consolidate findings from multiple reviews for easier human review.

Features:
- Discovers unaddressed jobs with branch filtering (--branch, --all-branches)
- Builds verification prompt that instructs agent to check each finding
- Creates consolidated task job with agentic mode for codebase search
- Streams agent output during verification (unless --quiet)
- Marks original jobs as addressed after successful consolidation
- Supports --dry-run to preview what would be done
- Uses fix agent configuration by default (--agent, --model, --reasoning)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Add isValidConsolidatedReview() to detect agent errors vs valid findings
Add extractJobIDs() helper to extract IDs from jobReview slice
Add comprehensive unit tests for validation and prompt building

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Critical fixes:
- Fix data loss bug: track successfulJobIDs separately, only mark those as addressed
- Add output validation: reject agent errors before marking jobs as addressed
- Add --limit flag (default 20) to prevent oversized prompts
- Remove unused --unaddressed flag

Refactoring:
- Extract fetchJobReviews() helper for cleaner error handling
- Extract verifyAndConsolidate() for separation of concerns
- Extract markJobsAsAddressed() for reusability
- Reduce runCompact() from 170 lines to ~100 lines
- Create job_helpers.go with shared waitForJobCompletion()
- Document race condition limitation in code and help text

All tests pass, build succeeds.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Validation fixes:
- Only check for error patterns at line start (avoid false positives)
- Require both severity AND structure markers for validation
- Accept words like "cannot" and "error" in review content
- Support more file extensions (.js, .ts, not just .go/.py)

Progress improvements:
- Show enqueued job ID when starting consolidation
- Makes it clear which job is running

Updated tests to reflect stricter but more accurate validation.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Shows detailed progress when fetching and marking jobs:
  [1/4] Fetching job 152... done
  [2/4] Fetching job 151... done
  [3/4] Fetching job 144... done
  [4/4] Fetching job 142... done

And when marking as addressed:
  [1/4] Marking job 152... done
  [2/4] Marking job 151... done
  ...

Makes it clear what's happening during multi-job operations.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Shows progress while waiting for agent to start and process:
- "Waiting for agent to start..." (with dots)
- "Agent processing..." (with dots until output appears)

Provides feedback during the main wait time after job enqueue
and before agent output starts streaming.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Compact jobs now get Pass/Fail verdicts computed:
- Added JobTypeCompact to distinguish from generic task jobs
- Compact jobs are not considered "task jobs" so verdicts are computed
- Explicit job_type parameter passed from compact command to daemon

Fixes compact reviews showing "-" instead of "F" in TUI when findings exist.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Compact jobs have custom prompts but non-task job types, so they
need special handling in worker. Changed prompt selection to check
for job.Prompt directly instead of job.IsTaskJob().

Fixes "git log: exit status 128" error when processing compact jobs.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Compact now runs in background by default:
- Enqueues consolidation job and returns immediately
- User can continue working while consolidation runs
- Use --wait flag to block until completion (like analyze --wait)

Background mode output:
- Shows enqueued job ID
- Explains how to check progress (roborev status/tui)
- No streaming output to avoid false progress expectations

With --wait flag:
- Streams agent output
- Waits for completion
- Validates output
- Marks jobs as addressed automatically

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Use singular "review" when count=1, plural "reviews" otherwise:
- "1 unaddressed review" (was: "1 unaddressed reviews")
- "Verified and consolidated 1 unaddressed review" (was: "reviews")

Updated tests to expect correct grammar.

Fixes the last remaining issue from Review roborev-dev#164.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Add "Review Verification" section to README explaining how compact
automates verification and consolidation of review findings. Include
command examples and workflow explanation. Add compact to commands
table between analyze and show for logical grouping.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
- Remove ambiguous "wide code search" phrasing - agent uses its search tools, not an inherent capability
- Use active voice: clarify that roborev automatically marks jobs as addressed
- Split dense sentence into two for better readability

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
@roborev-ci
Copy link

roborev-ci bot commented Feb 15, 2026

roborev: Combined Review

Verdict: PR has unresolved High and Medium issues; not ready to merge.

High

  1. Indirect prompt-injection path into an agentic compact job (security)

    • Locations: cmd/roborev/compact.go:404, cmd/roborev/compact.go:418, cmd/roborev/compact.go:460, cmd/roborev/compact.go:464
    • Why it matters: buildCompactPrompt embeds prior jr.review.Output (untrusted/model-influenced text) into a new prompt, and compact enqueue sets agentic: true. This enables instruction smuggling from prior review text into a tool-capable run.
    • Recommended fix: Run compact in a restricted read-only profile (agentic: false or strict non-mutating tool allowlist), and treat embedded review text as inert data with strong delimiters/quoting.
  2. compact default background flow does not auto-address source jobs

    • Locations: cmd/roborev/compact.go:340, cmd/roborev/compact.go:367
    • Why it matters: With default --wait=false, command returns before markJobsAsAddressed runs, so source jobs can remain unaddressed and be reprocessed despite CLI messaging implying they will be addressed.
    • Recommended fix: Move addressing to daemon post-success handling for compact jobs (or persist/reconcile source job IDs), or make blocking behavior explicit/default.

Medium

  1. Unbounded prompt aggregation can cause resource exhaustion / context overflow

    • Locations: cmd/roborev/compact.go:95, cmd/roborev/compact.go:291, cmd/roborev/compact.go:404
    • Why it matters: --limit has no hard cap and each prior review output is appended in full, allowing oversized prompts (memory pressure, failed API calls, high cost).
    • Recommended fix: Enforce hard caps on --limit, per-review bytes, and total prompt size; truncate with explicit markers and fail early when limits are exceeded.
  2. Resolved agent/model shown to user may differ from enqueued payload

    • Locations: cmd/roborev/compact.go:166, cmd/roborev/compact.go:459
    • Why it matters: CLI resolves fix-agent values for messaging, but enqueue may send unresolved/empty fields, letting daemon defaults diverge from what user expects.
    • Recommended fix: Enqueue resolved agent/model/reasoning values (or report daemon-resolved values only).
  3. HTTP response body leak in polling path on non-200 responses

    • Location: cmd/roborev/job_helpers.go:96
    • Why it matters: In /api/review polling, response bodies are closed only on 200 path; repeated non-200 polling can leak connections/resources.
    • Recommended fix: Close reviewResp.Body on every successful Do call, regardless of status code.

Synthesized from 4 reviews (agents: codex, gemini | types: security, default)

@anshbansal anshbansal marked this pull request as ready for review February 15, 2026 12:30
@roborev-ci
Copy link

roborev-ci bot commented Feb 15, 2026

roborev: Combined Review

Verdict: Changes are not PR-ready; there are 2 High and 3 Medium issues that should be addressed before merge.

High

  1. Indirect prompt injection with write-capable compact agent
  • Locations: cmd/roborev/compact.go:~398-406, cmd/roborev/compact.go:~411, cmd/roborev/compact.go:~417-418, cmd/roborev/compact.go:~452, cmd/roborev/compact.go:~459, cmd/roborev/compact.go:~464-465
  • Issue: Prior review output (jr.review.Output, untrusted text) is embedded into a new prompt and enqueued with agentic: true, enabling prompt-injection against a write/shell-capable agent.
  • Recommended fix: Run compact with a read-only tool allowlist, treat embedded review text as untrusted data with strong delimiting, and enforce capability restrictions by job type (compact) in worker/runtime.
  1. Default background compact flow does not mark source jobs addressed
  • Locations: cmd/roborev/compact.go:341, cmd/roborev/compact.go:367
  • Issue: In non---wait mode, runCompact returns before markJobsAsAddressed, so source jobs remain unaddressed and can be repeatedly re-compacted.
  • Recommended fix: Move source-job addressing/finalization to daemon-side completion for job_type=compact (or equivalent async completion path), not only CLI wait flow.

Medium

  1. job_type trust/type confusion + overly broad prompt execution path
  • Locations: internal/daemon/server.go:~370, internal/daemon/server.go:~548, internal/storage/jobs.go:~755, internal/daemon/worker.go:~286
  • Issue: API accepts/propagates job_type without strict allowlisting, while worker executes prompt for any job with job.Prompt != ""; this allows inconsistent/forged workflow states.
  • Recommended fix: Enforce server-side job_type allowlist and request-shape validation, reject inconsistent combinations, and gate prompt execution to explicit allowed types (task, compact).
  1. HTTP response body leak in polling helper on non-200 review responses
  • Location: cmd/roborev/job_helpers.go:95
  • Issue: reviewResp.Body.Close() is only in the OK path; repeated non-200 responses can leak connections/bodies.
  • Recommended fix: Close body for every successful client.Do response regardless of status.
  1. Compact enqueue can lose git_ref metadata
  • Locations: internal/daemon/server.go:550, internal/storage/jobs.go:773, internal/storage/jobs.go:774
  • Issue: Prompt enqueue path sets Label but not GitRef; fallback behavior only covers task, so compact jobs may persist empty git_ref.
  • Recommended fix: Populate GitRef in prompt enqueue options (or apply fallback for compact as well).

Synthesized from 4 reviews (agents: codex, gemini | types: security, default)

@anshbansal anshbansal marked this pull request as draft February 15, 2026 13:27
anshbansal and others added 2 commits February 15, 2026 19:14
Previously, when running `roborev compact` without --wait flag, the
source jobs were never marked as addressed because the CLI returned
immediately after enqueueing. This defeated the purpose of compact.

Solution: Make daemon worker automatically mark source jobs when compact
job completes. Uses temporary JSON files to track source job IDs.

- CLI writes compact-{jobID}.json with source job IDs after enqueue
- Worker reads metadata file when compact job completes successfully
- Worker marks each source job as addressed via MarkReviewAddressedByJobID
- Worker cleans up metadata file after processing
- Gracefully handles legacy compact jobs without metadata files

This works in both --wait and background modes. The --wait mode still
marks jobs immediately for user feedback, but worker is source of truth.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
Addresses multiple critical and medium severity issues identified in
review roborev-dev#184 of commit da929f2:

1. Race Condition (Critical): Fixed race where metadata file was written
   AFTER job was enqueued. Now if metadata write fails, the job is
   canceled and an error is returned to the user. This prevents source
   jobs from never being marked as addressed.

2. Error Handling (Critical): Made metadata write failures fatal instead
   of just logging a warning. If metadata cannot be written, the compact
   job is canceled via the daemon API and the error is propagated to the
   user.

3. Duplicate Code (Medium): Removed duplicate getDataDir() function and
   replaced with config.DataDir() throughout. Added config package import.

4. Missing Tests (Medium): Added comprehensive test coverage:
   - internal/daemon/compact_test.go: Tests for ReadCompactMetadata,
     DeleteCompactMetadata, and compactMetadataPath
   - cmd/roborev/compact_test.go: Tests for writeCompactMetadata
     including empty source IDs validation

5. Empty Validation (Low): Added early return in writeCompactMetadata
   for empty sourceJobIDs to avoid creating unnecessary metadata files.

6. Error Messages (Low): Improved error message in markCompactSourceJobs
   to include job ID for better debugging context.

7. Redundant Logging (Low): Removed per-job success logs in
   markCompactSourceJobs, keeping only the summary log to reduce noise.

Technical changes:
- Added cancelJob() helper function to cancel jobs via daemon API
- Updated writeCompactMetadata() to validate and fail fast
- All tests pass: go test ./...
- Code formatted: go fmt ./...
- No vet issues: go vet ./...

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
@roborev-ci
Copy link

roborev-ci bot commented Feb 15, 2026

roborev: Combined Review

Verdict: Not ready to merge — 3 High and 4 Medium issues remain after deduplication.

High

  1. Indirect prompt-injection in compact aggregation (agentic execution)

    • Refs: cmd/roborev/compact.go:421, cmd/roborev/compact.go:431, cmd/roborev/compact.go:433, cmd/roborev/compact.go:475, cmd/roborev/compact.go:477, cmd/roborev/compact.go:480
    • Prior review.Output is embedded verbatim into a new prompt while compact runs with agentic: true, so malicious instructions in prior review text can influence tool-using behavior.
    • Use strict data delimiting/sanitization for prior outputs and prefer non-agentic or read-only tool mode for compact jobs.
  2. Untrusted job_type can trigger privileged compact side effects

    • Refs: internal/daemon/server.go:370, internal/daemon/server.go:548, internal/daemon/server.go:551, internal/storage/jobs.go:752, internal/storage/jobs.go:757, internal/daemon/worker.go:391
    • job_type is client-controlled and used in logic that enables compact-specific behavior (including auto-address flow), creating a broken trust boundary.
    • Derive/gate privileged behavior server-side and validate job_type against an allowlist.
  3. TOCTOU race between enqueue and metadata write for compact source-job mapping

    • Refs: cmd/roborev/compact.go:333, cmd/roborev/compact.go:340, cmd/roborev/compact.go:346, cmd/roborev/compact.go:356, internal/daemon/worker.go:454, internal/daemon/worker.go:456
    • Job can execute before metadata file exists (or is fully written), causing source jobs to be skipped/unaddressed.
    • Move source_job_ids into atomic DB/API job metadata instead of filesystem side-channel.

Medium

  1. Daemon marks source jobs addressed without validating compact output

    • Refs: internal/daemon/worker.go:391, internal/daemon/worker.go:414, cmd/roborev/compact.go:215, cmd/roborev/compact.go:583, internal/daemon/compact.go:20
    • Validation is tied to CLI --wait path; background completion may still auto-address on invalid/empty consolidation output.
    • Enforce output validity in daemon before any addressing side effects.
  2. HTTP response body leak in polling helper on non-200 path

    • Ref: cmd/roborev/job_helpers.go:95
    • Response body is not consistently closed on non-200 /api/review responses, risking FD/connection leaks in long polling loops.
    • Close body on all successful Do response paths.
  3. Retry behavior regression for non-task jobs with stored prompt

    • Refs: internal/daemon/worker.go:286, internal/daemon/worker.go:313, internal/storage/jobs.go:1098
    • Non-task jobs with non-empty prompt may enter custom-prompt/retry path and change prompt semantics unexpectedly.
    • Restrict custom-prompt retry path to intended job types (task/compact) or preserve original review prompt behavior.
  4. Filesystem metadata trust lacks integrity/context binding

    • Refs: cmd/roborev/compact.go:583, internal/daemon/compact.go:20, internal/daemon/worker.go:414
    • Worker trusts a local JSON file for authoritative source-job mapping without strong binding to compact job context.
    • Store and verify mapping transactionally in DB keyed to compact job/repo context.

Synthesized from 4 reviews (agents: codex, gemini | types: security, default)

@wesm
Copy link
Collaborator

wesm commented Feb 15, 2026

Thanks for putting this up. I'm probably going to try to get 0.32.0 out before working on this but give me a day or two, if you have more work you want to continue pushing here feel free otherwise I can take it over and get it to a merge ready state

@anshbansal
Copy link
Contributor Author

@wesm I am not sure I will be able to push during the weekdays. Please feel free to take over. I think I was mostly done.

@wesm
Copy link
Collaborator

wesm commented Feb 16, 2026

All good. I can take it from here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments