@shaal shaal commented Jan 6, 2026

Summary

This PR implements the ability to run multiple Claude agents in parallel to speed up project completion, as requested in issue #5.

Key Features

  • Git worktrees for isolation: Each agent gets its own working directory, preventing file conflicts
  • Atomic task claiming: New feature_claim_next(agent_id) MCP tool prevents race conditions
  • Agent assignment tracking: Features show which agent is working on them via UI badges
  • Configurable parallelism: --num-agents CLI flag, REST API for UI control

New Components

| File | Purpose |
| --- | --- |
| `worktree.py` | Git worktree management for agent isolation |
| `parallel_agents.py` | Orchestrator for managing multiple agents |
| `parallel_agent_runner.py` | Individual agent process runner |
| `server/routers/parallel_agents.py` | REST API endpoints for parallel agent control |

Database Changes

  • Added assigned_agent_id column to Feature model
  • Migration automatically runs on startup

Usage

# CLI: Run 3 agents in parallel
python autonomous_agent_demo.py --project-dir my-app --num-agents 3

# Combine with YOLO mode for maximum speed
python autonomous_agent_demo.py --project-dir my-app --num-agents 3 --yolo

# API: Start parallel agents
POST /api/projects/{name}/parallel-agents/start
Body: { "num_agents": 3, "yolo_mode": false }
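
For example, the same start request from Python (a minimal sketch; it assumes the dashboard server is reachable at http://localhost:8000 and the project is named my-app — adjust host, port, and name to your setup):

```python
import json
import urllib.request

payload = json.dumps({"num_agents": 3, "yolo_mode": False}).encode()
req = urllib.request.Request(
    "http://localhost:8000/api/projects/my-app/parallel-agents/start",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # e.g. per-agent start results
```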

Screenshots

Agent badges appear on feature cards showing which agent (#1, #2, etc.) is working on each feature.

CodeRabbitAI Review Fixes

The following issues from the CodeRabbitAI review have been addressed:

Critical/Major Fixes

  • Race condition in feature_claim_next: Changed from OR to AND logic to ensure features must be both in_progress=False AND assigned_agent_id=None to be claimed (see the sketch after this list)
  • Sequential agent startup: Now uses asyncio.gather() for concurrent startup, reducing init time
  • AttributeError in router: Fixed .status access on non-existent agents
  • Git init error handling: Now returns False if add/commit fails during initialization
  • Symlink failure handling: Added marker file fallback when symlinks aren't supported
  • Merge target branch: Now explicitly checks out target branch before merging
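
For reference, a minimal sketch of the corrected claim predicate (SQLAlchemy; assumes `session`, `Feature`, and `agent_id` are in scope as in feature_claim_next — the committed code may differ in detail, e.g. whether an agent may re-claim a feature already assigned to it):

```python
from sqlalchemy import and_, or_

feature = (
    session.query(Feature)
    .filter(Feature.passes == False)
    .filter(
        and_(
            Feature.in_progress == False,               # not being worked on
            or_(
                Feature.assigned_agent_id == None,      # unclaimed, or
                Feature.assigned_agent_id == agent_id,  # already ours
            ),
        )
    )
    .order_by(Feature.priority.asc(), Feature.id.asc())
    .with_for_update()  # row lock keeps the claim atomic
    .first()
)
```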

Minor Fixes

  • ✅ Removed unnecessary empty-string checks from filter conditions (codebase uses None)
  • ✅ Added comprehensive docstrings warning about SIGSTOP/SIGCONT limitations with asyncio
  • ✅ Added *.tsbuildinfo to .gitignore and removed from tracking

Additional Fixes (Second Pass)

  • Worktree path filtering: Replaced fragile string matching (str(dir) in str(path)) with Path.is_relative_to() to prevent false positives from substring matches (illustrated after this list)
  • Docstring coverage: Added missing docstrings to 7 functions to ensure codebase exceeds 80% coverage threshold:
    • run_parallel() in autonomous_agent_demo.py
    • read_file(), write_file() in server/routers/projects.py
    • ConnectionManager.__init__() in server/websocket.py
    • status getter/setter, pid property in server/services/process_manager.py
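
A quick illustration of why Path.is_relative_to() (Python 3.9+) is safer than substring matching (hypothetical paths):

```python
from pathlib import Path

worktrees_dir = Path("/repo/.worktrees")
inside = Path("/repo/.worktrees/agent_1")
lookalike = Path("/repo/.worktrees-backup/agent_1")

# Substring matching produces a false positive for the lookalike path:
print(str(worktrees_dir) in str(inside))     # True
print(str(worktrees_dir) in str(lookalike))  # True  <- wrong

# is_relative_to() only matches true descendants:
print(inside.is_relative_to(worktrees_dir))     # True
print(lookalike.is_relative_to(worktrees_dir))  # False
```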

Additional Fixes (Third Pass)

  • --max-iterations ignored in parallel mode: Added max_iterations parameter to ParallelAgentOrchestrator.start_agents() and start_agent(), passing it through to parallel_agent_runner.py. Also added to API schema ParallelAgentStartRequest.
  • Health check ambiguity: Updated CLI health check loop to distinguish between agents that completed successfully vs crashed, printing separate lists for each status.
  • Fallback commit return code: Now checks the return code of the empty commit created during worktree initialization and fails early with a descriptive error if that commit fails.
  • Symlink fallback clarification: Updated comments and log messages to explain that symlink failure is OK because parallel_agent_runner.py sets the PROJECT_DIR environment variable, which the MCP server uses to locate features.db (sketched below). The marker file is now documented as debugging-only.
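
As a rough sketch of that fallback resolution (assumptions: the MCP server reads PROJECT_DIR exactly as described above; the actual lookup lives in the MCP server code):

```python
import os
from pathlib import Path

# When the features.db symlink cannot be created inside the worktree,
# the shared database is still reachable via the PROJECT_DIR environment
# variable that parallel_agent_runner.py sets for each agent subprocess.
project_dir = Path(os.environ.get("PROJECT_DIR", os.getcwd()))
db_path = project_dir / "features.db"
```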

Acknowledged but Not Changed

  • asyncio subprocess vs Popen: The current subprocess.Popen approach works correctly with run_in_executor for output streaming (a sketch follows this list). Converting to asyncio.create_subprocess_exec would require significant refactoring for minimal benefit.
  • Controlled component pattern for UI: The current callback pattern works correctly; this is an optional refactoring suggestion.
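
For context, a minimal sketch of the streaming pattern the first point refers to (assumes `process` is the agent's subprocess.Popen with stdout=PIPE; the real _stream_output differs in detail):

```python
import asyncio

async def stream_output(process, prefix: str) -> None:
    """Relay subprocess output without blocking the event loop."""
    loop = asyncio.get_running_loop()
    while True:
        # readline() blocks, so run it in the default thread pool
        line = await loop.run_in_executor(None, process.stdout.readline)
        if not line:  # EOF: the process closed stdout / exited
            break
        print(f"[{prefix}] {line.decode(errors='replace').rstrip()}")
```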

Test plan

  • Run python -m py_compile on all new Python files
  • Verify single-agent mode still works (--num-agents 1 or default)
  • Test parallel mode with --num-agents 2 on a sample project
  • Verify UI shows agent badges on in-progress features
  • Test API endpoints via curl or UI

Closes #5

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Parallel agent mode: run multiple agents concurrently with per-agent worktrees, lifecycle controls, and CLI flag for agent count.
    • REST API and UI controls to start/stop/pause/resume agents, view real-time per-agent status, merge/cleanup worktrees, and color-coded agent badges.
    • Atomic feature claim/release and per-feature agent assignment tracking to avoid conflicts.
  • Documentation

    • Expanded Parallel Agents Mode guidance with CLI, API, workflows, and merge/conflict instructions.


shaal and others added 2 commits January 6, 2026 15:35
Implements the ability to run multiple Claude agents in parallel to speed up project completion (GitHub issue leonvanzyl#5).

## Key Features:
- **Git worktrees**: Each agent gets isolated working directory
- **Atomic task claiming**: Prevents race conditions between agents
- **Agent assignment tracking**: Features show which agent is working on them
- **Configurable parallelism**: --num-agents flag for CLI, API for UI

## New Files:
- worktree.py: Git worktree management for agent isolation
- parallel_agents.py: Orchestrator for managing multiple agents
- parallel_agent_runner.py: Individual agent process runner
- server/routers/parallel_agents.py: REST API for parallel agent control

## Changes:
- Database schema: Added assigned_agent_id column to features
- MCP server: New atomic feature_claim_next() tool
- UI: Agent badges on feature cards
- CLI: --num-agents option for parallel execution

Closes leonvanzyl#5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for managing parallel agents in the UI, enhancing project execution capabilities. Key changes include:

- Introduced `ParallelAgentControl` component for toggling parallel mode.
- Updated `App` component to conditionally render agent controls based on parallel mode.
- Added hooks for querying and mutating parallel agent statuses in `useProjects`.
- Implemented API functions for starting, stopping, merging, and cleaning up parallel agents.

This update improves the user experience by allowing simultaneous agent operations, streamlining project workflows.

Bortus-AI commented Jan 6, 2026

nice! Look forward to trying this


jpbester commented Jan 7, 2026

will this work on the UI side as well?

Resolved conflicts in:
- server/main.py: Combined cleaner multi-line import format with parallel_agents_router
- server/routers/__init__.py: Added parallel_agents_router import in alphabetical order

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai bot commented Jan 7, 2026

📝 Walkthrough

Walkthrough

Adds first-class parallel agents: per-agent git worktrees, an orchestrator and runner, agent_id propagation through client/MCP, DB column assigned_agent_id, atomic MCP claim/release APIs, server routes/schemas for agent control, UI components/hooks, and updated docs/templates for parallel execution.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docs & Templates: `CLAUDE.md`, `.claude/templates/coding_prompt.template.md` | New "Parallel Agents Mode" docs and template guidance covering CLI flags, AGENT_ID, atomic feature claiming, worktree merge workflows, and troubleshooting. |
| Agent CLI & Runner: `agent.py`, `autonomous_agent_demo.py`, `parallel_agent_runner.py`, `client.py` | `run_autonomous_agent` gains `agent_id`; `create_client` accepts `agent_id` and injects AGENT_ID into the MCP env; demo CLI adds `--num-agents` and a parallel orchestration path; new per-agent runner script. |
| Worktree Management: `worktree.py` | New `WorktreeManager`: create/remove/list per-agent worktrees, share features.db/prompts via symlinks, merge worktree changes back to main, and clean up. |
| Parallel Orchestrator: `parallel_agents.py` | New `ParallelAgentOrchestrator` and `AgentInfo`: spawn/manage agent subprocesses, per-agent worktrees, streaming output, status callbacks, pause/resume, healthchecks, merge and cleanup; global registry accessor. |
| MCP Feature Server: `mcp_server/feature_mcp.py` | Feature endpoints accept/return `assigned_agent_id`; new atomic tools `feature_claim_next`, `feature_release`, `feature_clear_in_progress`; in-progress/claim APIs validate ownership. |
| Database: `api/database.py` | Adds nullable, indexed `assigned_agent_id` column on `Feature`, includes it in `to_dict`, and adds a migration helper invoked during DB creation. |
| Server API & Schemas: `server/main.py`, `server/routers/__init__.py`, `server/routers/parallel_agents.py`, `server/schemas.py`, `server/routers/features.py` | Registers `parallel_agents_router`; new router exposes start/stop/pause/resume/status/merge/cleanup endpoints; new schemas (`ParallelAgentStartRequest`, `ParallelAgentInfo`, `ParallelAgentsStatus`, `ParallelAgentActionResponse`); `FeatureResponse` and feature serializers include `assigned_agent_id`. |
| UI Components, Hooks & API: `ui/src/App.tsx`, `ui/src/components/FeatureCard.tsx`, `ui/src/components/ParallelAgentControl.tsx`, `ui/src/hooks/useProjects.ts`, `ui/src/lib/api.ts`, `ui/src/lib/types.ts` | Adds `ParallelAgentControl` and mode toggle; `FeatureCard` shows an agent badge; client API + hooks for start/stop/status/merge/cleanup; new types including `assigned_agent_id`. |
| Server internals & small docs: `server/services/process_manager.py`, `server/websocket.py`, `server/routers/projects.py` | Added status callback notifications and docstrings; minor websocket/docstring edits. |
| Build / Ignore: `ui/tsconfig.tsbuildinfo`, `.gitignore` | Removed generated TypeScript build metadata file; added `*.tsbuildinfo` to `.gitignore`. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User
    participant API as Parallel Agents API
    participant Orch as ParallelAgentOrchestrator
    participant WT as WorktreeManager
    participant Runner as parallel_agent_runner (subproc)
    participant Agent as run_autonomous_agent
    participant MCP as Feature MCP
    participant DB as Database

    User->>API: POST /start (project, num_agents)
    API->>Orch: start_agents(num_agents, yolo_mode, model)
    Orch->>WT: create_worktree(agent_id)
    WT-->>Orch: worktree_path
    Orch->>Runner: spawn subprocess (agent_id, worktree_path)
    Runner->>Agent: start run_autonomous_agent(..., agent_id)
    loop Per-agent work loop
        Agent->>MCP: feature_claim_next(agent_id)
        MCP->>DB: atomic SELECT/UPDATE (set in_progress, assigned_agent_id)
        DB-->>MCP: claimed feature
        MCP-->>Agent: feature payload (includes assigned_agent_id)
        Agent->>WT: make commits in worktree
        Agent->>MCP: feature_mark_passing(feature_id)
        MCP->>DB: clear assigned_agent_id, mark passed
    end
    User->>API: POST /merge
    API->>Orch: merge_all_worktrees()
    Orch->>WT: merge_worktree_changes(agent_id) for each agent
    WT-->>Orch: merge results
    Orch-->>API: summary
    API-->>User: response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble code and dig a trail,

Worktrees sprout like carrot hale,
Tiny agents, each a spark,
Claim, commit, and stitch the dark,
Hooray — the project hops to mark!

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 79.17%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title "feat: Add parallel subagent execution support" directly and clearly summarizes the main change: adding support for running multiple agents in parallel. |
| Linked Issues check | ✅ Passed | The PR implements all coding requirements from issue #5: configurable parallel agents (--num-agents CLI, REST API), atomic task claiming (feature_claim_next), agent isolation (Git worktrees), assignment tracking (assigned_agent_id column), and UI controls for agent management. |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with parallel agent execution: worktree management, MCP tools, database schema updates, API endpoints, UI components, and orchestration logic. No unrelated changes detected. |



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

🤖 Fix all issues with AI agents
In @mcp_server/feature_mcp.py:
- Around line 166-171: The filter on Feature.assigned_agent_id inside the
query.filter call should not include the empty-string check; remove the
(Feature.assigned_agent_id == "") clause so the filter only checks for None or
the matching agent_id. Update both places where this pattern appears (the
query.filter that currently uses (Feature.assigned_agent_id == None) |
(Feature.assigned_agent_id == "") | (Feature.assigned_agent_id == agent_id)) to
instead use (Feature.assigned_agent_id == None) | (Feature.assigned_agent_id ==
agent_id) to match the codebase’s use of None when clearing assignments.
- Around line 400-412: The claim filter uses OR which allows a row with
in_progress=True and assigned_agent_id==None to be claimed; change the predicate
so in_progress must be False AND the assigned_agent_id must be either None/empty
or equal to agent_id. Update the .filter block that references
Feature.in_progress and Feature.assigned_agent_id (and uses agent_id) to require
Feature.in_progress == False AND (Feature.assigned_agent_id is None OR == "" OR
== agent_id) while keeping the outer passes == False, ordering,
with_for_update(), and .first() intact.

In @parallel_agents.py:
- Around line 101-132: start_agents is launching agents sequentially by awaiting
start_agent inside the for loop; change it to start all agent startups
concurrently by creating tasks for each start_agent call (using
asyncio.create_task or a comprehension) keyed by generate_agent_id, then await
them together with asyncio.gather (with return_exceptions=True) and populate
results mapping agent_id to boolean success while converting exceptions to
False; keep the prior git repo check and ensure any per-task exceptions are
logged/handled rather than letting one failure block others.
- Around line 285-315: pause_agent and resume_agent currently call
psutil.Process.suspend()/resume() (psutil.Process.suspend,
psutil.Process.resume) which uses SIGSTOP/SIGCONT and will freeze the entire
asyncio event loop and cause unpredictable behavior; replace these methods with
stop/start semantics instead: remove calls to suspend()/resume(), implement
pause_agent to perform a graceful shutdown/stop of the agent process (call the
existing stop logic or a new stop_agent/_terminate_agent flow that closes API
clients, cancels outstanding async tasks, and waits for process termination),
and implement resume_agent as start_agent (or spawn a fresh process via your
existing start logic) that re-initializes the agent state and then calls
_notify_status(agent_id, "running"); if true pause/resume semantics must be
kept, add explicit documentation/comments in pause_agent/resume_agent about
platform-specific asyncio risks and add extensive tests/recovery logic for
interrupted async operations instead of using psutil suspend/resume.

In @server/routers/parallel_agents.py:
- Around line 150-155: The return builds the status by indexing
orchestrator.agents and then accessing .status on a fallback {}, which raises
AttributeError; change it to safely fetch the agent object and then read its
status with a default (e.g., use agent = orchestrator.agents.get(agent_id) and
then status = getattr(agent, "status", "unknown") or similar), and return that
status under the "status" key so missing agents yield "unknown" without raising.

In @ui/tsconfig.tsbuildinfo:
- Line 1: The tsconfig.tsbuildinfo file is a generated TypeScript build artifact
that should not be committed; remove tsconfig.tsbuildinfo from version control
(unstage/delete tracked file) and add the pattern *.tsbuildinfo to the
.gitignore under the "Build outputs" section alongside ui/dist/ and .vite/ so
future builds are ignored by VCS.

In @worktree.py:
- Around line 68-77: The git init code currently always returns True even if
"git add" or "git commit" fail; update the initialization in the method that
calls self._run_git for "add" and "commit" so that if either command returns
non-zero it logs the stderr and returns False (or raises an exception) instead
of True; specifically, change the block that runs self._run_git("add", "-A") and
self._run_git("commit", "-m", "Initial commit for parallel agents",
"--allow-empty") to check result.returncode and immediately return False when a
failure occurs (include result.stderr in the logger.warning/error), and ensure
create_worktree checks the initializer's return value before relying on HEAD.
- Around line 147-175: The _setup_shared_resources function currently swallows
symlink failures for features.db which breaks shared DB coordination required by
feature_claim_next(agent_id); change the fallback to enforce accessibility: in
the except OSError block for worktree_db.symlink_to(features_db) either
(preferable) raise an exception to abort worktree creation with a clear error
mentioning features_db/worktree_db and MCP server configuration, or
(alternative) write the absolute path of features_db into the worktree MCP
server config and validate that the new worktree process can open the DB before
returning; update the error message/log to be explicit and reference
features_db, worktree_db and MCP server so callers know to use the main project
path when symlinks aren’t supported.
- Around line 248-285: merge_worktree_changes currently merges the agent branch
into whatever branch is checked out (using _run_git("merge", branch)) which can
merge into a feature branch; update merge_worktree_changes to explicitly
checkout the intended target branch (use the computed main_branch from
_run_git("branch", "--show-current") or fallback "main") in the main repo via
_run_git("checkout", main_branch) before running _run_git("merge", branch), and
ensure you pass the correct cwd for main repo operations (not worktree_path)
when checking out and merging; additionally, when a merge conflict occurs
(result.returncode != 0) keep the merge aborted but leave a clear return path:
call _run_git("merge", "--abort") for cleanup, log a user-facing message that
the worktree has conflicts and must be resolved manually (include
result.stderr), and do not silently leave the worktree in an uncommitted state —
optionally return False so caller can prompt conflict resolution.
🧹 Nitpick comments (13)
ui/src/App.tsx (1)

165-182: Consider using controlled component pattern for mode state.

The current implementation works correctly: ParallelAgentControl maintains internal isParallelMode state and notifies the parent via onModeChange, while the parent's parallelMode controls visibility of AgentControl. The states stay synchronized through the callback pattern.

However, for better predictability and maintenance, consider making ParallelAgentControl a fully controlled component by passing parallelMode as a prop instead of having it manage internal state. This would eliminate potential synchronization issues and make the data flow more explicit.

Additionally, you may want to consider whether the parallel mode preference should persist when users switch projects, or if each project should start in single-agent mode.

♻️ Optional refactoring to controlled component pattern

In ParallelAgentControl.tsx, accept isParallelMode as a prop:

interface ParallelAgentControlProps {
  projectName: string
  isParallelMode: boolean  // Add this prop
  onModeChange?: (mode: boolean) => void
}

export function ParallelAgentControl({
  projectName,
  isParallelMode,  // Receive from parent instead of internal state
  onModeChange,
}: ParallelAgentControlProps) {
  // Remove: const [isParallelMode, setIsParallelMode] = useState(false)
  
  const handleToggleMode = () => {
    onModeChange?.(!isParallelMode)  // Just notify parent
  }
  // ... rest of component
}

In App.tsx, pass the state as a prop:

 <ParallelAgentControl
   projectName={selectedProject}
+  isParallelMode={parallelMode}
   onModeChange={setParallelMode}
 />
ui/src/components/FeatureCard.tsx (1)

30-48: Extract shared agent color logic to avoid duplication.

The getAgentColor function and AGENT_COLORS array are duplicated in both FeatureCard.tsx and ParallelAgentControl.tsx (lines 28-44). Consider extracting this logic to a shared utility file (e.g., ui/src/lib/agentColors.ts) to maintain consistency and reduce maintenance burden.

♻️ Suggested refactor

Create a new file ui/src/lib/agentColors.ts:

const AGENT_COLORS = [
  '#00b4d8', // cyan - agent-1
  '#70e000', // green - agent-2
  '#8338ec', // purple - agent-3
  '#ff5400', // orange - agent-4
  '#ff006e', // pink - agent-5
]

export function getAgentColor(agentId: string): string {
  const match = agentId.match(/\d+/)
  if (match) {
    const num = parseInt(match[0], 10) - 1
    return AGENT_COLORS[num % AGENT_COLORS.length]
  }
  return AGENT_COLORS[0]
}

Then import and use it in both components:

-// Generate consistent color for agent ID
-function getAgentColor(agentId: string): string {
-  const colors = [
-    '#00b4d8', // cyan - agent-1
-    '#70e000', // green - agent-2
-    '#8338ec', // purple - agent-3
-    '#ff5400', // orange - agent-4
-    '#ff006e', // pink - agent-5
-  ]
-
-  // Extract number from agent-N format
-  const match = agentId.match(/\d+/)
-  if (match) {
-    const num = parseInt(match[0], 10) - 1
-    return colors[num % colors.length]
-  }
-
-  return colors[0]
-}
+import { getAgentColor } from '../lib/agentColors'
parallel_agent_runner.py (1)

79-100: Consider optional input validation improvements.

The script could benefit from additional validation:

  • Verify agent_id matches expected format (e.g., agent-\d+)
  • Confirm worktree_dir is actually a git worktree (e.g., check for .git file pointing to parent)

These validations would catch configuration errors earlier and provide clearer error messages.

💡 Optional validation enhancements
 def main() -> None:
     """Main entry point."""
     args = parse_args()

     project_dir = Path(args.project_dir).resolve()
     worktree_dir = Path(args.worktree_dir).resolve()

     if not project_dir.exists():
         print(f"Error: Project directory does not exist: {project_dir}")
         return

     if not worktree_dir.exists():
         print(f"Error: Worktree directory does not exist: {worktree_dir}")
         return
+
+    # Validate agent_id format
+    import re
+    if not re.match(r'^agent-\d+$', args.agent_id):
+        print(f"Error: Invalid agent_id format: {args.agent_id} (expected: agent-N)")
+        return
+
+    # Verify worktree_dir is a git worktree
+    git_file = worktree_dir / '.git'
+    if not git_file.exists() or not git_file.is_file():
+        print(f"Warning: {worktree_dir} may not be a git worktree")

     print(f"[{args.agent_id}] Starting agent")
api/database.py (1)

78-91: Consider adding transaction safety to migration.

The migration function lacks explicit error handling and transaction rollback. While adding a nullable column is low-risk, wrapping the migration in a try/except block would prevent partial failures from leaving the database in an inconsistent state.

♻️ Enhanced error handling
 def _migrate_add_assigned_agent_id_column(engine) -> None:
     """Add assigned_agent_id column to existing databases that don't have it."""
     from sqlalchemy import text

     with engine.connect() as conn:
-        # Check if column exists
-        result = conn.execute(text("PRAGMA table_info(features)"))
-        columns = [row[1] for row in result.fetchall()]
-
-        if "assigned_agent_id" not in columns:
-            # Add the column (nullable, no default)
-            conn.execute(text("ALTER TABLE features ADD COLUMN assigned_agent_id VARCHAR(50)"))
-            conn.commit()
+        try:
+            # Check if column exists
+            result = conn.execute(text("PRAGMA table_info(features)"))
+            columns = [row[1] for row in result.fetchall()]
+
+            if "assigned_agent_id" not in columns:
+                # Add the column (nullable, no default)
+                conn.execute(text("ALTER TABLE features ADD COLUMN assigned_agent_id VARCHAR(50)"))
+                conn.commit()
+        except Exception as e:
+            conn.rollback()
+            raise RuntimeError(f"Failed to migrate assigned_agent_id column: {e}") from e

Note: The existing _migrate_add_in_progress_column function could benefit from the same enhancement.

ui/src/components/ParallelAgentControl.tsx (2)

106-254: Add error feedback for failed operations.

The component doesn't display error messages when mutations fail. Users should be informed when actions like starting or stopping agents encounter errors. Consider adding error display using the error property from each mutation hook.

💬 Error display example
   return (
     <div className="flex flex-col gap-3 p-4 bg-white border-3 border-[var(--color-neo-border)] shadow-neo">
       {/* Header */}
       <div className="flex items-center justify-between">
         ...
       </div>

+      {/* Error Display */}
+      {(startAgents.error || stopAgents.error || mergeWorktrees.error || cleanupAgents.error) && (
+        <div className="p-2 bg-red-100 border border-red-400 text-red-700 text-sm rounded">
+          {startAgents.error?.message || stopAgents.error?.message || 
+           mergeWorktrees.error?.message || cleanupAgents.error?.message}
+        </div>
+      )}

       {/* Agent Status Grid */}
       {status && status.agents.length > 0 && (

140-154: Consider using dynamic max from backend.

The agent count selector hardcodes a maximum of 5 agents (line 149), but the backend provides status.max_agents. Consider using the backend value to keep the UI synchronized with server-side constraints.

🔧 Dynamic max agents
+  const maxAgents = status?.max_agents ?? 5
+
   <button
-    onClick={() => setNumAgents(Math.min(5, numAgents + 1))}
+    onClick={() => setNumAgents(Math.min(maxAgents, numAgents + 1))}
     className="neo-btn neo-btn-secondary p-1"
-    disabled={numAgents >= 5}
+    disabled={numAgents >= maxAgents}
   >
     <Plus size={14} />
   </button>
autonomous_agent_demo.py (2)

163-188: Consider propagating max_iterations to parallel agents.

The parallel execution path does not use args.max_iterations, while the single-agent path does. If --max-iterations is intended to work with parallel mode, it should be passed to the orchestrator or individual agents.


189-195: Redundant KeyboardInterrupt handling.

The KeyboardInterrupt is already caught inside run_parallel() at line 181. The outer catch at line 191 will only trigger if the interrupt occurs during asyncio.run() startup/teardown, which is an edge case. This is acceptable but could be simplified.

ui/src/lib/api.ts (1)

181-187: Consider using ParallelAgentActionResponse type.

The inline return type matches ParallelAgentActionResponse from ./types. Using the shared type would improve consistency and maintainability.

♻️ Suggested refactor
 export async function mergeParallelWorktrees(
   projectName: string
-): Promise<{ success: boolean; agents: Record<string, boolean>; message: string }> {
+): Promise<ParallelAgentActionResponse> {
   return fetchJSON(`/projects/${encodeURIComponent(projectName)}/parallel-agents/merge`, {
     method: 'POST',
   })
 }
server/routers/parallel_agents.py (1)

22-30: Consider moving import to module level.

The dynamic sys.path manipulation inside _get_project_path is a code smell. Consider restructuring imports at module level or using a proper package structure.

worktree.py (2)

100-103: Consider validating existing worktree state.

When a worktree already exists, the method returns the path immediately without validating that the worktree is in a usable state (e.g., not corrupted, branch exists, git operations work). While rare, a corrupted worktree could cause agent startup failures that are difficult to diagnose.

Consider adding basic validation:

if worktree_path.exists():
    # Verify worktree is registered with git
    result = self._run_git("worktree", "list", "--porcelain")
    if str(worktree_path) not in result.stdout:
        logger.warning(f"Worktree path exists but not registered, recreating")
        # Clean up and recreate
    else:
        logger.info(f"Worktree already exists for agent {agent_id}")
        return worktree_path

287-306: Strengthen worktree filtering logic.

Lines 293-295 rely on string matching and prefix removal to identify agent worktrees. This works but is fragile if naming conventions change. Consider using a more robust approach.

        for wt in worktrees:
            path = Path(wt.get("path", ""))
-           if path != self.project_dir and str(self.worktrees_dir) in str(path):
+           # Only remove worktrees we manage (under our worktrees directory)
+           try:
+               path.relative_to(self.worktrees_dir)
+               is_managed = True
+           except ValueError:
+               is_managed = False
+           
+           if is_managed and path != self.project_dir:
                # Extract agent_id from path
-               agent_id = path.name.replace("agent_", "")
+               if path.name.startswith("agent_"):
+                   agent_id = path.name[6:]  # Remove "agent_" prefix
+               else:
+                   continue
                self.remove_worktree(agent_id)
parallel_agents.py (1)

354-360: Wrap blocking git operations in executor.

The method calls self.worktree_manager.merge_worktree_changes() (line 358), which runs synchronous git commands via subprocess. In an async function, these blocking operations should be wrapped in run_in_executor to avoid blocking the event loop.

    async def merge_all_worktrees(self) -> dict[str, bool]:
        """Merge changes from all agent worktrees back to main."""
        results = {}
+       loop = asyncio.get_running_loop()
        for agent_id in self.agents:
-           success = self.worktree_manager.merge_worktree_changes(agent_id)
+           success = await loop.run_in_executor(
+               None,
+               self.worktree_manager.merge_worktree_changes,
+               agent_id
+           )
            results[agent_id] = success
        return results
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 122f03d and ba6975f.

📒 Files selected for processing (20)
  • CLAUDE.md
  • agent.py
  • api/database.py
  • autonomous_agent_demo.py
  • client.py
  • mcp_server/feature_mcp.py
  • parallel_agent_runner.py
  • parallel_agents.py
  • server/main.py
  • server/routers/__init__.py
  • server/routers/parallel_agents.py
  • server/schemas.py
  • ui/src/App.tsx
  • ui/src/components/FeatureCard.tsx
  • ui/src/components/ParallelAgentControl.tsx
  • ui/src/hooks/useProjects.ts
  • ui/src/lib/api.ts
  • ui/src/lib/types.ts
  • ui/tsconfig.tsbuildinfo
  • worktree.py
🧰 Additional context used
🧬 Code graph analysis (8)
ui/src/components/ParallelAgentControl.tsx (2)
ui/src/hooks/useProjects.ts (5)
  • useParallelAgentsStatus (158-165)
  • useStartParallelAgents (167-178)
  • useStopParallelAgents (180-190)
  • useMergeParallelWorktrees (192-201)
  • useCleanupParallelAgents (203-213)
server/schemas.py (1)
  • ParallelAgentInfo (134-140)
parallel_agent_runner.py (3)
agent.py (1)
  • run_autonomous_agent (109-239)
registry.py (1)
  • get_project_path (195-213)
autonomous_agent_demo.py (2)
  • parse_args (45-108)
  • main (111-213)
ui/src/App.tsx (2)
ui/src/components/AgentControl.tsx (1)
  • AgentControl (17-128)
ui/src/components/ParallelAgentControl.tsx (1)
  • ParallelAgentControl (46-255)
ui/src/lib/api.ts (1)
server/schemas.py (2)
  • ParallelAgentsStatus (143-147)
  • ParallelAgentActionResponse (150-154)
ui/src/lib/types.ts (2)
server/schemas.py (3)
  • ParallelAgentInfo (134-140)
  • ParallelAgentsStatus (143-147)
  • ParallelAgentActionResponse (150-154)
server/services/process_manager.py (3)
  • status (87-88)
  • status (91-95)
  • pid (139-140)
server/schemas.py (1)
server/services/process_manager.py (3)
  • status (87-88)
  • status (91-95)
  • pid (139-140)
agent.py (1)
client.py (1)
  • create_client (79-217)
mcp_server/feature_mcp.py (2)
api/database.py (2)
  • Feature (19-46)
  • to_dict (34-46)
ui/src/lib/types.ts (1)
  • Feature (60-67)
🔇 Additional comments (25)
agent.py (3)

114-114: LGTM! Clean parameter addition.

The agent_id parameter is properly typed and documented, enabling agent identification in parallel execution scenarios.


126-146: LGTM! Clear agent identification in output.

The prefix pattern consistently labels all output with the agent ID when running in parallel mode, making logs easy to follow when multiple agents are running.


187-187: LGTM! Proper propagation to client.

The agent_id is correctly passed through to create_client, which will set the AGENT_ID environment variable for the MCP servers (as shown in the relevant code snippet from client.py).

server/routers/__init__.py (1)

12-12: LGTM! Standard router registration.

The parallel agents router is properly imported and exposed, following the same pattern as existing routers.

Also applies to: 20-20

CLAUDE.md (1)

74-109: LGTM! Comprehensive parallel agents documentation.

The documentation clearly explains:

  • CLI usage with --num-agents flag
  • API endpoints for control
  • MCP tools for coordination
  • Operational model (worktrees, shared DB, atomic claiming)

This will help users understand and use the parallel execution feature effectively.

server/main.py (1)

23-23: LGTM! Proper router registration.

The parallel agents router is correctly imported and included in the FastAPI app, following the same pattern as other routers. The comment clearly indicates its purpose.

Also applies to: 93-93

ui/src/App.tsx (1)

11-11: LGTM! UI support for parallel mode toggle.

The import and state initialization enable the parallel agents UI feature.

Also applies to: 38-38

ui/src/lib/types.ts (1)

69-69: LGTM! Type definitions align with server schemas.

The new assigned_agent_id field and parallel agent types are correctly defined and match the backend schemas shown in the relevant snippets.

Also applies to: 102-121

api/database.py (1)

32-32: LGTM! Database schema changes are well-structured.

The assigned_agent_id column is properly defined with appropriate constraints (nullable, indexed) and correctly exposed in the to_dict method for API serialization.

Also applies to: 45-45

ui/src/components/ParallelAgentControl.tsx (1)

46-255: Well-structured component with good UX patterns.

The component provides clear visual feedback, appropriate loading states, and intuitive controls. The mode toggle, agent status badges, and action buttons are well-organized and user-friendly.

autonomous_agent_demo.py (1)

101-106: LGTM!

The --num-agents argument is well-defined with sensible defaults and a clear description. The max limit of 10 agents is enforced both here in the help text and on line 140.

server/schemas.py (2)

134-140: LGTM!

The ParallelAgentInfo schema correctly extends the base status literals with "unknown" to handle edge cases where agent state cannot be determined. This aligns with the router's error handling patterns.


87-87: LGTM!

Adding assigned_agent_id to FeatureResponse aligns with the database model changes in api/database.py and enables the UI to display agent assignment badges on feature cards.

ui/src/hooks/useProjects.ts (2)

158-164: LGTM!

The hook follows the established pattern with the same 3-second polling interval as useAgentStatus. Query key naming is consistent with other hooks.


167-213: LGTM!

The mutation hooks correctly invalidate both parallel-agents-status and agent-status queries where appropriate. The useMergeParallelWorktrees hook correctly only invalidates parallel-agents-status since merging doesn't affect agent running state.

client.py (2)

173-189: LGTM!

The centralized mcp_env dictionary cleanly handles environment propagation for the MCP server. Conditionally adding AGENT_ID only when provided avoids polluting the environment with empty values.


28-30: All three tools are properly registered in the MCP server with matching names. Verification confirms:

  • feature_claim_next registered at feature_mcp.py:382
  • feature_release registered at feature_mcp.py:432
  • feature_clear_in_progress registered at feature_mcp.py:350

All tools are decorated with @mcp.tool() and the tool names in client.py match exactly.

ui/src/lib/api.ts (1)

156-179: LGTM!

The API functions properly encode project names, use correct HTTP methods, and align with the server endpoint definitions in server/routers/parallel_agents.py.

server/routers/parallel_agents.py (2)

65-91: LGTM!

The status endpoint correctly runs a healthcheck before reporting and properly constructs the ParallelAgentInfo objects from orchestrator state.


49-62: Caching is properly implemented. The _get_orchestrator function in parallel_agents.py maintains a module-level _orchestrators dictionary that caches orchestrators per project name (lines 379-383). Each orchestrator is created once and reused across requests, correctly preserving agent state.

mcp_server/feature_mcp.py (2)

381-428: Good use of row-level locking for atomic claims.

The with_for_update() clause provides proper database-level locking to prevent race conditions between parallel agents. The exception handling with rollback is also appropriate.


431-472: LGTM!

The feature_release function correctly validates ownership when agent_id is provided and properly clears both in_progress and assigned_agent_id fields.

parallel_agents.py (3)

73-96: LGTM: Robust callback implementation.

The callback registration and notification methods properly handle errors in individual callbacks without affecting others (lines 84-87, 92-95), which prevents one faulty callback from breaking the entire notification system.


236-273: LGTM: Proper process termination with graceful fallback.

The stop_agent method correctly implements a two-stage shutdown: terminate with a 5-second timeout, then kill if necessary (lines 252-262). The output task is properly canceled and awaited (lines 244-249), preventing resource leaks.


335-352: Good practice: Proactive crashed state detection.

The healthcheck method (lines 335-352) polls process status and proactively updates the agent status to "crashed" when a process exits unexpectedly. This provides better observability than waiting for the next operation to discover the failure.

Comment on lines 166 to 171
if agent_id:
    query = query.filter(
        (Feature.assigned_agent_id == None) |
        (Feature.assigned_agent_id == "") |
        (Feature.assigned_agent_id == agent_id)
    )

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check all places where assigned_agent_id is set
rg -n 'assigned_agent_id\s*=' --type py

Repository: leonvanzyl/autocoder

Length of output: 1096


Remove unnecessary empty-string check from filter conditions.

The filter checks for both None and "" to handle assigned_agent_id, but the code exclusively uses None when clearing the assignment (lines 242, 372, 462). The empty-string check in the filter (lines 168-169 and 405-406) is inconsistent with the actual clearing behavior and appears to be unnecessary dead code. Remove the (Feature.assigned_agent_id == "") condition from both filter blocks to align with the consistent use of None throughout the codebase.

🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 166 - 171, The filter on
Feature.assigned_agent_id inside the query.filter call should not include the
empty-string check; remove the (Feature.assigned_agent_id == "") clause so the
filter only checks for None or the matching agent_id. Update both places where
this pattern appears (the query.filter that currently uses
(Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == "") |
(Feature.assigned_agent_id == agent_id)) to instead use
(Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == agent_id) to
match the codebase’s use of None when clearing assignments.

Comment on lines 400 to 412
feature = (
    session.query(Feature)
    .filter(Feature.passes == False)
    .filter(
        (Feature.in_progress == False) |
        (Feature.assigned_agent_id == None) |
        (Feature.assigned_agent_id == "") |
        (Feature.assigned_agent_id == agent_id)
    )
    .order_by(Feature.priority.asc(), Feature.id.asc())
    .with_for_update()  # Lock the row
    .first()
)

⚠️ Potential issue | 🟠 Major

Potential race condition in claim filter logic.

The current filter allows claiming a feature if in_progress=False OR assigned_agent_id is None/empty. This means a feature with in_progress=True and assigned_agent_id=None (e.g., set by non-parallel agent) could be claimed by a parallel agent, causing conflicts.

Consider using AND logic to ensure the feature is truly available:

🐛 Proposed fix
         feature = (
             session.query(Feature)
             .filter(Feature.passes == False)
             .filter(
-                (Feature.in_progress == False) |
-                (Feature.assigned_agent_id == None) |
-                (Feature.assigned_agent_id == "") |
-                (Feature.assigned_agent_id == agent_id)
+                (Feature.in_progress == False) &
+                ((Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == ""))
+            ) | (
+                (Feature.assigned_agent_id == agent_id)  # Allow re-claiming own feature
             )
             .order_by(Feature.priority.asc(), Feature.id.asc())
             .with_for_update()  # Lock the row
             .first()
         )

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 400 - 412, The claim filter uses OR
which allows a row with in_progress=True and assigned_agent_id==None to be
claimed; change the predicate so in_progress must be False AND the
assigned_agent_id must be either None/empty or equal to agent_id. Update the
.filter block that references Feature.in_progress and Feature.assigned_agent_id
(and uses agent_id) to require Feature.in_progress == False AND
(Feature.assigned_agent_id is None OR == "" OR == agent_id) while keeping the
outer passes == False, ordering, with_for_update(), and .first() intact.

Comment on lines +174 to +202
try:
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        cwd=str(worktree_path),
    )

    agent_info = AgentInfo(
        agent_id=agent_id,
        process=process,
        worktree_path=worktree_path,
        status="running",
        started_at=datetime.now(),
    )
    self.agents[agent_id] = agent_info

    # Start output streaming
    agent_info.output_task = asyncio.create_task(
        self._stream_output(agent_id)
    )

    await self._notify_status(agent_id, "running")
    logger.info(f"Started agent {agent_id} (PID {process.pid})")
    return True

except Exception as e:
    logger.exception(f"Failed to start agent {agent_id}")
    return False

⚠️ Potential issue | 🟠 Major

Use asyncio subprocess instead of blocking Popen.

Using subprocess.Popen (lines 175-180) in an async function blocks the event loop during process creation. For a parallel agent orchestrator, this defeats the purpose of async/await. Additionally, there's no verification that the process actually started—if Popen fails silently or the process exits immediately, the method still returns True.

🔄 Recommended fix using asyncio subprocess
        try:
-           process = subprocess.Popen(
-               cmd,
-               stdout=subprocess.PIPE,
-               stderr=subprocess.STDOUT,
-               cwd=str(worktree_path),
-           )
+           process = await asyncio.create_subprocess_exec(
+               *cmd,
+               stdout=asyncio.subprocess.PIPE,
+               stderr=asyncio.subprocess.STDOUT,
+               cwd=str(worktree_path),
+           )
+           
+           # Give process a moment to start
+           await asyncio.sleep(0.1)
+           
+           # Verify process didn't immediately exit
+           if process.returncode is not None:
+               logger.error(f"Agent {agent_id} process exited immediately with code {process.returncode}")
+               return False

            agent_info = AgentInfo(
                agent_id=agent_id,

Note: This will require updating AgentInfo.process type from subprocess.Popen to asyncio.subprocess.Process and adjusting _stream_output accordingly.

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 285 to 315
async def pause_agent(self, agent_id: str) -> bool:
    """Pause an agent using psutil."""
    agent = self.agents.get(agent_id)
    if not agent or not agent.process or agent.status != "running":
        return False

    try:
        proc = psutil.Process(agent.process.pid)
        proc.suspend()
        agent.status = "paused"
        await self._notify_status(agent_id, "paused")
        return True
    except Exception as e:
        logger.exception(f"Failed to pause agent {agent_id}")
        return False

async def resume_agent(self, agent_id: str) -> bool:
    """Resume a paused agent."""
    agent = self.agents.get(agent_id)
    if not agent or not agent.process or agent.status != "paused":
        return False

    try:
        proc = psutil.Process(agent.process.pid)
        proc.resume()
        agent.status = "running"
        await self._notify_status(agent_id, "running")
        return True
    except Exception as e:
        logger.exception(f"Failed to resume agent {agent_id}")
        return False

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check psutil version and platform support notes

# Check installed psutil version
python -c "import psutil; print(f'psutil version: {psutil.__version__}')" 2>/dev/null || echo "psutil not found"

# Check if suspend/resume are available (they are on most platforms)
python -c "
import psutil
p = psutil.Process()
print(f'suspend available: {hasattr(p, \"suspend\")}')
print(f'resume available: {hasattr(p, \"resume\")}')
print(f'Platform: {psutil.os.name}')
" 2>/dev/null || echo "Error checking psutil methods"

Repository: leonvanzyl/autocoder

Length of output: 150


🌐 Web query:

Python asyncio SIGSTOP SIGCONT process suspension handling side effects

💡 Result:

Short summary and key side effects

  • SIGSTOP cannot be caught, blocked, or ignored; it immediately suspends the entire process (all threads) at the kernel level. The process (and the asyncio event loop) is frozen until a SIGCONT (or similar) resumes it. [1][2]

  • SIGCONT will resume the process; SIGCONT itself can be caught by a handler, but the resume action is performed by the kernel when SIGCONT is generated — the process is continued even if the signal is blocked or ignored; pending signals are then processed when execution resumes. That ordering (resume then deliver/handle pending signals) is defined by POSIX and observed on Linux. [3][4]

  • Practical effects for asyncio programs

    • While stopped by SIGSTOP the event loop cannot run, so no callbacks, tasks, timers or subprocess-watcher callbacks execute while the process is suspended.
    • On resume, pending signals (including SIGCHLD) and any scheduled timeouts or delayed callbacks may be processed immediately. This can create surprising behavior: timeouts that you expected to elapse gradually will instead be handled at once when the process resumes, and signal-driven logic (child watchers, signal handlers) may run in a burst. [1][3]
    • Asyncio’s child-watcher implementations and signal-handling code interact with OS signal semantics; there are known races and complexities (e.g., SIGCHLD handling / child watchers) you should be aware of when suspending/resuming processes. See asyncio child-watcher docs and related bug reports for details. [5][6]
  • Things to watch for / mitigations

    • Don’t rely on sleeping or periodic timers to behave smoothly across a SIGSTOP/SIGCONT cycle; design code to handle timers firing immediately after resume (use explicit monotonic timestamps and check elapsed intervals in your coroutine logic).
    • If you need to run cleanup on suspend/resume, install a SIGCONT handler (SIGSTOP can’t be caught) to perform resume-time recovery/adjustments — but remember SIGCONT’s resume happens at kernel level and handlers only run once execution continues. [3][4]
    • Be cautious with subprocess management and rely on robust child-watcher choices (see asyncio docs) because SIGCHLD delivery and waiting can race with stop/continue cycles. [5][6]

References

  1. Python signal module docs (execution of Python signal handlers). [signal — Set handlers for asynchronous events — Python docs]. [turn0search2]
  2. POSIX / general: SIGKILL and SIGSTOP cannot be caught or ignored. [Open Group / POSIX; common signals references]. [turn1search4][turn1search9]
  3. POSIX & Linux notes on SIGCONT behavior (resume semantics and pending-signal handling). [turn1search4][turn1search7]
  4. Discussion: SIGCONT is effective to resume even if blocked/ignored; order of delivery and resume explained. [turn1search3][turn1search7]
  5. asyncio child-watcher and policy docs (effects for subprocess and SIGCHLD handling). [asyncio policy / child watcher docs]. [turn0search3][turn0search6]
  6. Python issue / discussions on races in asyncio signal/child watcher handling. [Python issue 45718]. [turn0search1]

If you want, I can give a short example showing how to install a SIGCONT handler in asyncio and a pattern for adjusting timers after resume.


Reconsider pause/resume implementation due to documented asyncio compatibility issues.

The pause and resume operations use psutil.Process.suspend() and .resume() (lines 292-293, 308-309), which translate to SIGSTOP/SIGCONT on Unix and kernel-level suspension on Windows. Research on asyncio behavior shows significant practical issues:

  • SIGSTOP freezes the entire process and asyncio event loop, blocking all callbacks, timers, and I/O handlers
  • On resume via SIGCONT, pending signals (SIGCHLD, etc.), timeouts, and callbacks all fire at once in a burst rather than being processed normally—this causes timeouts expected to elapse gradually to trigger immediately
  • Asyncio's child-watcher and signal handling interact with these stop/continue cycles in ways that create known races and unpredictable ordering
  • Outstanding async operations (API requests, database queries, I/O) may timeout, fail, or encounter inconsistent state when the process is suspended mid-execution

For a multi-agent system managing Claude API clients and external processes, this creates significant reliability risks. Either:

  1. Replace pause/resume with stop/start (cleaner state transition), or
  2. If pause/resume is required, extensively document the limitations and test thoroughly across target platforms, including recovery logic for interrupted async operations and signal handling.
🤖 Prompt for AI Agents
In @parallel_agents.py around lines 285 - 315, pause_agent and resume_agent
currently call psutil.Process.suspend()/resume() (psutil.Process.suspend,
psutil.Process.resume) which uses SIGSTOP/SIGCONT and will freeze the entire
asyncio event loop and cause unpredictable behavior; replace these methods with
stop/start semantics instead: remove calls to suspend()/resume(), implement
pause_agent to perform a graceful shutdown/stop of the agent process (call the
existing stop logic or a new stop_agent/_terminate_agent flow that closes API
clients, cancels outstanding async tasks, and waits for process termination),
and implement resume_agent as start_agent (or spawn a fresh process via your
existing start logic) that re-initializes the agent state and then calls
_notify_status(agent_id, "running"); if true pause/resume semantics must be
kept, add explicit documentation/comments in pause_agent/resume_agent about
platform-specific asyncio risks and add extensive tests/recovery logic for
interrupted async operations instead of using psutil suspend/resume.
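
For illustration, a minimal sketch of the stop/start alternative the prompt describes (hypothetical: it assumes the orchestrator's existing stop_agent and a start_agent that can relaunch into the same worktree; signatures are assumed, and this is not the committed implementation):

```python
async def pause_agent(self, agent_id: str) -> bool:
    """Sketch: 'pause' as a graceful stop instead of SIGSTOP."""
    agent = self.agents.get(agent_id)
    if not agent or agent.status != "running":
        return False
    await self.stop_agent(agent_id)  # existing terminate-then-kill shutdown
    agent.status = "paused"
    await self._notify_status(agent_id, "paused")
    return True

async def resume_agent(self, agent_id: str) -> bool:
    """Sketch: 'resume' re-spawns the agent in its existing worktree."""
    agent = self.agents.get(agent_id)
    if not agent or agent.status != "paused":
        return False
    return await self.start_agent(agent_id)  # signature assumed; notifies "running"
```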

Comment on lines 248 to 285
def merge_worktree_changes(self, agent_id: str, commit_message: Optional[str] = None) -> bool:
    """
    Merge changes from an agent's worktree back to main branch.

    Args:
        agent_id: Unique identifier for the agent
        commit_message: Optional commit message

    Returns:
        True if successful, False otherwise
    """
    worktree_path = self.get_worktree_path(agent_id)
    branch = f"agent-{agent_id}"

    if not worktree_path.exists():
        logger.error(f"Worktree for agent {agent_id} does not exist")
        return False

    # Commit any uncommitted changes in the worktree
    self._run_git("add", "-A", cwd=worktree_path)
    msg = commit_message or f"Agent {agent_id} changes"
    self._run_git("commit", "-m", msg, "--allow-empty", cwd=worktree_path)

    # Get current branch in main repo
    result = self._run_git("branch", "--show-current")
    main_branch = result.stdout.strip() or "main"

    # Merge the agent branch into main
    result = self._run_git("merge", branch, "--no-edit")

    if result.returncode != 0:
        logger.warning(f"Merge conflict for agent {agent_id}: {result.stderr}")
        # Abort the merge
        self._run_git("merge", "--abort")
        return False

    logger.info(f"Merged changes from agent {agent_id}")
    return True

⚠️ Potential issue | 🟠 Major

Ensure merge targets the correct branch.

The method merges agent changes into whatever branch is currently checked out in the main repository (line 276), without first checking out the intended target branch. If a user is on a feature branch when merge is triggered, agent changes will merge into that branch instead of the main development branch.

Additionally, when merge conflicts occur (lines 278-282), the merge is aborted but the worktree remains in an uncommitted state, and no guidance is provided to resolve the conflict.

🔧 Proposed fix
        # Get current branch in main repo
        result = self._run_git("branch", "--show-current")
        main_branch = result.stdout.strip() or "main"
+       
+       # Checkout the main branch before merging
+       result = self._run_git("checkout", main_branch)
+       if result.returncode != 0:
+           logger.error(f"Failed to checkout {main_branch}: {result.stderr}")
+           return False

        # Merge the agent branch into main
        result = self._run_git("merge", branch, "--no-edit")

        if result.returncode != 0:
            logger.warning(f"Merge conflict for agent {agent_id}: {result.stderr}")
+           logger.warning(f"Manual resolution required. Agent branch: {branch}")
            # Abort the merge
            self._run_git("merge", "--abort")
            return False
🤖 Prompt for AI Agents
In @worktree.py around lines 248 - 285, merge_worktree_changes currently merges
the agent branch into whatever branch is checked out (using _run_git("merge",
branch)) which can merge into a feature branch; update merge_worktree_changes to
explicitly checkout the intended target branch (use the computed main_branch
from _run_git("branch", "--show-current") or fallback "main") in the main repo
via _run_git("checkout", main_branch) before running _run_git("merge", branch),
and ensure you pass the correct cwd for main repo operations (not worktree_path)
when checking out and merging; additionally, when a merge conflict occurs
(result.returncode != 0) keep the merge aborted but leave a clear return path:
call _run_git("merge", "--abort") for cleanup, log a user-facing message that
the worktree has conflicts and must be resolved manually (include
result.stderr), and do not silently leave the worktree in an uncommitted state —
optionally return False so caller can prompt conflict resolution.

- Fix race condition in feature_claim_next by using AND logic instead of OR
- Remove unnecessary empty-string checks from filter conditions
- Make agent startup concurrent with asyncio.gather() for faster init
- Add docstrings warning about SIGSTOP/SIGCONT limitations with asyncio
- Fix AttributeError when accessing non-existent agent status
- Improve git init error handling (return False on add/commit failure)
- Better symlink failure handling with marker file fallback
- Fix merge to checkout target branch before merging
- Add *.tsbuildinfo to .gitignore and remove from tracking

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @server/routers/parallel_agents.py:
- Around line 49-62: The import of parallel_agents in get_orchestrator is
fragile because it occurs before ensuring the repository root is on sys.path;
add the same sys.path management used in _get_project_path: compute root as
Path(__file__).parent.parent.parent, insert str(root) into sys.path if missing,
then import parallel_agents.get_orchestrator as _get_orchestrator; keep the rest
of the function (calls to validate_project_name, _get_project_path, and
returning _get_orchestrator(project_name, project_dir, ROOT_DIR)) unchanged.
🧹 Nitpick comments (6)
worktree.py (1)

314-333: cleanup_all_worktrees relies on string matching for path filtering.

The condition at line 320 uses str(self.worktrees_dir) in str(path) which could match unintended paths if the worktrees directory name appears as a substring elsewhere. Consider using path.is_relative_to(self.worktrees_dir) for stricter matching (Python 3.9+).

♻️ Safer path comparison
-            if path != self.project_dir and str(self.worktrees_dir) in str(path):
+            if path != self.project_dir and path.is_relative_to(self.worktrees_dir):
server/routers/parallel_agents.py (3)

22-30: Inline import with sys.path manipulation is fragile.

The _get_project_path function modifies sys.path at runtime (lines 24-27). This approach works but is fragile in production. Consider restructuring imports at module level or using proper package configuration.
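
One hedged sketch of the module-level alternative, assuming the router file sits three levels below the repo root as the review's own prompt describes (the exact layout in the repo may differ):

import sys
from pathlib import Path

# Compute the repo root once at import time instead of inside each handler
ROOT_DIR = Path(__file__).resolve().parent.parent.parent
if str(ROOT_DIR) not in sys.path:
    sys.path.insert(0, str(ROOT_DIR))

# Root-level modules now import reliably regardless of execution context
from parallel_agents import get_orchestrator as _get_orchestrator  # noqa: E402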


117-137: Stop endpoint assumes all stops succeed without verification.

Lines 131-136 build results assuming all running agents stopped successfully, but stop_all_agents doesn't return individual success/failure status. If an agent fails to stop, the response still indicates success.

Consider returning actual stop results from the orchestrator or verifying agent statuses after stopping:

♻️ Verify actual stop status
     await orchestrator.stop_all_agents()

-    # Build results dict
-    results = {aid: True for aid in running_agents}
+    # Verify actual status after stop
+    results = {
+        aid: orchestrator.agents[aid].status == "stopped"
+        for aid in running_agents
+        if aid in orchestrator.agents
+    }

     return ParallelAgentActionResponse(
-        success=True,
+        success=all(results.values()) if results else True,
         agents=results,
         message=f"Stopped {len(running_agents)} agents",
     )

215-225: Cleanup endpoint could mask partial failures.

The endpoint returns success: True unconditionally (line 223) without checking if all agents actually stopped or if worktree cleanup succeeded. Consider returning actual cleanup status.

parallel_agents.py (2)

381-387: merge_all_worktrees calls blocking I/O in async function.

worktree_manager.merge_worktree_changes (line 385) is a synchronous method that runs git commands. Calling it directly in an async function blocks the event loop during each merge.

♻️ Use executor for blocking git operations
     async def merge_all_worktrees(self) -> dict[str, bool]:
         """Merge changes from all agent worktrees back to main."""
         results = {}
+        loop = asyncio.get_running_loop()
         for agent_id in self.agents:
-            success = self.worktree_manager.merge_worktree_changes(agent_id)
+            success = await loop.run_in_executor(
+                None, self.worktree_manager.merge_worktree_changes, agent_id
+            )
             results[agent_id] = success
         return results

395-410: Global orchestrator registry may cause stale state.

The _orchestrators cache (line 396) persists orchestrators indefinitely. If a project's directory changes or an orchestrator enters a bad state, callers continue receiving the stale instance. Consider adding a mechanism to invalidate or refresh orchestrators.
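
A minimal sketch of one possible invalidation hook, assuming the module-level _orchestrators dict named above (the helper itself is hypothetical):

_orchestrators: dict[str, "ParallelAgentOrchestrator"] = {}

def invalidate_orchestrator(project_name: str) -> None:
    """Drop a cached orchestrator so the next lookup rebuilds it fresh."""
    _orchestrators.pop(project_name, None)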

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba6975f and 75a99f2.

📒 Files selected for processing (6)
  • .gitignore
  • mcp_server/feature_mcp.py
  • parallel_agents.py
  • server/routers/parallel_agents.py
  • ui/tsconfig.tsbuildinfo
  • worktree.py
💤 Files with no reviewable changes (1)
  • ui/tsconfig.tsbuildinfo
🧰 Additional context used
🧬 Code graph analysis (3)
mcp_server/feature_mcp.py (1)
api/database.py (2)
  • Feature (19-46)
  • to_dict (34-46)
server/routers/parallel_agents.py (4)
server/schemas.py (3)
  • ParallelAgentStartRequest (128-131)
  • ParallelAgentInfo (134-140)
  • ParallelAgentActionResponse (150-154)
registry.py (1)
  • get_project_path (195-213)
parallel_agents.py (2)
  • get_orchestrator (399-410)
  • healthcheck (362-379)
server/services/process_manager.py (1)
  • pid (139-140)
parallel_agents.py (3)
worktree.py (4)
  • ensure_git_repo (51-79)
  • create_worktree (85-147)
  • merge_worktree_changes (263-312)
  • cleanup_all_worktrees (314-333)
server/services/process_manager.py (2)
  • pid (139-140)
  • resume (338-359)
server/routers/parallel_agents.py (2)
  • merge_all_worktrees (202-212)
  • get_orchestrator (49-62)
🔇 Additional comments (23)
.gitignore (1)

101-101: ✅ Good practice to ignore TypeScript build artifacts.

Adding *.tsbuildinfo to the "Build outputs" section is appropriate and follows standard practice for TypeScript projects. Build info files are generated during compilation and should not be committed.

mcp_server/feature_mcp.py (6)

145-179: LGTM on feature_get_next with agent filtering.

The updated logic correctly filters features by excluding those assigned to other agents when agent_id is provided. The filter uses only None checks (no empty-string condition), which aligns with the codebase's consistent use of None for clearing assignments.


303-345: LGTM on feature_mark_in_progress with agent assignment.

The logic properly checks if a feature is already in progress by another agent before allowing the assignment. The condition at lines 330-335 correctly prevents one agent from claiming a feature already assigned to a different agent.


380-428: feature_claim_next correctly implements atomic claiming.

The filter logic now uses AND (&) to ensure features are truly available: in_progress=False AND assigned_agent_id=None, or already assigned to the requesting agent. The with_for_update() row lock provides atomicity against concurrent claims. The exception handler properly rolls back on failure.

Previous review concerns about race conditions have been addressed.
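
For illustration, the claiming pattern described above might look roughly like this in SQLAlchemy terms (a sketch only; column and session names are assumptions, not the exact code in feature_mcp.py):

from sqlalchemy import and_, or_

# Lock and claim the first truly available feature, or one this agent already owns
feature = (
    session.query(Feature)
    .filter(
        or_(
            and_(Feature.in_progress == False,        # noqa: E712
                 Feature.assigned_agent_id == None),  # noqa: E711
            Feature.assigned_agent_id == agent_id,
        )
    )
    .with_for_update()  # row lock: concurrent claimers serialize here
    .first()
)
try:
    if feature is not None:
        feature.in_progress = True
        feature.assigned_agent_id = agent_id
        session.commit()
except Exception:
    session.rollback()
    raise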


431-472: LGTM on feature_release.

The ownership validation at lines 455-459 correctly prevents agents from releasing features they don't own. The response includes helpful status information.


216-247: LGTM on feature_mark_passing clearing agent assignment.

Clearing assigned_agent_id on completion (line 241) ensures the feature is fully released when marked as passing.


348-377: LGTM on feature_clear_in_progress clearing agent assignment.

The update correctly clears both in_progress and assigned_agent_id (lines 370-371), returning the feature to an available state.

worktree.py (5)

51-79: Error handling in ensure_git_repo is now correct.

The method now returns False when git add or git commit fail (lines 70-77), addressing the previous concern about returning True despite failures.


225-261: Checked for an off-by-one in list_worktrees parsing — none found.

The parsing loop correctly handles the porcelain format; even if the output ends without a trailing blank line, the final worktree entry is still captured at lines 258-259. This looks correct.
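
As a rough sketch of the porcelain handling being reviewed (the real list_worktrees() in worktree.py may differ in detail):

def parse_worktree_list(porcelain: str) -> list[dict]:
    """Parse `git worktree list --porcelain` output into one dict per entry."""
    worktrees, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():
            if current:                 # blank line terminates an entry
                worktrees.append(current)
                current = {}
        elif " " in line:
            key, value = line.split(" ", 1)
            current[key] = value        # e.g. "worktree /path", "branch refs/..."
        else:
            current[line] = True        # bare flags like "bare" or "detached"
    if current:                         # final entry may lack a trailing blank line
        worktrees.append(current)
    return worktrees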


263-312: Merge logic now explicitly checks out target branch first.

Lines 290-294 now ensure the main repository is on the target branch before merging, and conflict handling provides actionable guidance (lines 300-308). This addresses the previous concern about merging into the wrong branch.


85-147: LGTM on create_worktree with fallback for existing branches.

The logic correctly handles both new branch creation (lines 123-128) and fallback to existing branches (lines 131-136). The early return for existing worktrees (lines 103-105) prevents duplicate work.


191-223: LGTM on remove_worktree with manual fallback.

The fallback to shutil.rmtree (lines 212-217) when git removal fails is a reasonable approach for cleanup scenarios.

server/routers/parallel_agents.py (4)

39-46: Project name validation is appropriate for path traversal prevention.

The regex at line 41 restricts names to alphanumeric, underscore, and hyphen characters with a 50-character limit, which prevents path traversal attacks like ../ or null bytes.
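
An illustrative version of such a check (the exact regex and error text live in the router; FastAPI's HTTPException is assumed, matching its use elsewhere in this file):

import re
from fastapi import HTTPException

_NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,50}$")

def validate_project_name(name: str) -> str:
    """Reject names that could smuggle path separators, dots, or null bytes."""
    if not _NAME_RE.fullmatch(name):
        raise HTTPException(status_code=400, detail="Invalid project name")
    return name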


65-91: LGTM on status endpoint.

The endpoint correctly runs a healthcheck before reporting status, and properly constructs ParallelAgentInfo objects with all required fields.


140-156: Agent status retrieval is now correct.

The previous issue with accessing .status on an empty dict fallback has been fixed. Line 151 now uses agent = orchestrator.agents.get(agent_id) and line 155 correctly accesses agent.status if agent else "unknown".


201-212: LGTM on merge endpoint.

The endpoint properly aggregates results and reports success/failure counts.

parallel_agents.py (7)

101-140: start_agents now uses concurrent startup.

The implementation correctly uses asyncio.gather(*tasks, return_exceptions=True) (line 131) to start agents concurrently rather than sequentially. Exception handling properly converts failures to False results.
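
A condensed sketch of that concurrent-startup shape, under the assumption that start_agent(agent_id) returns True on success (the real method tracks more per-agent state):

import asyncio

async def start_agents(self, num_agents: int) -> dict[str, bool]:
    agent_ids = [f"agent-{i}" for i in range(1, num_agents + 1)]
    tasks = [self.start_agent(agent_id) for agent_id in agent_ids]
    # return_exceptions=True turns one agent's failure into a result
    # instead of aborting the whole batch
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return {aid: res is True for aid, res in zip(agent_ids, results)}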


293-342: Pause/resume limitations are now documented.

The extensive docstrings (lines 297-304, 324-328) clearly warn about SIGSTOP/SIGCONT behavior with asyncio. This addresses the previous concern by documenting the caveats, though the underlying implementation risks remain if users rely on pause/resume with long-running async operations.


212-242: _stream_output correctly uses executor for non-blocking I/O.

The run_in_executor pattern (lines 221-223) properly offloads blocking readline() calls. The finally block correctly updates agent status based on exit code.
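
The pattern in question looks roughly like this (a sketch; the real _stream_output carries agent bookkeeping, and on_line is a hypothetical callback):

import asyncio

async def stream_output(process, on_line) -> None:
    loop = asyncio.get_running_loop()
    try:
        while True:
            # readline() blocks, so run it on the default thread pool
            line = await loop.run_in_executor(None, process.stdout.readline)
            if not line:
                break
            on_line(line)
    finally:
        # the real implementation maps the exit code to a final agent status
        await loop.run_in_executor(None, process.wait)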


244-281: LGTM on stop_agent with graceful termination.

The implementation properly cancels the output task, terminates the process with a timeout, and falls back to kill() if needed (lines 268-270).
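
In outline, that termination sequence could be sketched as follows (assuming AgentInfo holds .process and .output_task, per the dataclass noted below):

import asyncio

async def stop_agent(self, agent_id: str) -> bool:
    agent = self.agents.get(agent_id)
    if agent is None or agent.process is None:
        return False
    if agent.output_task is not None:
        agent.output_task.cancel()          # stop streaming first
    agent.process.terminate()               # polite SIGTERM
    loop = asyncio.get_running_loop()
    try:
        await asyncio.wait_for(
            loop.run_in_executor(None, agent.process.wait), timeout=5
        )
    except asyncio.TimeoutError:
        agent.process.kill()                # escalate if it will not exit
    agent.status = "stopped"
    return True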


362-379: LGTM on healthcheck.

The healthcheck correctly detects crashed processes via poll() and updates status accordingly.
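
Roughly, the poll()-based check amounts to this (status strings follow the ones used elsewhere in this PR; the real method may update more state):

async def healthcheck(self) -> dict[str, bool]:
    results = {}
    for agent_id, agent in self.agents.items():
        alive = agent.process is not None and agent.process.poll() is None
        if not alive and agent.status == "running":
            agent.status = "crashed"   # exited without a clean stop
        results[agent_id] = alive
    return results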


26-35: LGTM on AgentInfo dataclass.

The dataclass cleanly captures all agent state with appropriate defaults.


47-71: LGTM on orchestrator initialization.

The constructor properly initializes the WorktreeManager and tracking structures.

Comment on lines +49 to +62
def get_orchestrator(project_name: str):
    """Get or create the parallel agent orchestrator for a project."""
    from parallel_agents import get_orchestrator as _get_orchestrator

    project_name = validate_project_name(project_name)
    project_dir = _get_project_path(project_name)

    if not project_dir:
        raise HTTPException(status_code=404, detail=f"Project '{project_name}' not found in registry")

    if not project_dir.exists():
        raise HTTPException(status_code=404, detail=f"Project directory not found: {project_dir}")

    return _get_orchestrator(project_name, project_dir, ROOT_DIR)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's understand the project structure and locate parallel_agents module
echo "=== Project structure ==="
ls -la

echo -e "\n=== Search for parallel_agents module ==="
find . -type f -name "*parallel_agents*" | head -20

echo -e "\n=== Search for ParallelAgentOrchestrator class ==="
rg -l 'class ParallelAgentOrchestrator' --type py

echo -e "\n=== Search for parallel_agents imports ==="
rg 'from parallel_agents import|import parallel_agents' --type py

Repository: leonvanzyl/autocoder

Length of output: 2589


🏁 Script executed:

#!/bin/bash
# Check the current file and its imports
echo "=== Content of server/routers/parallel_agents.py ==="
cat -n server/routers/parallel_agents.py | head -70

echo -e "\n=== Check if __init__.py exists in server directory ==="
ls -la server/ | grep __init__

echo -e "\n=== Check if __init__.py exists in server/routers directory ==="
ls -la server/routers/ | grep __init__

Repository: leonvanzyl/autocoder

Length of output: 2904


🏁 Script executed:

#!/bin/bash
# Look for sys.path configuration or setup.py
echo "=== Search for sys.path modifications or setup ==="
rg 'sys.path' --type py -B 2 -A 2

echo -e "\n=== Check for setup.py, pyproject.toml, or similar ==="
find . -maxdepth 2 -type f \( -name "setup.py" -o -name "pyproject.toml" -o -name "setup.cfg" -o -name "MANIFEST.in" \)

Repository: leonvanzyl/autocoder

Length of output: 4048


Add sys.path management before importing parallel_agents to match the established pattern.

Line 51 imports parallel_agents without ensuring the root directory is in sys.path. The function _get_project_path (lines 22-30) and other functions throughout the codebase properly manage sys.path before importing root-level modules. While the current implementation may work if _get_project_path is called first (line 54), this creates a fragile implicit dependency. Add the same sys.path setup pattern used in _get_project_path to ensure the import is reliable regardless of execution context:

def get_orchestrator(project_name: str):
    """Get or create the parallel agent orchestrator for a project."""
    import sys
    root = Path(__file__).parent.parent.parent
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    
    from parallel_agents import get_orchestrator as _get_orchestrator
    
    project_name = validate_project_name(project_name)
    project_dir = _get_project_path(project_name)
    ...
🤖 Prompt for AI Agents
In @server/routers/parallel_agents.py around lines 49 - 62, The import of
parallel_agents in get_orchestrator is fragile because it occurs before ensuring
the repository root is on sys.path; add the same sys.path management used in
_get_project_path: compute root as Path(__file__).parent.parent.parent, insert
str(root) into sys.path if missing, then import parallel_agents.get_orchestrator
as _get_orchestrator; keep the rest of the function (calls to
validate_project_name, _get_project_path, and returning
_get_orchestrator(project_name, project_dir, ROOT_DIR)) unchanged.

worktree.py Outdated
Comment on lines 149 to 179
def _setup_shared_resources(self, worktree_path: Path) -> None:
    """
    Set up shared resources in the worktree.

    The features.db is shared across all worktrees via symlink.
    This is critical for atomic feature claiming via feature_claim_next().

    Raises:
        RuntimeError: If features.db cannot be shared with the worktree.
    """
    # Create symlink to shared features.db
    features_db = self.project_dir / "features.db"
    worktree_db = worktree_path / "features.db"

    if features_db.exists() and not worktree_db.exists():
        try:
            # Try symlink first (preferred)
            worktree_db.symlink_to(features_db)
            logger.debug(f"Created symlink to features.db in {worktree_path}")
        except OSError as e:
            # Symlink failed - this breaks shared DB coordination
            # The MCP server uses PROJECT_DIR env var which points to worktree,
            # so we need the features.db to be accessible there
            logger.warning(
                f"Symlink creation failed ({e}). "
                f"Ensure MCP server is configured to use {features_db} directly."
            )
            # Write a marker file to indicate the absolute DB path
            db_path_file = worktree_path / ".features_db_path"
            db_path_file.write_text(str(features_db))
            logger.info(f"Wrote features.db path to {db_path_file}")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for .features_db_path usage
rg -n '\.features_db_path' --type py

Repository: leonvanzyl/autocoder

Length of output: 146


🏁 Script executed:

# Also check for any references to features.db path handling in parallel_agent_runner.py
rg -n 'features.db' --type py | head -20

Repository: leonvanzyl/autocoder

Length of output: 1635


🏁 Script executed:

# Check the MCP server initialization files
fd -e py -name '*mcp*' -o -name '*server*' | head -20

Repository: leonvanzyl/autocoder

Length of output: 294


🏁 Script executed:

# Search for any code that reads from marker files or reads .features_db_path
rg -n 'read_text|open.*\.features_db_path' --type py

Repository: leonvanzyl/autocoder

Length of output: 1378


🏁 Script executed:

# Look at the features.py router to see how it accesses features.db
cat -n server/routers/features.py | head -50

Repository: leonvanzyl/autocoder

Length of output: 1617


🏁 Script executed:

# Search for parallel_agent_runner.py
fd parallel_agent_runner

Repository: leonvanzyl/autocoder

Length of output: 88


🏁 Script executed:

# Check if there's any code that reads marker files from worktree_path
rg -n 'worktree_path.*read|read.*marker|\.features_db_path' --type py

Repository: leonvanzyl/autocoder

Length of output: 146


🏁 Script executed:

# Check parallel_agent_runner.py to see if it handles the marker file
cat -n parallel_agent_runner.py | head -100

Repository: leonvanzyl/autocoder

Length of output: 3240


🏁 Script executed:

# Search for how parallel_agent_runner.py sets up the database
rg -n 'features.db|PROJECT_DIR|worktree' parallel_agent_runner.py -A 3 -B 1

Repository: leonvanzyl/autocoder

Length of output: 1513


The .features_db_path marker file is never read — the fallback is incomplete.

The marker file (lines 177-178) is written when symlink creation fails, but no code in the repository reads it. The database coordination currently works only because parallel_agent_runner.py sets PROJECT_DIR to the main project directory (line 99), allowing the router and agent to access the shared features.db. If the symlink fails, the written marker file provides no actual recovery path since no MCP server, runner, or application code is configured to read it. Either remove the unused marker file or implement the reading mechanism in the code that needs to handle the fallback case.

- Use Path.is_relative_to() instead of string matching in worktree cleanup
  to prevent false positives from substring matches
- Add missing docstrings to improve coverage above 80% threshold:
  - run_parallel() in autonomous_agent_demo.py
  - read_file(), write_file() in server/routers/projects.py
  - ConnectionManager.__init__() in server/websocket.py
  - status getter/setter, pid property in process_manager.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @autonomous_agent_demo.py:
- Around line 163-170: The parallel runner ignores the --max-iterations flag;
modify run_parallel to forward the max_iterations value to
orchestrator.start_agents and ensure the orchestrator and its agent-launching
code accept and pass that parameter into each agent's run_autonomous_agent call
(or change orchestrator.start_agents signature to accept max_iterations and
propagate it when creating/starting agents), so parallel agents honor the same
iteration limit as the single-agent path.
- Around line 173-180: The healthcheck loop treats False as both crashed and
finished, so change the loop to query per-agent statuses (e.g., add or use
orchestrator.get_agent_statuses() or similar) and distinguish terminal
"finished" states from "crashed"/"error" states before printing the completion
message; specifically, when running == 0 call orchestrator.get_agent_statuses()
(or introduce that method if it doesn't exist) and if all agents report a
terminal success state print "All agents have finished", otherwise log an error
or "Agents crashed" with the list of failed agent IDs/names.

In @worktree.py:
- Around line 113-118: The fallback empty commit call using
self._run_git("commit", "--allow-empty", "-m", "Initial commit") is not checked;
capture its return value (e.g., commit_result = self._run_git(...)) and verify
commit_result.returncode before proceeding to run rev-parse; if the commit
failed log a descriptive error with the commit_result output (using
logger.error) and return None (or raise) instead of continuing to call
self._run_git("rev-parse", "HEAD").
🧹 Nitpick comments (4)
server/services/process_manager.py (1)

99-111: Consider logging when notifications are silently dropped.

The except RuntimeError: pass block silently discards status change notifications when no event loop is running. While current usage ensures an event loop exists (all status mutations are in async methods), a debug-level log here would aid troubleshooting if this assumption changes in the future.

🔧 Optional: Add debug logging for dropped notifications
         except RuntimeError:
             # No running event loop
-            pass
+            logger.debug("Status change notification dropped: no running event loop")
autonomous_agent_demo.py (2)

101-106: Consider adding input validation for --num-agents.

Negative or zero values will silently fall through to single-agent mode due to the num_agents > 1 check on line 142. While not incorrect, this may confuse users who expect an error for invalid input.

🔧 Optional: Add minimum value validation
     parser.add_argument(
         "--num-agents",
         type=int,
         default=1,
-        help="Number of parallel agents to run (default: 1, max: 10)",
+        choices=range(1, 11),
+        metavar="N",
+        help="Number of parallel agents to run (1-10, default: 1)",
     )

Alternatively, validate after parsing and show a clear error message.


174-180: Consider adding periodic status output during polling.

The 5-second polling loop is silent. For long-running parallel agent sessions, users have no visibility into progress. Consider logging agent count or a brief status message periodically.

🔧 Optional: Add status logging
                 while True:
                     health = await orchestrator.healthcheck()
                     running = sum(1 for v in health.values() if v)
                     if running == 0:
                         print("\nAll agents have finished.")
                         break
+                    # Provide periodic visibility
+                    print(f"\r{running}/{len(health)} agents running...", end="", flush=True)
                     await asyncio.sleep(5)
worktree.py (1)

163-179: Remove the unused marker file code.

Lines 177-178 write a .features_db_path marker file that is never read by any code in the repository. Per the earlier review discussion, the system works because parallel_agent_runner.py sets PROJECT_DIR to the main project directory, allowing direct access to features.db. Consider removing the dead marker file code to improve maintainability.

♻️ Proposed cleanup
            except OSError as e:
                # Symlink failed - this breaks shared DB coordination
                # The MCP server uses PROJECT_DIR env var which points to worktree,
                # so we need the features.db to be accessible there
                logger.warning(
                    f"Symlink creation failed ({e}). "
                    f"Ensure MCP server is configured to use {features_db} directly."
                )
-               # Write a marker file to indicate the absolute DB path
-               db_path_file = worktree_path / ".features_db_path"
-               db_path_file.write_text(str(features_db))
-               logger.info(f"Wrote features.db path to {db_path_file}")
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75a99f2 and c90c769.

📒 Files selected for processing (5)
  • autonomous_agent_demo.py
  • server/routers/projects.py
  • server/services/process_manager.py
  • server/websocket.py
  • worktree.py
🧰 Additional context used
🧬 Code graph analysis (1)
autonomous_agent_demo.py (3)
parallel_agents.py (5)
  • ParallelAgentOrchestrator (37-392)
  • start_agents (101-140)
  • healthcheck (362-379)
  • stop_all_agents (283-291)
  • cleanup (389-392)
server/services/process_manager.py (1)
  • healthcheck (364-384)
agent.py (1)
  • run_autonomous_agent (109-239)
🔇 Additional comments (20)
server/websocket.py (1)

54-54: LGTM! Documentation improvement.

The added docstring clearly describes the initialization purpose and improves code documentation.

server/routers/projects.py (2)

279-280: LGTM! Clear documentation added.

The docstring accurately describes the helper function's behavior of returning an empty string when the file is not found or cannot be read.


314-315: LGTM! Clear documentation added.

The docstring accurately describes the helper function's conditional behavior of only writing when content is provided (not None).

server/services/process_manager.py (4)

86-89: LGTM!

Clear and concise docstring for the property getter.


91-97: LGTM!

The setter correctly compares old and new status to avoid unnecessary notifications, and the docstring accurately describes the behavior. Since all status mutations occur within async methods on the event loop thread, the lack of locking around the read-compare-write is acceptable.


113-118: LGTM!

Solid defensive programming. The method safely wraps callback execution and prevents callback errors from propagating to the caller. Warning-level logging is appropriate for this use case.


140-143: LGTM!

Clear docstring accurately describes the property's return behavior.

autonomous_agent_demo.py (1)

197-214: LGTM!

Single-agent mode is correctly preserved as an else branch with appropriate error handling that matches the parallel mode pattern.

worktree.py (12)

1-16: LGTM!

Module structure and imports are clean and appropriate.


26-34: LGTM!

Initialization logic is sound. Using resolve() ensures absolute paths, and placing worktrees in the parent directory keeps them isolated from the main repository.


36-44: LGTM!

The helper method correctly captures output and allows flexible working directory specification.


46-49: LGTM!

Standard and reliable method to check for Git repository presence.


51-79: LGTM!

Error handling has been properly addressed. The method now correctly returns False when git add or git commit fail, preventing downstream issues with worktree creation.


81-83: LGTM!

Simple and clear path construction.


122-147: LGTM!

Worktree creation logic properly handles the case where the branch already exists, and shared resources are set up correctly.


181-190: LGTM!

The prompts directory setup includes a proper fallback mechanism using shutil.copytree when symlinks are not supported.


191-223: LGTM!

Robust worktree removal with appropriate fallback handling when Git commands fail.


225-261: LGTM!

Correct parsing of Git's porcelain output format for worktree listing.


263-312: LGTM!

The merge logic has been properly fixed. The method now correctly checks out the target branch before merging and provides clear guidance when conflicts occur.


314-335: The code at lines 314–335 is compatible with the project's Python version requirements. The pyproject.toml targets Python 3.11 (via ruff's target-version = "py311"), and Path.is_relative_to() has been available since Python 3.9. No Python version compatibility issue exists here.

Likely an incorrect or invalid review comment.

Comment on lines 173 to 180
# Wait for all agents to complete (or user interrupt)
while True:
    health = await orchestrator.healthcheck()
    running = sum(1 for v in health.values() if v)
    if running == 0:
        print("\nAll agents have finished.")
        break
    await asyncio.sleep(5)

⚠️ Potential issue | 🟡 Minor

Health check doesn't distinguish completion from crash.

The loop exits when running == 0, but this occurs both when agents complete successfully and when they crash. The message "All agents have finished" is misleading if agents crashed.

From the healthcheck() implementation, False is returned for both crashed and stopped agents. Consider checking actual agent statuses before printing the completion message.

🔧 Suggested improvement
                 # Wait for all agents to complete (or user interrupt)
                 while True:
                     health = await orchestrator.healthcheck()
                     running = sum(1 for v in health.values() if v)
                     if running == 0:
-                        print("\nAll agents have finished.")
+                        # Check if any agents crashed
+                        statuses = orchestrator.get_agent_statuses()
+                        crashed = [aid for aid, s in statuses.items() if s == "crashed"]
+                        if crashed:
+                            print(f"\nAgents finished. Crashed: {crashed}")
+                        else:
+                            print("\nAll agents completed successfully.")
                         break
                     await asyncio.sleep(5)

This requires exposing agent statuses from the orchestrator (e.g., a get_agent_statuses() method).

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @autonomous_agent_demo.py around lines 173 - 180, The healthcheck loop treats
False as both crashed and finished, so change the loop to query per-agent
statuses (e.g., add or use orchestrator.get_agent_statuses() or similar) and
distinguish terminal "finished" states from "crashed"/"error" states before
printing the completion message; specifically, when running == 0 call
orchestrator.get_agent_statuses() (or introduce that method if it doesn't exist)
and if all agents report a terminal success state print "All agents have
finished", otherwise log an error or "Agents crashed" with the list of failed
agent IDs/names.

- Add max_iterations parameter to parallel agent system (was ignored)
- Improve health check to distinguish completion vs crash
- Check fallback commit return code in worktree creation
- Clarify symlink fallback behavior (env var is the actual solution)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
worktree.py (1)

26-34: Consider validating parent directory accessibility.

The worktrees directory is created in the parent of project_dir (line 34), which could fail if the parent is read-only or if project_dir is at the filesystem root. While uncommon in typical usage, this assumption could cause unexpected failures.

💡 Suggested improvement

Add a validation check or fallback location:

 def __init__(self, project_dir: Path):
     """
     Initialize the worktree manager.

     Args:
         project_dir: The main project directory (must be a git repo)
     """
     self.project_dir = project_dir.resolve()
+    # Validate parent is accessible
+    if not project_dir.parent.exists() or not os.access(project_dir.parent, os.W_OK):
+        raise ValueError(f"Parent directory not writable: {project_dir.parent}")
     self.worktrees_dir = self.project_dir.parent / f".{self.project_dir.name}_worktrees"
parallel_agents.py (1)

189-217: Consider using asyncio subprocess for non-blocking process creation.

The current implementation uses subprocess.Popen (line 190), which blocks the event loop during process creation. While process creation is typically fast, using asyncio.create_subprocess_exec would be more consistent with async/await patterns and prevent any potential blocking.

This would require updating AgentInfo.process type (line 30) and adjusting _stream_output accordingly.

⚡ Suggested improvement
         try:
-            process = subprocess.Popen(
-                cmd,
-                stdout=subprocess.PIPE,
-                stderr=subprocess.STDOUT,
-                cwd=str(worktree_path),
-            )
+            process = await asyncio.create_subprocess_exec(
+                *cmd,
+                stdout=asyncio.subprocess.PIPE,
+                stderr=asyncio.subprocess.STDOUT,
+                cwd=str(worktree_path),
+            )

Note: This also requires updating the AgentInfo dataclass and _stream_output method to work with asyncio.subprocess.Process instead of subprocess.Popen.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c90c769 and 8372a37.

📒 Files selected for processing (5)
  • autonomous_agent_demo.py
  • parallel_agents.py
  • server/routers/parallel_agents.py
  • server/schemas.py
  • worktree.py
🧰 Additional context used
🧬 Code graph analysis (2)
server/schemas.py (1)
server/services/process_manager.py (3)
  • status (87-89)
  • status (92-97)
  • pid (141-143)
parallel_agents.py (3)
worktree.py (4)
  • ensure_git_repo (51-79)
  • create_worktree (85-150)
  • merge_worktree_changes (268-317)
  • cleanup_all_worktrees (319-340)
server/services/process_manager.py (2)
  • pid (141-143)
  • resume (341-362)
server/routers/parallel_agents.py (2)
  • merge_all_worktrees (203-213)
  • get_orchestrator (49-62)

Comment on lines +152 to +195
def _setup_shared_resources(self, worktree_path: Path) -> None:
    """
    Set up shared resources in the worktree.

    The features.db is shared across all worktrees via symlink.
    This is critical for atomic feature claiming via feature_claim_next().

    Raises:
        RuntimeError: If features.db cannot be shared with the worktree.
    """
    # Create symlink to shared features.db
    features_db = self.project_dir / "features.db"
    worktree_db = worktree_path / "features.db"

    if features_db.exists() and not worktree_db.exists():
        try:
            # Try symlink first (preferred)
            worktree_db.symlink_to(features_db)
            logger.debug(f"Created symlink to features.db in {worktree_path}")
        except OSError as e:
            # Symlink failed (common on Windows without admin privileges)
            # This is OK because parallel_agent_runner.py sets PROJECT_DIR
            # environment variable to the main project directory, and the
            # MCP server uses that env var to locate features.db
            logger.info(
                f"Symlink creation failed ({e}), but this is OK - "
                f"MCP server uses PROJECT_DIR env var to access {features_db}"
            )
            # Write a marker file for debugging purposes only
            # (not read by MCP server, just useful for troubleshooting)
            db_path_file = worktree_path / ".features_db_path"
            db_path_file.write_text(str(features_db))
            logger.debug(f"Wrote features.db path marker to {db_path_file}")

    # Copy prompts directory if it exists
    prompts_dir = self.project_dir / "prompts"
    worktree_prompts = worktree_path / "prompts"
    if prompts_dir.exists() and not worktree_prompts.exists():
        try:
            worktree_prompts.symlink_to(prompts_dir)
        except OSError:
            import shutil
            shutil.copytree(prompts_dir, worktree_prompts)


⚠️ Potential issue | 🟡 Minor

Update docstring to match actual behavior.

The docstring (lines 159-160) states the function raises RuntimeError if features.db cannot be shared, but the implementation never raises this exception. Instead, it logs an informational message and writes a marker file when symlink creation fails.

Either update the docstring to reflect the actual behavior or implement the documented exception.

🔧 Suggested fix
 def _setup_shared_resources(self, worktree_path: Path) -> None:
     """
     Set up shared resources in the worktree.

     The features.db is shared across all worktrees via symlink.
     This is critical for atomic feature claiming via feature_claim_next().
-
-    Raises:
-        RuntimeError: If features.db cannot be shared with the worktree.
+    
+    If symlink creation fails (e.g., on Windows without admin privileges),
+    the MCP server uses the PROJECT_DIR environment variable to access
+    the shared features.db from the main project directory.
     """

@Bortus-AI

Bortus-AI commented Jan 7, 2026

We need to address dynamically assigned frontend and backend ports for each agent. Each agent needs its own port numbers, or it will not see its own changes. Each worktree also needs a copy of the database. I'll post some suggestions in a minute.

@Bortus-AI

Added dynamic port allocation to enable parallel agents to run simultaneously without port conflicts. Each agent environment now gets its own dedicated frontend and backend ports based on the AGENT_ID environment variable.

Port Allocation Scheme

Environment           Frontend  Backend
Default (manual dev)  5173      3000
agent-1               5174      3001
agent-2               5175      3002
agent-3               5176      3003

How It Works

  • Ports are calculated dynamically from AGENT_ID environment variable
  • Formula: base_port + agent_number (e.g., agent-2 → 5173+2 = 5175)
  • Both init.sh and vite.config.js use the same logic for consistency
  • Backend port is exported as PORT env var for the Express server

Changes

init.sh

+# Calculate ports based on AGENT_ID
+# Default (no AGENT_ID): frontend 5173, backend 3000
+# agent-1: frontend 5174, backend 3001
+# agent-2: frontend 5175, backend 3002, etc.
+if [ -n "$AGENT_ID" ]; then
+    AGENT_NUM=$(echo "$AGENT_ID" | grep -oE '[0-9]+')
+    FRONTEND_PORT=$((5173 + AGENT_NUM))
+    BACKEND_PORT=$((3000 + AGENT_NUM))
+else
+    FRONTEND_PORT=5173
+    BACKEND_PORT=3000
+fi
+export PORT=$BACKEND_PORT  # For the backend server

 # Start backend server in background
 echo "Starting backend server..."
 cd backend
 npm run dev &

 # Start frontend server
 echo "Starting frontend server..."
 cd frontend
-npm run dev &
+VITE_API_URL=http://localhost:$BACKEND_PORT npm run dev &

 echo "Access the application:"
-echo "  - Frontend: http://localhost:5173"
-echo "  - Backend API: http://localhost:3000"
+echo "  - Frontend: http://localhost:$FRONTEND_PORT"
+echo "  - Backend API: http://localhost:$BACKEND_PORT"

frontend/vite.config.js

 import { defineConfig } from 'vite';
 import react from '@vitejs/plugin-react';
 import { VitePWA } from 'vite-plugin-pwa';

+// Calculate ports based on AGENT_ID env var
+// Default (no AGENT_ID): frontend 5173, backend 3000
+// agent-1: frontend 5174, backend 3001
+// agent-2: frontend 5175, backend 3002, etc.
+function getAgentNum() {
+  const agentId = process.env.AGENT_ID;
+  if (agentId) {
+    const match = agentId.match(/\d+/);
+    if (match) {
+      return parseInt(match[0], 10);
+    }
+  }
+  return 0; // Default (manual development) = 0
+}
+
+function getPort() {
+  const agentNum = getAgentNum();
+  return 5173 + agentNum; // default -> 5173, agent-1 -> 5174, agent-2 -> 5175, etc.
+}
+
+function getBackendPort() {
+  const agentNum = getAgentNum();
+  return 3000 + agentNum; // default -> 3000, agent-1 -> 3001, agent-2 -> 3002, etc.
+}
+
 export default defineConfig({
   plugins: [
     react(),
     VitePWA({
       // ... PWA config unchanged ...
     })
   ],
   server: {
-    port: 5173,
+    port: getPort(),
     proxy: {
       '/api': {
-        target: 'http://localhost:3000',
+        target: `http://localhost:${getBackendPort()}`,
         changeOrigin: true,
       },
     },
   },
 });

Usage

# Manual development (default ports)
./init.sh

# Run as agent-1 (ports 5174/3001)
AGENT_ID=agent-1 ./init.sh

# Run as agent-2 (ports 5175/3002)
AGENT_ID=agent-2 ./init.sh

Testing

- Manual dev without AGENT_ID → ports 5173/3000
- AGENT_ID=agent-1 → ports 5174/3001
- AGENT_ID=agent-2 → ports 5175/3002
- Multiple parallel agents run simultaneously without port conflicts

@Bortus-AI

The other issue I saw with this is related to the database files and worktrees.

When parallel agents are started, each agent's worktree is missing the backend database (backend/database.sqlite). This causes authentication failures and API errors because the backend server cannot find user accounts or other required data.

Root Cause

The WorktreeManager._setup_shared_resources() method only handles:

  • features.db (symlinked for shared feature tracking)
  • prompts/ directory

It does not copy the backend application database, which is an untracked file (in .gitignore) and therefore not included when git creates the worktree.

Impact

  • Agents cannot authenticate users
  • API calls return errors or empty results
  • Backend server may crash on startup

Fix

Added _copy_backend_database() method to automatically copy the backend database to each worktree during creation.

worktree.py

     def _setup_shared_resources(self, worktree_path: Path) -> None:
         """
         Set up shared resources in the worktree.

         The features.db is shared across all worktrees via symlink.
+        The backend database is copied (not shared) so each agent has isolated data.
         """
         # ... existing features.db and prompts handling ...

+        # Copy backend database if it exists (each worktree needs its own copy)
+        self._copy_backend_database(worktree_path)
+
+    def _copy_backend_database(self, worktree_path: Path) -> None:
+        """
+        Copy the backend database to the worktree.
+
+        Each worktree needs its own database copy for isolated backend operation.
+        Checks common database locations (backend/database.sqlite, etc.)
+        """
+        import shutil
+
+        # Common database locations to check
+        db_locations = [
+            "backend/database.sqlite",
+            "database.sqlite",
+            "backend/db.sqlite",
+            "db.sqlite",
+        ]
+
+        for db_rel_path in db_locations:
+            source_db = self.project_dir / db_rel_path
+            if source_db.exists() and source_db.stat().st_size > 0:
+                target_db = worktree_path / db_rel_path
+
+                # Create parent directory if needed
+                target_db.parent.mkdir(parents=True, exist_ok=True)
+
+                if not target_db.exists():
+                    try:
+                        shutil.copy2(source_db, target_db)
+                        logger.info(f"Copied database {db_rel_path} to worktree {worktree_path.name}")
+                    except Exception as e:
+                        logger.warning(f"Failed to copy database {db_rel_path}: {e}")

soekamtiveri referenced this pull request Jan 8, 2026
start_ui.sh and start_ui.bat were using system Python directly,
causing ModuleNotFoundError for dotenv. Now they create and activate
the venv before running, matching the pattern in start.sh.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@shaal
Author

shaal commented Jan 8, 2026

@jpbester Yes, this works on the UI side as well! The PR includes full UI support:

New Components:

UI Features:

  • Real-time status monitoring of all parallel agents
  • Merge button to combine all agent worktrees back to main branch
  • Cleanup button to stop agents and remove worktrees
  • WebSocket updates for live progress tracking

You can enable parallel mode in the UI before starting the agent, then control all agents from the dashboard.

- Add parallel agent mode instructions to coding prompt template
- Update prompt to use feature_claim_next(agent_id) for atomic claiming
- Add worktree-safe auto-merge instructions after each feature
- Increase max agents from 3 to 10 (backend and UI)
- Add colors for agents 6-10 in FeatureCard and ParallelAgentControl
- Fix React component identity bug causing double-click on Parallel button

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Around line 115-118: The fenced code block that starts with the line
"Atomically claim the next feature (prevents race conditions with other agents)"
lacks a language specifier; update its opening triple-backtick to include a
language (e.g., change ``` to ```text, or to ```bash if you intend it as a shell
command) so the code block becomes fenced with a language identifier for proper
syntax highlighting and consistency in the template.

In @parallel_agents.py:
- Around line 189-195: Replace the blocking subprocess.Popen call with
asyncio.create_subprocess_exec to avoid blocking the event loop: call
asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.STDOUT, cwd=str(worktree_path)) and await its creation
to get the async process object (replacing the current process variable). After
creation, perform a short startup verification (e.g., await asyncio.sleep(0) or
attempt one non-blocking read of process.stdout or check process.returncode) and
raise/log an error if the process exits immediately so failures during startup
are detected. Ensure all subsequent interactions use the returned asyncio
subprocess methods (communicate, wait, stdout.readline) instead of sync
subprocess APIs.

In @ui/src/components/ParallelAgentControl.tsx:
- Line 56: Add a persistent yolo_mode boolean to the parallel-agent status shape
and read it from server status rather than a local state: update the schema type
ParallelAgentInfo (server/schemas.py) to include yolo_mode: bool and ensure the
/status endpoint response populates that field from the
ParallelAgentStartRequest handling, add the same property to the TypeScript
ParallelAgentInfo type (ui/src/lib/types.ts), then in ParallelAgentControl.tsx
remove/stop using the local yoloEnabled state (yoloEnabled / setYoloEnabled) and
change the YOLO indicator to read from status.agents[0].yolo_mode (or a shared
yolo flag on the ParallelAgentsStatus object) so the UI reflects server-reported
YOLO mode across remounts.
🧹 Nitpick comments (4)
parallel_agents.py (2)

300-350: Pause/resume risks well-documented, but consider testing thoroughly.

The extensive docstring warnings about SIGSTOP/SIGCONT limitations with asyncio are excellent—they clearly communicate the risks to users (frozen event loop, burst callbacks on resume, potential async operation failures).

Since you've chosen to keep the psutil suspend/resume approach with warnings rather than implementing stop/start semantics, ensure thorough cross-platform testing with realistic workloads (ongoing API requests, database queries, file I/O) to validate behavior under suspension.

Optional: Consider stop/start as a more robust alternative

If pause/resume proves unreliable in production, the stop/start pattern would provide cleaner state transitions without asyncio event loop complications. The trade-off is losing the ability to "pause" mid-execution, but you gain predictable startup/shutdown semantics.
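
A sketch of that alternative, reusing the stop/start entry points discussed in this PR (hedged: the signatures are assumed from the surrounding review, and _notify_status follows the earlier review prompt):

async def pause_agent(self, agent_id: str) -> bool:
    """'Pause' by fully stopping the process; no SIGSTOP, no frozen event loop."""
    stopped = await self.stop_agent(agent_id)
    if stopped:
        self._notify_status(agent_id, "paused")
    return stopped

async def resume_agent(self, agent_id: str) -> bool:
    """'Resume' by spawning a fresh process via the normal start path."""
    started = await self.start_agent(agent_id)
    if started:
        self._notify_status(agent_id, "running")
    return started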


388-395: Consider concurrent merges for better performance with many agents.

The method sequentially merges each agent's worktree, which could be slow when running 10 agents. Since WorktreeManager.merge_worktree_changes runs git commands (which are I/O-bound), consider running merges concurrently:

♻️ Concurrent merge pattern
 async def merge_all_worktrees(self) -> dict[str, bool]:
     """Merge changes from all agent worktrees back to main."""
-    results = {}
-    for agent_id in self.agents:
-        success = self.worktree_manager.merge_worktree_changes(agent_id)
-        results[agent_id] = success
-    return results
+    loop = asyncio.get_running_loop()
+    
+    async def merge_one(agent_id: str) -> tuple[str, bool]:
+        success = await loop.run_in_executor(
+            None, 
+            self.worktree_manager.merge_worktree_changes, 
+            agent_id
+        )
+        return agent_id, success
+    
+    merge_results = await asyncio.gather(
+        *[merge_one(aid) for aid in self.agents],
+        return_exceptions=True
+    )
+    
+    results = {}
+    for item in merge_results:
+        if isinstance(item, Exception):
+            logger.error(f"Merge failed: {item}")
+        else:
+            agent_id, success = item
+            results[agent_id] = success
+    
+    return results
ui/src/components/ParallelAgentControl.tsx (2)

131-131: Improve responsive grid layout.

The agent status grid uses grid-cols-5 sm:grid-cols-5, which applies 5 columns at both mobile and desktop breakpoints. On smaller screens, 5 columns may be cramped for the agent badges.

📱 Suggested responsive improvement
-        <div className="grid grid-cols-5 sm:grid-cols-5 gap-2">
+        <div className="grid grid-cols-3 sm:grid-cols-5 gap-2">

This provides 3 columns on mobile and 5 on larger screens for better readability.


236-247: Consider adding text label to Cleanup button for consistency.

The Stop and Merge buttons include both icon and text labels (lines 221, 234), but the Cleanup button (line 246) has only an icon. While the title attribute provides a tooltip, a visible text label improves accessibility and consistency.

♿ Suggested improvement
             <button
               onClick={handleCleanup}
               disabled={isLoading}
-              className="neo-btn neo-btn-secondary text-sm py-2 px-3 flex items-center gap-1"
+              className="neo-btn neo-btn-secondary text-sm py-2 px-3 flex items-center gap-1 flex-1"
               title="Stop All & Cleanup Worktrees"
             >
               {cleanupAgents.isPending ? (
                 <Loader2 size={14} className="animate-spin" />
               ) : (
                 <Trash2 size={14} />
               )}
+              <span>Cleanup</span>
             </button>

This makes all three action buttons visually consistent and improves accessibility.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8372a37 and 8cf2029.

📒 Files selected for processing (5)
  • .claude/templates/coding_prompt.template.md
  • parallel_agents.py
  • ui/src/App.tsx
  • ui/src/components/FeatureCard.tsx
  • ui/src/components/ParallelAgentControl.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
  • ui/src/App.tsx
  • ui/src/components/FeatureCard.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
parallel_agents.py (3)
worktree.py (4)
  • ensure_git_repo (51-79)
  • create_worktree (85-150)
  • merge_worktree_changes (268-317)
  • cleanup_all_worktrees (319-340)
server/services/process_manager.py (2)
  • pid (141-143)
  • resume (341-362)
server/routers/parallel_agents.py (2)
  • merge_all_worktrees (203-213)
  • get_orchestrator (49-62)
🪛 markdownlint-cli2 (0.18.1)
.claude/templates/coding_prompt.template.md

115-115: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (7)
parallel_agents.py (1)

128-142: Concurrent startup implementation looks good!

The agents now start concurrently using asyncio.gather() with proper exception handling. This addresses the previous review feedback and will significantly improve startup time when launching multiple agents.

.claude/templates/coding_prompt.template.md (3)

6-19: Clear parallel agent mode documentation.

The introduction effectively explains the parallel agent concept and key requirements. The emphasis on atomic claiming to prevent race conditions is particularly important for users to understand.


113-122: Atomic claiming instructions are clear.

The distinction between parallel and single-agent modes is well-explained, and noting that feature_claim_next eliminates the need for a separate feature_mark_in_progress call prevents potential confusion.


324-325: Verify git-common-dir command works across worktree configurations.

The command cd "$(git rev-parse --git-common-dir)/.." assumes a specific directory structure to reach the main worktree. This may not work correctly in all worktree setups (e.g., bare repositories, worktrees created in non-standard locations).

Consider testing this command in various scenarios or simplifying to a more robust approach:

#!/bin/bash
# Test the main branch detection in different worktree scenarios

# Create test repo with worktrees in different configurations
mkdir -p /tmp/test-worktree-detection
cd /tmp/test-worktree-detection

# Scenario 1: Standard worktree
git init main-repo
cd main-repo
git commit --allow-empty -m "Initial"
git worktree add ../worktree-1 -b test-branch

# Test the command from worktree
cd ../worktree-1
echo "Testing from standard worktree:"
MAIN_BRANCH=$(cd "$(git rev-parse --git-common-dir)/.." && git branch --show-current 2>/dev/null || echo "main")
echo "Result: $MAIN_BRANCH"

# Scenario 2: Nested worktree
cd ../main-repo
mkdir -p nested
git worktree add nested/worktree-2 -b test-branch-2

cd nested/worktree-2
echo -e "\nTesting from nested worktree:"
MAIN_BRANCH=$(cd "$(git rev-parse --git-common-dir)/.." && git branch --show-current 2>/dev/null || echo "main")
echo "Result: $MAIN_BRANCH"

# Cleanup
cd /tmp
rm -rf test-worktree-detection

Alternative: Use git rev-parse --show-toplevel from the main worktree or query the main worktree's current branch directly via git worktree list.

ui/src/components/ParallelAgentControl.tsx (3)

1-49: LGTM: Imports, constants, and color helper are well-structured.

The AGENT_COLORS array correctly supports up to 10 agents (matching the max from PR objectives), and getAgentColor safely extracts the agent number with proper bounds checking via modulo.


51-109: LGTM: State management and handlers are well-structured.

The conditional status fetching (only when isParallelMode is true) is efficient, and the aggregated isLoading state correctly prevents concurrent mutations.


252-291: LGTM: Loading state and badge component are well-implemented.

The AgentStatusBadge provides clear visual feedback with color-coded borders, status dots, and conditional animation for running agents. The tooltip includes helpful debug information (worktree path).

Comment on lines +115 to +118
```
# Atomically claim the next feature (prevents race conditions with other agents)
Use the feature_claim_next tool with agent_id=$AGENT_ID
```

⚠️ Potential issue | 🟡 Minor

Add language specifier to code block for consistency.

The fenced code block is missing a language identifier, which reduces syntax highlighting and readability.

📝 Proposed fix
-```
+```text
 # Atomically claim the next feature (prevents race conditions with other agents)
 Use the feature_claim_next tool with agent_id=$AGENT_ID

Or use `bash` if these are meant to be shell commands, though "Use the feature_claim_next tool" suggests these are instructions rather than executable commands.

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

115-115: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🤖 Prompt for AI Agents

In @.claude/templates/coding_prompt.template.md around lines 115 - 118, the fenced code block that starts with the line "Atomically claim the next feature (prevents race conditions with other agents)" lacks a language specifier; update its opening triple-backtick to include a language (e.g., change ``` to ```text, or to ```bash if you intend it as a shell command) so the code block becomes fenced with a language identifier for proper syntax highlighting and consistency in the template.


Comment on lines +189 to +195
```python
    try:
        process = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            cwd=str(worktree_path),
        )
```

⚠️ Potential issue | 🟠 Major

Blocking subprocess creation in async function.

Using `subprocess.Popen` blocks the event loop during process creation. The previous review recommended switching to `asyncio.create_subprocess_exec` for truly async operation and adding startup verification.

🤖 Prompt for AI Agents
In @parallel_agents.py around lines 189 - 195, Replace the blocking
subprocess.Popen call with asyncio.create_subprocess_exec to avoid blocking the
event loop: call asyncio.create_subprocess_exec(*cmd,
stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.STDOUT,
cwd=str(worktree_path)) and await its creation to get the async process object
(replacing the current process variable). After creation, perform a short
startup verification (e.g., await asyncio.sleep(0) or attempt one non-blocking
read of process.stdout or check process.returncode) and raise/log an error if
the process exits immediately so failures during startup are detected. Ensure
all subsequent interactions use the returned asyncio subprocess methods
(communicate, wait, stdout.readline) instead of sync subprocess APIs.
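
A minimal sketch of the suggested replacement (the helper name and the one-second grace period are illustrative assumptions; real startup verification would need to fit the orchestrator's logging and retry policy):

```python
import asyncio

async def start_agent_process(cmd: list[str], worktree_path) -> asyncio.subprocess.Process:
    """Spawn an agent subprocess without blocking the event loop."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
        cwd=str(worktree_path),
    )
    # Brief grace period: if the process has already exited, surface the
    # failure now rather than letting the orchestrator track a dead agent.
    await asyncio.sleep(1.0)
    if process.returncode is not None:
        output = await process.stdout.read()
        raise RuntimeError(
            f"Agent exited during startup (code {process.returncode}): "
            f"{output.decode(errors='replace')[:500]}"
        )
    return process
```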

```tsx
  onModeChange,
}: ParallelAgentControlProps) {
  const [numAgents, setNumAgents] = useState(2)
  const [yoloEnabled, setYoloEnabled] = useState(false)
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```bash
# Check if ParallelAgentInfo type or API response includes yolo_mode tracking
rg -n --type=ts -C3 'interface ParallelAgentInfo|type ParallelAgentInfo'
rg -n --type=py -C3 'class ParallelAgentInfo|ParallelAgentInfo\('
```

Repository: leonvanzyl/autocoder

Length of output: 1350


🏁 Script executed:

```bash
# Find the ParallelAgentControl.tsx file and read it
fd ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 106


🏁 Script executed:

```bash
# Read the full ParallelAgentInfo interface definition in types.ts
sed -n '100,130p' ui/src/lib/types.ts
```

Repository: leonvanzyl/autocoder

Length of output: 660


🏁 Script executed:

```bash
# Read the full ParallelAgentInfo class definition in server/schemas.py
sed -n '132,160p' server/schemas.py
```

Repository: leonvanzyl/autocoder

Length of output: 983


🏁 Script executed:

```bash
# Read ParallelAgentControl.tsx - first get file size
wc -l ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 110


🏁 Script executed:

```bash
# Read the component structure and line 56 area
sed -n '1,80p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 2140


🏁 Script executed:

```bash
# Read lines 150-180 to see what's at 163-171
sed -n '150,180p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 1225


🏁 Script executed:

```bash
# Read lines 190-210 to see the YOLO indicator at 200-205
sed -n '190,210p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 969


🏁 Script executed:

```bash
# Search for handleStart function and how yoloEnabled is used
sed -n '80,150p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 2224


🏁 Script executed:

```bash
# Search for all uses of yoloEnabled in the component
grep -n "yoloEnabled\|handleStart" ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 513


🏁 Script executed:

```bash
# Check the useStartParallelAgents hook to see what parameters it accepts
fd useProjects -type f
```

Repository: leonvanzyl/autocoder

Length of output: 234


🏁 Script executed:

```bash
# Find the useProjects hook file
fd "useProjects" ui/src/hooks/
```

Repository: leonvanzyl/autocoder

Length of output: 91


🏁 Script executed:

```bash
# Search for the API endpoints that handle parallel agents
grep -rn "yoloMode\|yolo_mode" ui/src/ server/
```

Repository: leonvanzyl/autocoder

Length of output: 3145


🏁 Script executed:

```bash
# Check ParallelAgentStart and ParallelAgentStartResponse schemas
sed -n '100,145p' server/schemas.py
```

Repository: leonvanzyl/autocoder

Length of output: 1591


🏁 Script executed:

```bash
# Check how parallel agents endpoint returns the status
sed -n '60,110p' server/routers/parallel_agents.py
```

Repository: leonvanzyl/autocoder

Length of output: 1715


🏁 Script executed:

```bash
# Check the AgentStatus type in types.ts to see its yolo_mode field
grep -B5 -A5 'yolo_mode: boolean' ui/src/lib/types.ts
```

Repository: leonvanzyl/autocoder

Length of output: 278


Add yolo_mode to parallel agent status tracking.

The yoloEnabled state is local (line 56, defaults to false). When agents are started with YOLO mode enabled and the component unmounts/remounts (e.g., navigation or parent re-render), yoloEnabled resets to false, causing the YOLO indicator (lines 200-205) to disappear even though agents are still running in YOLO mode.

Unlike single agents (AgentStatus includes yolo_mode), ParallelAgentInfo lacks a yolo_mode field despite the server accepting and processing yolo_mode in ParallelAgentStartRequest. The YOLO indicator should read from the server status response instead of local state. Add yolo_mode: bool to ParallelAgentInfo in both server/schemas.py and ui/src/lib/types.ts, include it in the /status endpoint response, and update the YOLO indicator to use status.agents[0].yolo_mode or track a shared YOLO flag in ParallelAgentsStatus.

🤖 Prompt for AI Agents
In @ui/src/components/ParallelAgentControl.tsx at line 56, Add a persistent
yolo_mode boolean to the parallel-agent status shape and read it from server
status rather than a local state: update the schema type ParallelAgentInfo
(server/schemas.py) to include yolo_mode: bool and ensure the /status endpoint
response populates that field from the ParallelAgentStartRequest handling, add
the same property to the TypeScript ParallelAgentInfo type
(ui/src/lib/types.ts), then in ParallelAgentControl.tsx remove/stop using the
local yoloEnabled state (yoloEnabled / setYoloEnabled) and change the YOLO
indicator to read from status.agents[0].yolo_mode (or a shared yolo flag on the
ParallelAgentsStatus object) so the UI reflects server-reported YOLO mode across
remounts.
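
On the server side, the suggested change might look roughly like this (a sketch only: every field except `yolo_mode` is an illustrative placeholder for the model's existing fields), with a matching `yolo_mode: boolean` added to the TypeScript `ParallelAgentInfo`:

```python
from pydantic import BaseModel

class ParallelAgentInfo(BaseModel):
    """Abridged sketch of the per-agent status model in server/schemas.py."""
    agent_id: int            # placeholder for an existing field
    status: str              # placeholder for an existing field
    worktree_path: str       # placeholder for an existing field
    yolo_mode: bool = False  # proposed: echoed from the start request so the
                             # UI can re-read it after a remount
```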

…arallel agents

- Fix assigned_agent_id not being returned in features API response
- Add pause/resume endpoints for parallel agents (/parallel-agents/pause, /resume)
- MCP server auto-claims features when AGENT_ID env var is set
- UI shows which agent is working on each feature with colored badges
- ParallelAgentControl panel now includes status indicator, YOLO toggle,
  and pause/resume/stop controls

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mcp_server/feature_mcp.py (1)

337-356: TOCTOU race condition in agent assignment check.

The conflict check (lines 346-350) reads feature.assigned_agent_id at line 337, then updates it at lines 353-354 without row locking. If two agents call this simultaneously with different agent_id values on the same feature, both could pass the check and overwrite each other's assignment.

🔒 Proposed fix using row locking
```diff
     session = get_session()
     try:
-        feature = session.query(Feature).filter(Feature.id == feature_id).first()
+        feature = session.query(Feature).filter(Feature.id == feature_id).with_for_update().first()

         if feature is None:
             return json.dumps({"error": f"Feature with ID {feature_id} not found"})
```
🤖 Fix all issues with AI agents
In @mcp_server/feature_mcp.py:
- Around line 185-190: The auto-claim block sets feature.in_progress and
assigned_agent_id without acquiring a row lock, allowing concurrent agents to
read and claim the same Feature; modify the query used to fetch the Feature (the
`query` that returns `feature`) to call `with_for_update()` before `.first()` so
the DB row is locked for update (e.g., use
`query.order_by(...).with_for_update().first()`), then proceed to set
`feature.in_progress`, `feature.assigned_agent_id`, `session.commit()` and
`session.refresh(feature)` as before.
🧹 Nitpick comments (1)
mcp_server/feature_mcp.py (1)

465-479: Consider adding row locking for consistency.

While the ownership verification (lines 471-474) is correct, it's vulnerable to the same TOCTOU race condition as feature_mark_in_progress. For consistency and to prevent unexpected behavior in high-concurrency scenarios, consider adding with_for_update() to the query at line 465.

🔒 Proposed enhancement
```diff
     session = get_session()
     try:
-        feature = session.query(Feature).filter(Feature.id == feature_id).first()
+        feature = session.query(Feature).filter(Feature.id == feature_id).with_for_update().first()

         if feature is None:
```
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8cf2029 and c9ed960.

📒 Files selected for processing (6)
  • mcp_server/feature_mcp.py
  • server/routers/features.py
  • server/routers/parallel_agents.py
  • ui/src/components/ParallelAgentControl.tsx
  • ui/src/hooks/useProjects.ts
  • ui/src/lib/api.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • ui/src/hooks/useProjects.ts
🧰 Additional context used
🧬 Code graph analysis (3)
server/routers/parallel_agents.py (5)
server/schemas.py (4)
  • ParallelAgentStartRequest (128-132)
  • ParallelAgentsStatus (144-148)
  • ParallelAgentInfo (135-141)
  • ParallelAgentActionResponse (151-155)
server/routers/features.py (2)
  • _get_project_path (28-36)
  • validate_project_name (57-64)
registry.py (1)
  • get_project_path (195-213)
parallel_agents.py (10)
  • get_orchestrator (406-417)
  • healthcheck (369-386)
  • start_agents (101-142)
  • stop_all_agents (290-298)
  • pause_agent (300-325)
  • resume_agent (327-349)
  • start_agent (144-217)
  • stop_agent (251-288)
  • merge_all_worktrees (388-394)
  • cleanup (396-399)
server/services/process_manager.py (1)
  • pid (141-143)
ui/src/lib/api.ts (1)
server/schemas.py (2)
  • ParallelAgentsStatus (144-148)
  • ParallelAgentActionResponse (151-155)
mcp_server/feature_mcp.py (2)
api/database.py (2)
  • Feature (19-46)
  • to_dict (34-46)
ui/src/lib/types.ts (1)
  • Feature (60-67)
🔇 Additional comments (7)
server/routers/features.py (1)

93-93: LGTM! Correctly exposes agent assignment in API responses.

The addition of assigned_agent_id to the feature response enables the UI to display which agent is working on each feature, supporting the parallel agents feature. The field will be None for unassigned features, which FastAPI serializes as null automatically.

mcp_server/feature_mcp.py (2)

231-262: LGTM!

The function correctly clears both in_progress and assigned_agent_id when marking a feature as passing, ensuring clean state transitions.


395-444: Well-implemented atomic feature claiming!

The use of with_for_update() (line 425) provides proper row-level locking, and the filter logic correctly ensures a feature is available by checking both in_progress and assigned_agent_id with AND logic. The exception handling with rollback (lines 439-441) is also appropriate.
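
For readers unfamiliar with the idiom, the claim described here reduces to roughly the following (a sketch reconstructed from details quoted in this review; the import path and function shape are assumptions, not the PR's verbatim code):

```python
from api.database import Feature, get_session  # import path illustrative

def claim_next_feature(agent_id: str):
    """Atomically claim the highest-priority unclaimed feature.

    with_for_update() emits SELECT ... FOR UPDATE, so a competing
    transaction running the same query blocks until this one commits,
    after which the row no longer matches the filter.
    """
    session = get_session()
    try:
        feature = (
            session.query(Feature)
            .filter(Feature.in_progress.is_(False),
                    Feature.assigned_agent_id.is_(None))  # AND, not OR
            .order_by(Feature.priority.asc(), Feature.id.asc())
            .with_for_update()
            .first()
        )
        if feature is None:
            return None  # nothing left to claim
        feature.in_progress = True
        feature.assigned_agent_id = agent_id
        session.commit()
        return feature
    except Exception:
        session.rollback()
        raise
```

One caveat worth knowing: on SQLite, with_for_update() is silently ignored, so atomicity there rests on SQLite's database-level write lock serializing (or rejecting) the competing transaction rather than on row locks.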

server/routers/parallel_agents.py (1)

65-274: LGTM! Endpoint implementations are consistent and well-structured.

All endpoints follow a consistent pattern: get orchestrator, call orchestrator method, return appropriate response model. Error handling is properly delegated to the orchestrator methods, and response models match the defined schemas.

ui/src/components/ParallelAgentControl.tsx (2)

115-309: Well-structured conditional rendering and UI controls!

The component properly handles different states (collapsed, no agents, running agents, paused agents) with appropriate controls for each. The agent count bounds (1-10), loading states, and status indicators are all correctly implemented.


311-383: Subcomponents are clean and focused.

Both StatusIndicator and AgentStatusBadge are well-implemented with clear responsibilities. The color mapping with getAgentColor ensures visual consistency across the UI.

ui/src/lib/api.ts (1)

152-211: LGTM! Parallel agent API functions are well-implemented.

All functions follow the established pattern, properly encode parameters, use correct HTTP methods, and match server-side schemas. The snake_case conversion for request bodies (num_agents, yolo_mode) correctly aligns with Python backend conventions.

Comment on lines +185 to +190
```python
        # In parallel mode, automatically claim the feature to prevent race conditions
        if agent_id and not feature.in_progress:
            feature.in_progress = True
            feature.assigned_agent_id = agent_id
            session.commit()
            session.refresh(feature)
```

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

```bash
# First, search for calls to feature_get_next vs feature_claim_next
rg -n 'feature_get_next|feature_claim_next' --type py -C2 -g '!mcp_server/feature_mcp.py'
```

Repository: leonvanzyl/autocoder

Length of output: 1765


🏁 Script executed:

```bash
# Examine the feature_mcp.py file around the auto-claim logic
cat -n mcp_server/feature_mcp.py | sed -n '140,200p'
```

Repository: leonvanzyl/autocoder

Length of output: 2832


🏁 Script executed:

```bash
# Search for with_for_update usage in the file to see locking patterns
rg -n 'with_for_update' mcp_server/feature_mcp.py -B2 -A2
```

Repository: leonvanzyl/autocoder

Length of output: 239


🏁 Script executed:

```bash
# Check if feature_claim_next also uses auto-claiming or has explicit locking
rg -n 'def feature_claim_next' mcp_server/feature_mcp.py -A30
```

Repository: leonvanzyl/autocoder

Length of output: 1418


Add row-level locking to prevent race condition in auto-claim logic.

The auto-claim at lines 185-190 lacks row-level locking. The query at line 180 retrieves the feature without with_for_update(), creating a race condition window: two agents can read the same feature before either commits, causing both to claim it simultaneously.

The docstring (lines 154-155) states this auto-claim "prevents race conditions," but the implementation does not. In contrast, feature_claim_next (line 425) correctly uses with_for_update() before the .first() call to achieve atomicity.

Add with_for_update() to line 180 before the .first() call:

Suggested fix
```python
feature = query.order_by(Feature.priority.asc(), Feature.id.asc()).with_for_update().first()
```
🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 185 - 190, The auto-claim block sets
feature.in_progress and assigned_agent_id without acquiring a row lock, allowing
concurrent agents to read and claim the same Feature; modify the query used to
fetch the Feature (the `query` that returns `feature`) to call
`with_for_update()` before `.first()` so the DB row is locked for update (e.g.,
use `query.order_by(...).with_for_update().first()`), then proceed to set
`feature.in_progress`, `feature.assigned_agent_id`, `session.commit()` and
`session.refresh(feature)` as before.
