feat: Add parallel subagent execution support #6
base: master
Conversation
Implements the ability to run multiple Claude agents in parallel to speed up project completion (GitHub issue leonvanzyl#5).

## Key Features
- **Git worktrees**: Each agent gets an isolated working directory
- **Atomic task claiming**: Prevents race conditions between agents
- **Agent assignment tracking**: Features show which agent is working on them
- **Configurable parallelism**: `--num-agents` flag for CLI, API for UI

## New Files
- worktree.py: Git worktree management for agent isolation
- parallel_agents.py: Orchestrator for managing multiple agents
- parallel_agent_runner.py: Individual agent process runner
- server/routers/parallel_agents.py: REST API for parallel agent control

## Changes
- Database schema: Added assigned_agent_id column to features
- MCP server: New atomic feature_claim_next() tool
- UI: Agent badges on feature cards
- CLI: --num-agents option for parallel execution

Closes leonvanzyl#5

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for managing parallel agents in the UI, enhancing project execution capabilities. Key changes include:
- Introduced `ParallelAgentControl` component for toggling parallel mode.
- Updated `App` component to conditionally render agent controls based on parallel mode.
- Added hooks for querying and mutating parallel agent statuses in `useProjects`.
- Implemented API functions for starting, stopping, merging, and cleaning up parallel agents.

This update improves the user experience by allowing simultaneous agent operations, streamlining project workflows.
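As a quick orientation for reviewers, here is a minimal sketch of the atomic-claiming idea the PR describes, assuming the SQLAlchemy `Feature` model from api/database.py and a backend that honors `SELECT ... FOR UPDATE`; the function body is illustrative, not the PR's exact code.

```python
from api.database import Feature  # assumed: model with passes/in_progress/assigned_agent_id

def claim_next_feature(session, agent_id: str):
    """Atomically claim the next available feature for one agent."""
    feature = (
        session.query(Feature)
        .filter(Feature.passes == False)
        .filter(Feature.in_progress == False)
        .order_by(Feature.priority.asc(), Feature.id.asc())
        .with_for_update()  # row lock: two agents cannot claim the same row
        .first()
    )
    if feature is None:
        return None  # nothing left to claim
    feature.in_progress = True
    feature.assigned_agent_id = agent_id
    session.commit()  # commit releases the lock with the claim recorded
    return feature
```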
nice! Look forward to trying this

will this work on the UI side as well?
Resolved conflicts in:
- server/main.py: Combined cleaner multi-line import format with parallel_agents_router
- server/routers/__init__.py: Added parallel_agents_router import in alphabetical order

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 Walkthrough
Adds first-class parallel agents: per-agent git worktrees, an orchestrator and runner, agent_id propagation through client/MCP, and a new assigned_agent_id DB column.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant User
participant API as Parallel Agents API
participant Orch as ParallelAgentOrchestrator
participant WT as WorktreeManager
participant Runner as parallel_agent_runner (subproc)
participant Agent as run_autonomous_agent
participant MCP as Feature MCP
participant DB as Database
User->>API: POST /start (project, num_agents)
API->>Orch: start_agents(num_agents, yolo_mode, model)
Orch->>WT: create_worktree(agent_id)
WT-->>Orch: worktree_path
Orch->>Runner: spawn subprocess (agent_id, worktree_path)
Runner->>Agent: start run_autonomous_agent(..., agent_id)
loop Per-agent work loop
Agent->>MCP: feature_claim_next(agent_id)
MCP->>DB: atomic SELECT/UPDATE (set in_progress, assigned_agent_id)
DB-->>MCP: claimed feature
MCP-->>Agent: feature payload (includes assigned_agent_id)
Agent->>WT: make commits in worktree
Agent->>MCP: feature_mark_passing(feature_id)
MCP->>DB: clear assigned_agent_id, mark passed
end
User->>API: POST /merge
API->>Orch: merge_all_worktrees()
Orch->>WT: merge_worktree_changes(agent_id) for each agent
WT-->>Orch: merge results
Orch-->>API: summary
API-->>User: response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In @mcp_server/feature_mcp.py:
- Around line 166-171: The filter on Feature.assigned_agent_id inside the
query.filter call should not include the empty-string check; remove the
(Feature.assigned_agent_id == "") clause so the filter only checks for None or
the matching agent_id. Update both places where this pattern appears (the
query.filter that currently uses (Feature.assigned_agent_id == None) |
(Feature.assigned_agent_id == "") | (Feature.assigned_agent_id == agent_id)) to
instead use (Feature.assigned_agent_id == None) | (Feature.assigned_agent_id ==
agent_id) to match the codebase’s use of None when clearing assignments.
- Around line 400-412: The claim filter uses OR which allows a row with
in_progress=True and assigned_agent_id==None to be claimed; change the predicate
so in_progress must be False AND the assigned_agent_id must be either None/empty
or equal to agent_id. Update the .filter block that references
Feature.in_progress and Feature.assigned_agent_id (and uses agent_id) to require
Feature.in_progress == False AND (Feature.assigned_agent_id is None OR == "" OR
== agent_id) while keeping the outer passes == False, ordering,
with_for_update(), and .first() intact.
In @parallel_agents.py:
- Around line 101-132: start_agents is launching agents sequentially by awaiting
start_agent inside the for loop; change it to start all agent startups
concurrently by creating tasks for each start_agent call (using
asyncio.create_task or a comprehension) keyed by generate_agent_id, then await
them together with asyncio.gather (with return_exceptions=True) and populate
results mapping agent_id to boolean success while converting exceptions to
False; keep the prior git repo check and ensure any per-task exceptions are
logged/handled rather than letting one failure block others.
- Around line 285-315: pause_agent and resume_agent currently call
psutil.Process.suspend()/resume() (psutil.Process.suspend,
psutil.Process.resume) which uses SIGSTOP/SIGCONT and will freeze the entire
asyncio event loop and cause unpredictable behavior; replace these methods with
stop/start semantics instead: remove calls to suspend()/resume(), implement
pause_agent to perform a graceful shutdown/stop of the agent process (call the
existing stop logic or a new stop_agent/_terminate_agent flow that closes API
clients, cancels outstanding async tasks, and waits for process termination),
and implement resume_agent as start_agent (or spawn a fresh process via your
existing start logic) that re-initializes the agent state and then calls
_notify_status(agent_id, "running"); if true pause/resume semantics must be
kept, add explicit documentation/comments in pause_agent/resume_agent about
platform-specific asyncio risks and add extensive tests/recovery logic for
interrupted async operations instead of using psutil suspend/resume.
In @server/routers/parallel_agents.py:
- Around line 150-155: The return builds the status by indexing
orchestrator.agents and then accessing .status on a fallback {}, which raises
AttributeError; change it to safely fetch the agent object and then read its
status with a default (e.g., use agent = orchestrator.agents.get(agent_id) and
then status = getattr(agent, "status", "unknown") or similar), and return that
status under the "status" key so missing agents yield "unknown" without raising.
In @ui/tsconfig.tsbuildinfo:
- Line 1: The tsconfig.tsbuildinfo file is a generated TypeScript build artifact
that should not be committed; remove tsconfig.tsbuildinfo from version control
(unstage/delete tracked file) and add the pattern *.tsbuildinfo to the
.gitignore under the "Build outputs" section alongside ui/dist/ and .vite/ so
future builds are ignored by VCS.
In @worktree.py:
- Around line 68-77: The git init code currently always returns True even if
"git add" or "git commit" fail; update the initialization in the method that
calls self._run_git for "add" and "commit" so that if either command returns
non-zero it logs the stderr and returns False (or raises an exception) instead
of True; specifically, change the block that runs self._run_git("add", "-A") and
self._run_git("commit", "-m", "Initial commit for parallel agents",
"--allow-empty") to check result.returncode and immediately return False when a
failure occurs (include result.stderr in the logger.warning/error), and ensure
create_worktree checks the initializer's return value before relying on HEAD.
- Around line 147-175: The _setup_shared_resources function currently swallows
symlink failures for features.db which breaks shared DB coordination required by
feature_claim_next(agent_id); change the fallback to enforce accessibility: in
the except OSError block for worktree_db.symlink_to(features_db) either
(preferable) raise an exception to abort worktree creation with a clear error
mentioning features_db/worktree_db and MCP server configuration, or
(alternative) write the absolute path of features_db into the worktree MCP
server config and validate that the new worktree process can open the DB before
returning; update the error message/log to be explicit and reference
features_db, worktree_db and MCP server so callers know to use the main project
path when symlinks aren’t supported.
- Around line 248-285: merge_worktree_changes currently merges the agent branch
into whatever branch is checked out (using _run_git("merge", branch)) which can
merge into a feature branch; update merge_worktree_changes to explicitly
checkout the intended target branch (use the computed main_branch from
_run_git("branch", "--show-current") or fallback "main") in the main repo via
_run_git("checkout", main_branch) before running _run_git("merge", branch), and
ensure you pass the correct cwd for main repo operations (not worktree_path)
when checking out and merging; additionally, when a merge conflict occurs
(result.returncode != 0) keep the merge aborted but leave a clear return path:
call _run_git("merge", "--abort") for cleanup, log a user-facing message that
the worktree has conflicts and must be resolved manually (include
result.stderr), and do not silently leave the worktree in an uncommitted state —
optionally return False so caller can prompt conflict resolution.
🧹 Nitpick comments (13)
ui/src/App.tsx (1)
165-182: Consider using controlled component pattern for mode state.

The current implementation works correctly: `ParallelAgentControl` maintains internal `isParallelMode` state and notifies the parent via `onModeChange`, while the parent's `parallelMode` controls visibility of `AgentControl`. The states stay synchronized through the callback pattern.

However, for better predictability and maintenance, consider making `ParallelAgentControl` a fully controlled component by passing `parallelMode` as a prop instead of having it manage internal state. This would eliminate potential synchronization issues and make the data flow more explicit.

Additionally, you may want to consider whether the parallel mode preference should persist when users switch projects, or if each project should start in single-agent mode.
♻️ Optional refactoring to controlled component pattern
In ParallelAgentControl.tsx, accept `isParallelMode` as a prop:

```tsx
interface ParallelAgentControlProps {
  projectName: string
  isParallelMode: boolean // Add this prop
  onModeChange?: (mode: boolean) => void
}

export function ParallelAgentControl({
  projectName,
  isParallelMode, // Receive from parent instead of internal state
  onModeChange,
}: ParallelAgentControlProps) {
  // Remove: const [isParallelMode, setIsParallelMode] = useState(false)

  const handleToggleMode = () => {
    onModeChange?.(!isParallelMode) // Just notify parent
  }
  // ... rest of component
}
```

In App.tsx, pass the state as a prop:

```diff
 <ParallelAgentControl
   projectName={selectedProject}
+  isParallelMode={parallelMode}
   onModeChange={setParallelMode}
 />
```

ui/src/components/FeatureCard.tsx (1)
30-48: Extract shared agent color logic to avoid duplication.

The `getAgentColor` function and `AGENT_COLORS` array are duplicated in both `FeatureCard.tsx` and `ParallelAgentControl.tsx` (lines 28-44). Consider extracting this logic to a shared utility file (e.g., `ui/src/lib/agentColors.ts`) to maintain consistency and reduce maintenance burden.

♻️ Suggested refactor
Create a new file `ui/src/lib/agentColors.ts`:

```ts
const AGENT_COLORS = [
  '#00b4d8', // cyan - agent-1
  '#70e000', // green - agent-2
  '#8338ec', // purple - agent-3
  '#ff5400', // orange - agent-4
  '#ff006e', // pink - agent-5
]

export function getAgentColor(agentId: string): string {
  const match = agentId.match(/\d+/)
  if (match) {
    const num = parseInt(match[0], 10) - 1
    return AGENT_COLORS[num % AGENT_COLORS.length]
  }
  return AGENT_COLORS[0]
}
```

Then import and use it in both components:

```diff
-// Generate consistent color for agent ID
-function getAgentColor(agentId: string): string {
-  const colors = [
-    '#00b4d8', // cyan - agent-1
-    '#70e000', // green - agent-2
-    '#8338ec', // purple - agent-3
-    '#ff5400', // orange - agent-4
-    '#ff006e', // pink - agent-5
-  ]
-
-  // Extract number from agent-N format
-  const match = agentId.match(/\d+/)
-  if (match) {
-    const num = parseInt(match[0], 10) - 1
-    return colors[num % colors.length]
-  }
-
-  return colors[0]
-}
+import { getAgentColor } from '../lib/agentColors'
```

parallel_agent_runner.py (1)
79-100: Consider optional input validation improvements.

The script could benefit from additional validation:
- Verify `agent_id` matches expected format (e.g., `agent-\d+`)
- Confirm `worktree_dir` is actually a git worktree (e.g., check for `.git` file pointing to parent)

These validations would catch configuration errors earlier and provide clearer error messages.

💡 Optional validation enhancements

```diff
 def main() -> None:
     """Main entry point."""
     args = parse_args()

     project_dir = Path(args.project_dir).resolve()
     worktree_dir = Path(args.worktree_dir).resolve()

     if not project_dir.exists():
         print(f"Error: Project directory does not exist: {project_dir}")
         return

     if not worktree_dir.exists():
         print(f"Error: Worktree directory does not exist: {worktree_dir}")
         return
+
+    # Validate agent_id format
+    import re
+    if not re.match(r'^agent-\d+$', args.agent_id):
+        print(f"Error: Invalid agent_id format: {args.agent_id} (expected: agent-N)")
+        return
+
+    # Verify worktree_dir is a git worktree
+    git_file = worktree_dir / '.git'
+    if not git_file.exists() or not git_file.is_file():
+        print(f"Warning: {worktree_dir} may not be a git worktree")

     print(f"[{args.agent_id}] Starting agent")
```

api/database.py (1)
78-91: Consider adding transaction safety to migration.

The migration function lacks explicit error handling and transaction rollback. While adding a nullable column is low-risk, wrapping the migration in a try/except block would prevent partial failures from leaving the database in an inconsistent state.

♻️ Enhanced error handling

```diff
 def _migrate_add_assigned_agent_id_column(engine) -> None:
     """Add assigned_agent_id column to existing databases that don't have it."""
     from sqlalchemy import text

     with engine.connect() as conn:
-        # Check if column exists
-        result = conn.execute(text("PRAGMA table_info(features)"))
-        columns = [row[1] for row in result.fetchall()]
-
-        if "assigned_agent_id" not in columns:
-            # Add the column (nullable, no default)
-            conn.execute(text("ALTER TABLE features ADD COLUMN assigned_agent_id VARCHAR(50)"))
-            conn.commit()
+        try:
+            # Check if column exists
+            result = conn.execute(text("PRAGMA table_info(features)"))
+            columns = [row[1] for row in result.fetchall()]
+
+            if "assigned_agent_id" not in columns:
+                # Add the column (nullable, no default)
+                conn.execute(text("ALTER TABLE features ADD COLUMN assigned_agent_id VARCHAR(50)"))
+                conn.commit()
+        except Exception as e:
+            conn.rollback()
+            raise RuntimeError(f"Failed to migrate assigned_agent_id column: {e}") from e
```

Note: The existing `_migrate_add_in_progress_column` function could benefit from the same enhancement.

ui/src/components/ParallelAgentControl.tsx (2)
106-254: Add error feedback for failed operations.

The component doesn't display error messages when mutations fail. Users should be informed when actions like starting or stopping agents encounter errors. Consider adding error display using the `error` property from each mutation hook.

💬 Error display example

```diff
 return (
   <div className="flex flex-col gap-3 p-4 bg-white border-3 border-[var(--color-neo-border)] shadow-neo">
     {/* Header */}
     <div className="flex items-center justify-between">
       ...
     </div>

+    {/* Error Display */}
+    {(startAgents.error || stopAgents.error || mergeWorktrees.error || cleanupAgents.error) && (
+      <div className="p-2 bg-red-100 border border-red-400 text-red-700 text-sm rounded">
+        {startAgents.error?.message || stopAgents.error?.message ||
+         mergeWorktrees.error?.message || cleanupAgents.error?.message}
+      </div>
+    )}

     {/* Agent Status Grid */}
     {status && status.agents.length > 0 && (
```
140-154: Consider using dynamic max from backend.

The agent count selector hardcodes a maximum of 5 agents (line 149), but the backend provides `status.max_agents`. Consider using the backend value to keep the UI synchronized with server-side constraints.

🔧 Dynamic max agents

```diff
+const maxAgents = status?.max_agents ?? 5
+
 <button
-  onClick={() => setNumAgents(Math.min(5, numAgents + 1))}
+  onClick={() => setNumAgents(Math.min(maxAgents, numAgents + 1))}
   className="neo-btn neo-btn-secondary p-1"
-  disabled={numAgents >= 5}
+  disabled={numAgents >= maxAgents}
 >
   <Plus size={14} />
 </button>
```

autonomous_agent_demo.py (2)
163-188: Consider propagating max_iterations to parallel agents.

The parallel execution path does not use `args.max_iterations`, while the single-agent path does. If `--max-iterations` is intended to work with parallel mode, it should be passed to the orchestrator or individual agents, as sketched below.
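A hedged sketch of that plumbing; the `max_iterations` keyword on `start_agents` is hypothetical and not in the PR:

```python
# Hypothetical plumbing: forward the CLI limit into the parallel path,
# mirroring what the single-agent path already does with args.max_iterations.
await orchestrator.start_agents(
    num_agents=args.num_agents,
    max_iterations=args.max_iterations,  # assumed new keyword, not in the PR
)
```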
189-195: Redundant KeyboardInterrupt handling.

The `KeyboardInterrupt` is already caught inside `run_parallel()` at line 181. The outer catch at line 191 will only trigger if the interrupt occurs during `asyncio.run()` startup/teardown, which is an edge case. This is acceptable but could be simplified.

ui/src/lib/api.ts (1)
181-187: Consider using ParallelAgentActionResponse type.

The inline return type matches `ParallelAgentActionResponse` from `./types`. Using the shared type would improve consistency and maintainability.

♻️ Suggested refactor

```diff
 export async function mergeParallelWorktrees(
   projectName: string
-): Promise<{ success: boolean; agents: Record<string, boolean>; message: string }> {
+): Promise<ParallelAgentActionResponse> {
   return fetchJSON(`/projects/${encodeURIComponent(projectName)}/parallel-agents/merge`, {
     method: 'POST',
   })
 }
```

server/routers/parallel_agents.py (1)
22-30: Consider moving import to module level.

The dynamic `sys.path` manipulation inside `_get_project_path` is a code smell. Consider restructuring imports at module level or using a proper package structure.

worktree.py (2)
100-103: Consider validating existing worktree state.

When a worktree already exists, the method returns the path immediately without validating that the worktree is in a usable state (e.g., not corrupted, branch exists, git operations work). While rare, a corrupted worktree could cause agent startup failures that are difficult to diagnose.

Consider adding basic validation:

```python
if worktree_path.exists():
    # Verify worktree is registered with git
    result = self._run_git("worktree", "list", "--porcelain")
    if str(worktree_path) not in result.stdout:
        logger.warning("Worktree path exists but not registered, recreating")
        # Clean up and recreate
    else:
        logger.info(f"Worktree already exists for agent {agent_id}")
        return worktree_path
```
287-306: Strengthen worktree filtering logic.

Lines 293-295 rely on string matching and prefix removal to identify agent worktrees. This works but is fragile if naming conventions change. Consider using a more robust approach.

```diff
 for wt in worktrees:
     path = Path(wt.get("path", ""))
-    if path != self.project_dir and str(self.worktrees_dir) in str(path):
+    # Only remove worktrees we manage (under our worktrees directory)
+    try:
+        path.relative_to(self.worktrees_dir)
+        is_managed = True
+    except ValueError:
+        is_managed = False
+
+    if is_managed and path != self.project_dir:
         # Extract agent_id from path
-        agent_id = path.name.replace("agent_", "")
+        if path.name.startswith("agent_"):
+            agent_id = path.name[6:]  # Remove "agent_" prefix
+        else:
+            continue
         self.remove_worktree(agent_id)
```

parallel_agents.py (1)
354-360: Wrap blocking git operations in executor.

The method calls `self.worktree_manager.merge_worktree_changes()` (line 358), which runs synchronous git commands via subprocess. In an async function, these blocking operations should be wrapped in `run_in_executor` to avoid blocking the event loop.

```diff
 async def merge_all_worktrees(self) -> dict[str, bool]:
     """Merge changes from all agent worktrees back to main."""
     results = {}
+    loop = asyncio.get_running_loop()
     for agent_id in self.agents:
-        success = self.worktree_manager.merge_worktree_changes(agent_id)
+        success = await loop.run_in_executor(
+            None,
+            self.worktree_manager.merge_worktree_changes,
+            agent_id
+        )
         results[agent_id] = success
     return results
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- CLAUDE.md
- agent.py
- api/database.py
- autonomous_agent_demo.py
- client.py
- mcp_server/feature_mcp.py
- parallel_agent_runner.py
- parallel_agents.py
- server/main.py
- server/routers/__init__.py
- server/routers/parallel_agents.py
- server/schemas.py
- ui/src/App.tsx
- ui/src/components/FeatureCard.tsx
- ui/src/components/ParallelAgentControl.tsx
- ui/src/hooks/useProjects.ts
- ui/src/lib/api.ts
- ui/src/lib/types.ts
- ui/tsconfig.tsbuildinfo
- worktree.py
🧰 Additional context used
🧬 Code graph analysis (8)
- ui/src/components/ParallelAgentControl.tsx (2)
  - ui/src/hooks/useProjects.ts (5): useParallelAgentsStatus (158-165), useStartParallelAgents (167-178), useStopParallelAgents (180-190), useMergeParallelWorktrees (192-201), useCleanupParallelAgents (203-213)
  - server/schemas.py (1): ParallelAgentInfo (134-140)
- parallel_agent_runner.py (3)
  - agent.py (1): run_autonomous_agent (109-239)
  - registry.py (1): get_project_path (195-213)
  - autonomous_agent_demo.py (2): parse_args (45-108), main (111-213)
- ui/src/App.tsx (2)
  - ui/src/components/AgentControl.tsx (1): AgentControl (17-128)
  - ui/src/components/ParallelAgentControl.tsx (1): ParallelAgentControl (46-255)
- ui/src/lib/api.ts (1)
  - server/schemas.py (2): ParallelAgentsStatus (143-147), ParallelAgentActionResponse (150-154)
- ui/src/lib/types.ts (2)
  - server/schemas.py (3): ParallelAgentInfo (134-140), ParallelAgentsStatus (143-147), ParallelAgentActionResponse (150-154)
  - server/services/process_manager.py (3): status (87-88), status (91-95), pid (139-140)
- server/schemas.py (1)
  - server/services/process_manager.py (3): status (87-88), status (91-95), pid (139-140)
- agent.py (1)
  - client.py (1): create_client (79-217)
- mcp_server/feature_mcp.py (2)
  - api/database.py (2): Feature (19-46), to_dict (34-46)
  - ui/src/lib/types.ts (1): Feature (60-67)
🔇 Additional comments (25)
agent.py (3)
114-114: LGTM! Clean parameter addition.

The `agent_id` parameter is properly typed and documented, enabling agent identification in parallel execution scenarios.

126-146: LGTM! Clear agent identification in output.

The prefix pattern consistently labels all output with the agent ID when running in parallel mode, making logs easy to follow when multiple agents are running.

187-187: LGTM! Proper propagation to client.

The `agent_id` is correctly passed through to `create_client`, which will set the `AGENT_ID` environment variable for the MCP servers (as shown in the relevant code snippet from client.py).
12-12: LGTM! Standard router registration.

The parallel agents router is properly imported and exposed, following the same pattern as existing routers.
Also applies to: 20-20
CLAUDE.md (1)
74-109: LGTM! Comprehensive parallel agents documentation.

The documentation clearly explains:
- CLI usage with the `--num-agents` flag
- API endpoints for control
- MCP tools for coordination
- Operational model (worktrees, shared DB, atomic claiming)

This will help users understand and use the parallel execution feature effectively.
server/main.py (1)
23-23: LGTM! Proper router registration.

The parallel agents router is correctly imported and included in the FastAPI app, following the same pattern as other routers. The comment clearly indicates its purpose.
Also applies to: 93-93
ui/src/App.tsx (1)
11-11: LGTM! UI support for parallel mode toggle.

The import and state initialization enable the parallel agents UI feature.
Also applies to: 38-38
ui/src/lib/types.ts (1)
69-69: LGTM! Type definitions align with server schemas.

The new `assigned_agent_id` field and parallel agent types are correctly defined and match the backend schemas shown in the relevant snippets.

Also applies to: 102-121
api/database.py (1)
32-32: LGTM! Database schema changes are well-structured.

The `assigned_agent_id` column is properly defined with appropriate constraints (nullable, indexed) and correctly exposed in the `to_dict` method for API serialization.

Also applies to: 45-45
ui/src/components/ParallelAgentControl.tsx (1)
46-255: Well-structured component with good UX patterns.

The component provides clear visual feedback, appropriate loading states, and intuitive controls. The mode toggle, agent status badges, and action buttons are well-organized and user-friendly.
autonomous_agent_demo.py (1)
101-106: LGTM!

The `--num-agents` argument is well-defined with sensible defaults and a clear description. The max limit of 10 agents is enforced both here in the help text and on line 140.
134-140: LGTM!

The `ParallelAgentInfo` schema correctly extends the base status literals with `"unknown"` to handle edge cases where agent state cannot be determined. This aligns with the router's error handling patterns.

87-87: LGTM!

Adding `assigned_agent_id` to `FeatureResponse` aligns with the database model changes in `api/database.py` and enables the UI to display agent assignment badges on feature cards.

ui/src/hooks/useProjects.ts (2)
158-164: LGTM!

The hook follows the established pattern with the same 3-second polling interval as `useAgentStatus`. Query key naming is consistent with other hooks.

167-213: LGTM!

The mutation hooks correctly invalidate both `parallel-agents-status` and `agent-status` queries where appropriate. The `useMergeParallelWorktrees` hook correctly only invalidates parallel-agents-status since merging doesn't affect agent running state.

client.py (2)
173-189: LGTM!

The centralized `mcp_env` dictionary cleanly handles environment propagation for the MCP server. Conditionally adding `AGENT_ID` only when provided avoids polluting the environment with empty values.

28-30: All three tools are properly registered in the MCP server with matching names. Verification confirms:
- `feature_claim_next` registered at feature_mcp.py:382
- `feature_release` registered at feature_mcp.py:432
- `feature_clear_in_progress` registered at feature_mcp.py:350

All tools are decorated with `@mcp.tool()` and the tool names in client.py match exactly.

ui/src/lib/api.ts (1)
156-179: LGTM!

The API functions properly encode project names, use correct HTTP methods, and align with the server endpoint definitions in `server/routers/parallel_agents.py`.

server/routers/parallel_agents.py (2)
65-91: LGTM!

The status endpoint correctly runs a healthcheck before reporting and properly constructs the `ParallelAgentInfo` objects from orchestrator state.

49-62: Caching is properly implemented. The `_get_orchestrator` function in `parallel_agents.py` maintains a module-level `_orchestrators` dictionary that caches orchestrators per project name (lines 379-383). Each orchestrator is created once and reused across requests, correctly preserving agent state.

mcp_server/feature_mcp.py (2)
381-428: Good use of row-level locking for atomic claims.

The `with_for_update()` clause provides proper database-level locking to prevent race conditions between parallel agents. The exception handling with rollback is also appropriate.

431-472: LGTM!

The `feature_release` function correctly validates ownership when `agent_id` is provided and properly clears both `in_progress` and `assigned_agent_id` fields.

parallel_agents.py (3)
73-96: LGTM: Robust callback implementation.

The callback registration and notification methods properly handle errors in individual callbacks without affecting others (lines 84-87, 92-95), which prevents one faulty callback from breaking the entire notification system.

236-273: LGTM: Proper process termination with graceful fallback.

The `stop_agent` method correctly implements a two-stage shutdown: terminate with a 5-second timeout, then kill if necessary (lines 252-262). The output task is properly canceled and awaited (lines 244-249), preventing resource leaks.

335-352: Good practice: Proactive crashed state detection.

The healthcheck method (lines 335-352) polls process status and proactively updates the agent status to "crashed" when a process exits unexpectedly. This provides better observability than waiting for the next operation to discover the failure.
```python
if agent_id:
    query = query.filter(
        (Feature.assigned_agent_id == None) |
        (Feature.assigned_agent_id == "") |
        (Feature.assigned_agent_id == agent_id)
    )
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check all places where assigned_agent_id is set
rg -n 'assigned_agent_id\s*=' --type py
```

Repository: leonvanzyl/autocoder
Length of output: 1096
Remove unnecessary empty-string check from filter conditions.
The filter checks for both None and "" to handle assigned_agent_id, but the code exclusively uses None when clearing the assignment (lines 242, 372, 462). The empty-string check in the filter (lines 168-169 and 405-406) is inconsistent with the actual clearing behavior and appears to be unnecessary dead code. Remove the (Feature.assigned_agent_id == "") condition from both filter blocks to align with the consistent use of None throughout the codebase.
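With that change, the filter reduces to something like the following (a sketch of the review's suggestion, not a committed diff):

```python
if agent_id:
    query = query.filter(
        (Feature.assigned_agent_id == None) |
        (Feature.assigned_agent_id == agent_id)
    )
```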
🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 166 - 171, The filter on
Feature.assigned_agent_id inside the query.filter call should not include the
empty-string check; remove the (Feature.assigned_agent_id == "") clause so the
filter only checks for None or the matching agent_id. Update both places where
this pattern appears (the query.filter that currently uses
(Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == "") |
(Feature.assigned_agent_id == agent_id)) to instead use
(Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == agent_id) to
match the codebase’s use of None when clearing assignments.
```python
feature = (
    session.query(Feature)
    .filter(Feature.passes == False)
    .filter(
        (Feature.in_progress == False) |
        (Feature.assigned_agent_id == None) |
        (Feature.assigned_agent_id == "") |
        (Feature.assigned_agent_id == agent_id)
    )
    .order_by(Feature.priority.asc(), Feature.id.asc())
    .with_for_update()  # Lock the row
    .first()
)
```
Potential race condition in claim filter logic.
The current filter allows claiming a feature if in_progress=False OR assigned_agent_id is None/empty. This means a feature with in_progress=True and assigned_agent_id=None (e.g., set by non-parallel agent) could be claimed by a parallel agent, causing conflicts.
Consider using AND logic to ensure the feature is truly available:
🐛 Proposed fix

```diff
 feature = (
     session.query(Feature)
     .filter(Feature.passes == False)
     .filter(
-        (Feature.in_progress == False) |
-        (Feature.assigned_agent_id == None) |
-        (Feature.assigned_agent_id == "") |
-        (Feature.assigned_agent_id == agent_id)
+        (Feature.in_progress == False) &
+        ((Feature.assigned_agent_id == None) | (Feature.assigned_agent_id == ""))
+    ) | (
+        (Feature.assigned_agent_id == agent_id)  # Allow re-claiming own feature
     )
     .order_by(Feature.priority.asc(), Feature.id.asc())
     .with_for_update()  # Lock the row
     .first()
 )
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 400 - 412, The claim filter uses OR
which allows a row with in_progress=True and assigned_agent_id==None to be
claimed; change the predicate so in_progress must be False AND the
assigned_agent_id must be either None/empty or equal to agent_id. Update the
.filter block that references Feature.in_progress and Feature.assigned_agent_id
(and uses agent_id) to require Feature.in_progress == False AND
(Feature.assigned_agent_id is None OR == "" OR == agent_id) while keeping the
outer passes == False, ordering, with_for_update(), and .first() intact.
```python
try:
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        cwd=str(worktree_path),
    )

    agent_info = AgentInfo(
        agent_id=agent_id,
        process=process,
        worktree_path=worktree_path,
        status="running",
        started_at=datetime.now(),
    )
    self.agents[agent_id] = agent_info

    # Start output streaming
    agent_info.output_task = asyncio.create_task(
        self._stream_output(agent_id)
    )

    await self._notify_status(agent_id, "running")
    logger.info(f"Started agent {agent_id} (PID {process.pid})")
    return True

except Exception as e:
    logger.exception(f"Failed to start agent {agent_id}")
    return False
```
Use asyncio subprocess instead of blocking Popen.
Using subprocess.Popen (lines 175-180) in an async function blocks the event loop during process creation. For a parallel agent orchestrator, this defeats the purpose of async/await. Additionally, there's no verification that the process actually started—if Popen fails silently or the process exits immediately, the method still returns True.
🔄 Recommended fix using asyncio subprocess

```diff
 try:
-    process = subprocess.Popen(
-        cmd,
-        stdout=subprocess.PIPE,
-        stderr=subprocess.STDOUT,
-        cwd=str(worktree_path),
-    )
+    process = await asyncio.create_subprocess_exec(
+        *cmd,
+        stdout=asyncio.subprocess.PIPE,
+        stderr=asyncio.subprocess.STDOUT,
+        cwd=str(worktree_path),
+    )
+
+    # Give process a moment to start
+    await asyncio.sleep(0.1)
+
+    # Verify process didn't immediately exit
+    if process.returncode is not None:
+        logger.error(f"Agent {agent_id} process exited immediately with code {process.returncode}")
+        return False

     agent_info = AgentInfo(
         agent_id=agent_id,
```

Note: This will require updating AgentInfo.process type from subprocess.Popen to asyncio.subprocess.Process and adjusting _stream_output accordingly.
Committable suggestion skipped: line range outside the PR's diff.
```python
async def pause_agent(self, agent_id: str) -> bool:
    """Pause an agent using psutil."""
    agent = self.agents.get(agent_id)
    if not agent or not agent.process or agent.status != "running":
        return False

    try:
        proc = psutil.Process(agent.process.pid)
        proc.suspend()
        agent.status = "paused"
        await self._notify_status(agent_id, "paused")
        return True
    except Exception as e:
        logger.exception(f"Failed to pause agent {agent_id}")
        return False

async def resume_agent(self, agent_id: str) -> bool:
    """Resume a paused agent."""
    agent = self.agents.get(agent_id)
    if not agent or not agent.process or agent.status != "paused":
        return False

    try:
        proc = psutil.Process(agent.process.pid)
        proc.resume()
        agent.status = "running"
        await self._notify_status(agent_id, "running")
        return True
    except Exception as e:
        logger.exception(f"Failed to resume agent {agent_id}")
        return False
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check psutil version and platform support notes

# Check installed psutil version
python -c "import psutil; print(f'psutil version: {psutil.__version__}')" 2>/dev/null || echo "psutil not found"

# Check if suspend/resume are available (they are on most platforms)
python -c "
import psutil
p = psutil.Process()
print(f'suspend available: {hasattr(p, \"suspend\")}')
print(f'resume available: {hasattr(p, \"resume\")}')
print(f'Platform: {psutil.os.name}')
" 2>/dev/null || echo "Error checking psutil methods"
```

Repository: leonvanzyl/autocoder
Length of output: 150
🌐 Web query:
Python asyncio SIGSTOP SIGCONT process suspension handling side effects
💡 Result:
Short summary and key side effects
- SIGSTOP cannot be caught, blocked, or ignored; it immediately suspends the entire process (all threads) at the kernel level. The process (and the asyncio event loop) is frozen until a SIGCONT (or similar) resumes it. [1][2]
- SIGCONT will resume the process; SIGCONT itself can be caught by a handler, but the resume action is performed by the kernel when SIGCONT is generated: the process is continued even if the signal is blocked or ignored, and pending signals are then processed when execution resumes. That ordering (resume, then deliver/handle pending signals) is defined by POSIX and observed on Linux. [3][4]
- Practical effects for asyncio programs:
  - While stopped by SIGSTOP the event loop cannot run, so no callbacks, tasks, timers, or subprocess-watcher callbacks execute while the process is suspended.
  - On resume, pending signals (including SIGCHLD) and any scheduled timeouts or delayed callbacks may be processed immediately. This can create surprising behavior: timeouts that you expected to elapse gradually will instead be handled at once when the process resumes, and signal-driven logic (child watchers, signal handlers) may run in a burst. [1][3]
  - Asyncio's child-watcher implementations and signal-handling code interact with OS signal semantics; there are known races and complexities (e.g., SIGCHLD handling / child watchers) you should be aware of when suspending/resuming processes. See asyncio child-watcher docs and related bug reports for details. [5][6]
- Things to watch for / mitigations:
  - Don't rely on sleeping or periodic timers to behave smoothly across a SIGSTOP/SIGCONT cycle; design code to handle timers firing immediately after resume (use explicit monotonic timestamps and check elapsed intervals in your coroutine logic).
  - If you need to run cleanup on suspend/resume, install a SIGCONT handler (SIGSTOP can't be caught) to perform resume-time recovery/adjustments, but remember SIGCONT's resume happens at kernel level and handlers only run once execution continues. [3][4]
  - Be cautious with subprocess management and rely on robust child-watcher choices (see asyncio docs) because SIGCHLD delivery and waiting can race with stop/continue cycles. [5][6]
References
- [1] Python signal module docs (execution of Python signal handlers): signal — Set handlers for asynchronous events — Python docs.
- [2] POSIX / general: SIGKILL and SIGSTOP cannot be caught or ignored (Open Group / POSIX; common signals references).
- [3] POSIX & Linux notes on SIGCONT behavior (resume semantics and pending-signal handling).
- [4] Discussion: SIGCONT is effective to resume even if blocked/ignored; order of delivery and resume explained.
- [5] asyncio child-watcher and policy docs (effects for subprocess and SIGCHLD handling).
- [6] Python issue / discussions on races in asyncio signal/child-watcher handling (Python issue 45718).
If you want, I can give a short example showing how to install a SIGCONT handler in asyncio and a pattern for adjusting timers after resume.
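In that spirit, a minimal Unix-only sketch of such a pattern, assuming a running asyncio loop; the helper names are illustrative:

```python
import asyncio
import signal
import time

def install_resume_hook(loop: asyncio.AbstractEventLoop) -> None:
    # SIGSTOP cannot be caught, but SIGCONT can: the handler runs once the
    # process is executing again, which is the place for resume-time recovery.
    def on_cont() -> None:
        print("SIGCONT received: timers and pending signals may fire in a burst")
    loop.add_signal_handler(signal.SIGCONT, on_cont)

async def heartbeat(interval: float = 1.0) -> None:
    # A gap between ticks much larger than the interval implies the process
    # was suspended in between; elapsed-time logic should account for it.
    last = time.monotonic()
    while True:
        await asyncio.sleep(interval)
        now = time.monotonic()
        if now - last > interval * 5:
            print(f"~{now - last:.1f}s gap detected; treating as suspend/resume")
        last = now
```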
Reconsider pause/resume implementation due to documented asyncio compatibility issues.
The pause and resume operations use psutil.Process.suspend() and .resume() (lines 292-293, 308-309), which translate to SIGSTOP/SIGCONT on Unix and kernel-level suspension on Windows. Research on asyncio behavior shows significant practical issues:
- SIGSTOP freezes the entire process and asyncio event loop, blocking all callbacks, timers, and I/O handlers
- On resume via SIGCONT, pending signals (SIGCHLD, etc.), timeouts, and callbacks all fire at once in a burst rather than being processed normally—this causes timeouts expected to elapse gradually to trigger immediately
- Asyncio's child-watcher and signal handling interact with these stop/continue cycles in ways that create known races and unpredictable ordering
- Outstanding async operations (API requests, database queries, I/O) may timeout, fail, or encounter inconsistent state when the process is suspended mid-execution
For a multi-agent system managing Claude API clients and external processes, this creates significant reliability risks. Either:
- Replace pause/resume with stop/start (cleaner state transition), or
- If pause/resume is required, extensively document the limitations and test thoroughly across target platforms, including recovery logic for interrupted async operations and signal handling.
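A sketch of the stop/start alternative, assuming the orchestrator's existing `stop_agent`, `start_agent`, and `_notify_status` methods referenced elsewhere in this review; treat it as an illustration of the recommendation, not the PR's code:

```python
async def pause_agent(self, agent_id: str) -> bool:
    # "Pause" implemented as a graceful stop, avoiding SIGSTOP entirely.
    agent = self.agents.get(agent_id)
    if not agent or agent.status != "running":
        return False
    await self.stop_agent(agent_id)  # existing graceful-termination flow
    agent.status = "paused"
    await self._notify_status(agent_id, "paused")
    return True

async def resume_agent(self, agent_id: str) -> bool:
    # "Resume" by spawning a fresh process through the existing start logic.
    agent = self.agents.get(agent_id)
    if not agent or agent.status != "paused":
        return False
    ok = await self.start_agent(agent_id)
    if ok:
        await self._notify_status(agent_id, "running")
    return ok
```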
🤖 Prompt for AI Agents
In @parallel_agents.py around lines 285 - 315, pause_agent and resume_agent
currently call psutil.Process.suspend()/resume() (psutil.Process.suspend,
psutil.Process.resume) which uses SIGSTOP/SIGCONT and will freeze the entire
asyncio event loop and cause unpredictable behavior; replace these methods with
stop/start semantics instead: remove calls to suspend()/resume(), implement
pause_agent to perform a graceful shutdown/stop of the agent process (call the
existing stop logic or a new stop_agent/_terminate_agent flow that closes API
clients, cancels outstanding async tasks, and waits for process termination),
and implement resume_agent as start_agent (or spawn a fresh process via your
existing start logic) that re-initializes the agent state and then calls
_notify_status(agent_id, "running"); if true pause/resume semantics must be
kept, add explicit documentation/comments in pause_agent/resume_agent about
platform-specific asyncio risks and add extensive tests/recovery logic for
interrupted async operations instead of using psutil suspend/resume.
```python
def merge_worktree_changes(self, agent_id: str, commit_message: Optional[str] = None) -> bool:
    """
    Merge changes from an agent's worktree back to main branch.

    Args:
        agent_id: Unique identifier for the agent
        commit_message: Optional commit message

    Returns:
        True if successful, False otherwise
    """
    worktree_path = self.get_worktree_path(agent_id)
    branch = f"agent-{agent_id}"

    if not worktree_path.exists():
        logger.error(f"Worktree for agent {agent_id} does not exist")
        return False

    # Commit any uncommitted changes in the worktree
    self._run_git("add", "-A", cwd=worktree_path)
    msg = commit_message or f"Agent {agent_id} changes"
    self._run_git("commit", "-m", msg, "--allow-empty", cwd=worktree_path)

    # Get current branch in main repo
    result = self._run_git("branch", "--show-current")
    main_branch = result.stdout.strip() or "main"

    # Merge the agent branch into main
    result = self._run_git("merge", branch, "--no-edit")

    if result.returncode != 0:
        logger.warning(f"Merge conflict for agent {agent_id}: {result.stderr}")
        # Abort the merge
        self._run_git("merge", "--abort")
        return False

    logger.info(f"Merged changes from agent {agent_id}")
    return True
```
Ensure merge targets the correct branch.
The method merges agent changes into whatever branch is currently checked out in the main repository (line 276), without first checking out the intended target branch. If a user is on a feature branch when merge is triggered, agent changes will merge into that branch instead of the main development branch.
Additionally, when merge conflicts occur (lines 278-282), the merge is aborted but the worktree remains in an uncommitted state, and no guidance is provided to resolve the conflict.
🔧 Proposed fix

```diff
 # Get current branch in main repo
 result = self._run_git("branch", "--show-current")
 main_branch = result.stdout.strip() or "main"
+
+# Checkout the main branch before merging
+result = self._run_git("checkout", main_branch)
+if result.returncode != 0:
+    logger.error(f"Failed to checkout {main_branch}: {result.stderr}")
+    return False

 # Merge the agent branch into main
 result = self._run_git("merge", branch, "--no-edit")

 if result.returncode != 0:
     logger.warning(f"Merge conflict for agent {agent_id}: {result.stderr}")
+    logger.warning(f"Manual resolution required. Agent branch: {branch}")
     # Abort the merge
     self._run_git("merge", "--abort")
     return False
```

🤖 Prompt for AI Agents
In @worktree.py around lines 248 - 285, merge_worktree_changes currently merges
the agent branch into whatever branch is checked out (using _run_git("merge",
branch)) which can merge into a feature branch; update merge_worktree_changes to
explicitly checkout the intended target branch (use the computed main_branch
from _run_git("branch", "--show-current") or fallback "main") in the main repo
via _run_git("checkout", main_branch) before running _run_git("merge", branch),
and ensure you pass the correct cwd for main repo operations (not worktree_path)
when checking out and merging; additionally, when a merge conflict occurs
(result.returncode != 0) keep the merge aborted but leave a clear return path:
call _run_git("merge", "--abort") for cleanup, log a user-facing message that
the worktree has conflicts and must be resolved manually (include
result.stderr), and do not silently leave the worktree in an uncommitted state —
optionally return False so caller can prompt conflict resolution.
- Fix race condition in feature_claim_next by using AND logic instead of OR
- Remove unnecessary empty-string checks from filter conditions
- Make agent startup concurrent with asyncio.gather() for faster init
- Add docstrings warning about SIGSTOP/SIGCONT limitations with asyncio
- Fix AttributeError when accessing non-existent agent status
- Improve git init error handling (return False on add/commit failure)
- Better symlink failure handling with marker file fallback
- Fix merge to checkout target branch before merging
- Add *.tsbuildinfo to .gitignore and remove from tracking

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @server/routers/parallel_agents.py:
- Around line 49-62: The import of parallel_agents in get_orchestrator is
fragile because it occurs before ensuring the repository root is on sys.path;
add the same sys.path management used in _get_project_path: compute root as
Path(__file__).parent.parent.parent, insert str(root) into sys.path if missing,
then import parallel_agents.get_orchestrator as _get_orchestrator; keep the rest
of the function (calls to validate_project_name, _get_project_path, and
returning _get_orchestrator(project_name, project_dir, ROOT_DIR)) unchanged.
🧹 Nitpick comments (6)
worktree.py (1)
314-333: cleanup_all_worktrees relies on string matching for path filtering.

The condition at line 320 uses `str(self.worktrees_dir) in str(path)`, which could match unintended paths if the worktrees directory name appears as a substring elsewhere. Consider using `path.is_relative_to(self.worktrees_dir)` for stricter matching (Python 3.9+).

♻️ Safer path comparison

```diff
-if path != self.project_dir and str(self.worktrees_dir) in str(path):
+if path != self.project_dir and path.is_relative_to(self.worktrees_dir):
```

server/routers/parallel_agents.py (3)
22-30: Inline import with sys.path manipulation is fragile.

The `_get_project_path` function modifies `sys.path` at runtime (lines 24-27). This approach works but is fragile in production. Consider restructuring imports at module level or using proper package configuration.
117-137: Stop endpoint assumes all stops succeed without verification.

Lines 131-136 build results assuming all running agents stopped successfully, but `stop_all_agents` doesn't return individual success/failure status. If an agent fails to stop, the response still indicates success.

Consider returning actual stop results from the orchestrator or verifying agent statuses after stopping:

♻️ Verify actual stop status

```diff
 await orchestrator.stop_all_agents()

-# Build results dict
-results = {aid: True for aid in running_agents}
+# Verify actual status after stop
+results = {
+    aid: orchestrator.agents[aid].status == "stopped"
+    for aid in running_agents
+    if aid in orchestrator.agents
+}

 return ParallelAgentActionResponse(
-    success=True,
+    success=all(results.values()) if results else True,
     agents=results,
     message=f"Stopped {len(running_agents)} agents",
 )
```
215-225: Cleanup endpoint could mask partial failures.

The endpoint returns `success: True` unconditionally (line 223) without checking if all agents actually stopped or if worktree cleanup succeeded. Consider returning actual cleanup status.

parallel_agents.py (2)
381-387: merge_all_worktrees calls blocking I/O in async function.

`worktree_manager.merge_worktree_changes` (line 385) is a synchronous method that runs git commands. Calling it directly in an async function blocks the event loop during each merge.

♻️ Use executor for blocking git operations

```diff
 async def merge_all_worktrees(self) -> dict[str, bool]:
     """Merge changes from all agent worktrees back to main."""
     results = {}
+    loop = asyncio.get_running_loop()
     for agent_id in self.agents:
-        success = self.worktree_manager.merge_worktree_changes(agent_id)
+        success = await loop.run_in_executor(
+            None, self.worktree_manager.merge_worktree_changes, agent_id
+        )
         results[agent_id] = success
     return results
```
395-410: Global orchestrator registry may cause stale state.

The `_orchestrators` cache (line 396) persists orchestrators indefinitely. If a project's directory changes or an orchestrator enters a bad state, callers continue receiving the stale instance. Consider adding a mechanism to invalidate or refresh orchestrators, e.g. the sketch below.
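One possible shape for such a mechanism; the helper name is hypothetical:

```python
def invalidate_orchestrator(project_name: str) -> None:
    # Hypothetical helper: drop the cached instance so the next
    # get_orchestrator() call rebuilds it from current project state.
    _orchestrators.pop(project_name, None)
```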
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- .gitignore
- mcp_server/feature_mcp.py
- parallel_agents.py
- server/routers/parallel_agents.py
- ui/tsconfig.tsbuildinfo
- worktree.py
💤 Files with no reviewable changes (1)
- ui/tsconfig.tsbuildinfo
🧰 Additional context used
🧬 Code graph analysis (3)
- mcp_server/feature_mcp.py (1)
  - api/database.py (2): Feature (19-46), to_dict (34-46)
- server/routers/parallel_agents.py (4)
  - server/schemas.py (3): ParallelAgentStartRequest (128-131), ParallelAgentInfo (134-140), ParallelAgentActionResponse (150-154)
  - registry.py (1): get_project_path (195-213)
  - parallel_agents.py (2): get_orchestrator (399-410), healthcheck (362-379)
  - server/services/process_manager.py (1): pid (139-140)
- parallel_agents.py (3)
  - worktree.py (4): ensure_git_repo (51-79), create_worktree (85-147), merge_worktree_changes (263-312), cleanup_all_worktrees (314-333)
  - server/services/process_manager.py (2): pid (139-140), resume (338-359)
  - server/routers/parallel_agents.py (2): merge_all_worktrees (202-212), get_orchestrator (49-62)
🔇 Additional comments (23)
.gitignore (1)
101-101: ✅ Good practice to ignore TypeScript build artifacts.

Adding `*.tsbuildinfo` to the "Build outputs" section is appropriate and follows standard practice for TypeScript projects. Build info files are generated during compilation and should not be committed.

mcp_server/feature_mcp.py (6)
145-179: LGTM on feature_get_next with agent filtering.

The updated logic correctly filters features by excluding those assigned to other agents when `agent_id` is provided. The filter uses only `None` checks (no empty-string condition), which aligns with the codebase's consistent use of `None` for clearing assignments.
303-345: LGTM on feature_mark_in_progress with agent assignment.

The logic properly checks if a feature is already in progress by another agent before allowing the assignment. The condition at lines 330-335 correctly prevents one agent from claiming a feature already assigned to a different agent.
380-428: feature_claim_next correctly implements atomic claiming.

The filter logic now uses AND (`&`) to ensure features are truly available: `in_progress=False` AND `assigned_agent_id=None`, or already assigned to the requesting agent. The `with_for_update()` row lock provides atomicity against concurrent claims. The exception handler properly rolls back on failure.

Previous review concerns about race conditions have been addressed; the predicate now reads roughly as in the sketch below.
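A sketch reconstructed from the description above (the exact code is in the PR):

```python
feature = (
    session.query(Feature)
    .filter(Feature.passes == False)
    .filter(
        ((Feature.in_progress == False) & (Feature.assigned_agent_id == None))
        | (Feature.assigned_agent_id == agent_id)  # re-claiming own feature
    )
    .order_by(Feature.priority.asc(), Feature.id.asc())
    .with_for_update()
    .first()
)
```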
431-472: LGTM on feature_release.

The ownership validation at lines 455-459 correctly prevents agents from releasing features they don't own. The response includes helpful status information.

216-247: LGTM on feature_mark_passing clearing agent assignment.

Clearing `assigned_agent_id` on completion (line 241) ensures the feature is fully released when marked as passing.

348-377: LGTM on feature_clear_in_progress clearing agent assignment.

The update correctly clears both `in_progress` and `assigned_agent_id` (lines 370-371), returning the feature to an available state.

worktree.py (5)
51-79: Error handling in ensure_git_repo is now correct.

The method now returns `False` when `git add` or `git commit` fail (lines 70-77), addressing the previous concern about returning `True` despite failures. The resulting pattern looks roughly like the sketch below.
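Reconstructed from the description, the corrected pattern is approximately (a sketch, not the PR's exact lines):

```python
result = self._run_git("add", "-A")
if result.returncode != 0:
    logger.error(f"git add failed: {result.stderr}")
    return False

result = self._run_git(
    "commit", "-m", "Initial commit for parallel agents", "--allow-empty"
)
if result.returncode != 0:
    logger.error(f"git commit failed: {result.stderr}")
    return False
return True
```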
225-261: Potential off-by-one in list_worktrees parsing.

The parsing loop correctly handles the porcelain format, and even if the output ends without a trailing blank line, the final worktree entry is still captured at lines 258-259. This looks correct.

263-312: Merge logic now explicitly checks out target branch first.

Lines 290-294 now ensure the main repository is on the target branch before merging, and conflict handling provides actionable guidance (lines 300-308). This addresses the previous concern about merging into the wrong branch.

85-147: LGTM on create_worktree with fallback for existing branches.

The logic correctly handles both new branch creation (lines 123-128) and fallback to existing branches (lines 131-136). The early return for existing worktrees (lines 103-105) prevents duplicate work.

191-223: LGTM on remove_worktree with manual fallback.

The fallback to `shutil.rmtree` (lines 212-217) when git removal fails is a reasonable approach for cleanup scenarios.

server/routers/parallel_agents.py (4)
39-46: Project name validation is appropriate for path traversal prevention.

The regex at line 41 restricts names to alphanumeric, underscore, and hyphen characters with a 50-character limit, which prevents path traversal attacks like `../` or null bytes.

65-91: LGTM on status endpoint.

The endpoint correctly runs a healthcheck before reporting status, and properly constructs `ParallelAgentInfo` objects with all required fields.

140-156: Agent status retrieval is now correct.

The previous issue with accessing `.status` on an empty dict fallback has been fixed. Line 151 now uses `agent = orchestrator.agents.get(agent_id)` and line 155 correctly accesses `agent.status if agent else "unknown"`, as in the sketch below.
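A condensed sketch of that fix:

```python
agent = orchestrator.agents.get(agent_id)  # None instead of KeyError
status = agent.status if agent else "unknown"
```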
201-212: LGTM on merge endpoint.

The endpoint properly aggregates results and reports success/failure counts.

parallel_agents.py (7)

101-140: start_agents now uses concurrent startup.

The implementation correctly uses `asyncio.gather(*tasks, return_exceptions=True)` (line 131) to start agents concurrently rather than sequentially. Exception handling properly converts failures to `False` results, along the lines of the sketch below.
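A condensed sketch of that pattern, assuming the orchestrator's `generate_agent_id` and `start_agent` referenced in the review prompts; details are illustrative:

```python
import asyncio

async def start_agents(self, num_agents: int) -> dict[str, bool]:
    # Launch every agent startup concurrently instead of awaiting in turn.
    agent_ids = [self.generate_agent_id() for _ in range(num_agents)]
    tasks = [asyncio.create_task(self.start_agent(aid)) for aid in agent_ids]
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)

    results: dict[str, bool] = {}
    for aid, outcome in zip(agent_ids, outcomes):
        if isinstance(outcome, Exception):
            logger.error(f"Failed to start {aid}: {outcome!r}")  # one failure doesn't block the rest
            results[aid] = False
        else:
            results[aid] = bool(outcome)
    return results
```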
293-342: Pause/resume limitations are now documented.The extensive docstrings (lines 297-304, 324-328) clearly warn about SIGSTOP/SIGCONT behavior with asyncio. This addresses the previous concern by documenting the caveats, though the underlying implementation risks remain if users rely on pause/resume with long-running async operations.
212-242:_stream_outputcorrectly uses executor for non-blocking I/O.The
run_in_executorpattern (lines 221-223) properly offloads blockingreadline()calls. The finally block correctly updates agent status based on exit code.
244-281: LGTM on `stop_agent` with graceful termination.

The implementation properly cancels the output task, terminates the process with a timeout, and falls back to `kill()` if needed (lines 268-270).
362-379: LGTM on healthcheck.

The healthcheck correctly detects crashed processes via `poll()` and updates status accordingly.
26-35: LGTM on `AgentInfo` dataclass.

The dataclass cleanly captures all agent state with appropriate defaults.
47-71: LGTM on orchestrator initialization.

The constructor properly initializes the WorktreeManager and tracking structures.
```python
def get_orchestrator(project_name: str):
    """Get or create the parallel agent orchestrator for a project."""
    from parallel_agents import get_orchestrator as _get_orchestrator

    project_name = validate_project_name(project_name)
    project_dir = _get_project_path(project_name)

    if not project_dir:
        raise HTTPException(status_code=404, detail=f"Project '{project_name}' not found in registry")

    if not project_dir.exists():
        raise HTTPException(status_code=404, detail=f"Project directory not found: {project_dir}")

    return _get_orchestrator(project_name, project_dir, ROOT_DIR)
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# First, let's understand the project structure and locate parallel_agents module
echo "=== Project structure ==="
ls -la
echo -e "\n=== Search for parallel_agents module ==="
find . -type f -name "*parallel_agents*" | head -20
echo -e "\n=== Search for ParallelAgentOrchestrator class ==="
rg -l 'class ParallelAgentOrchestrator' --type py
echo -e "\n=== Search for parallel_agents imports ==="
rg 'from parallel_agents import|import parallel_agents' --type py
```

Repository: leonvanzyl/autocoder

Length of output: 2589

🏁 Script executed:

```bash
#!/bin/bash
# Check the current file and its imports
echo "=== Content of server/routers/parallel_agents.py ==="
cat -n server/routers/parallel_agents.py | head -70
echo -e "\n=== Check if __init__.py exists in server directory ==="
ls -la server/ | grep __init__
echo -e "\n=== Check if __init__.py exists in server/routers directory ==="
ls -la server/routers/ | grep __init__
```

Repository: leonvanzyl/autocoder

Length of output: 2904

🏁 Script executed:

```bash
#!/bin/bash
# Look for sys.path configuration or setup.py
echo "=== Search for sys.path modifications or setup ==="
rg 'sys.path' --type py -B 2 -A 2
echo -e "\n=== Check for setup.py, pyproject.toml, or similar ==="
find . -maxdepth 2 -type f \( -name "setup.py" -o -name "pyproject.toml" -o -name "setup.cfg" -o -name "MANIFEST.in" \)
```

Repository: leonvanzyl/autocoder

Length of output: 4048
Add sys.path management before importing parallel_agents to match the established pattern.
Line 51 imports parallel_agents without ensuring the root directory is in sys.path. The function _get_project_path (lines 22-30) and other functions throughout the codebase properly manage sys.path before importing root-level modules. While the current implementation may work if _get_project_path is called first (line 54), this creates a fragile implicit dependency. Add the same sys.path setup pattern used in _get_project_path to ensure the import is reliable regardless of execution context:
```python
def get_orchestrator(project_name: str):
    """Get or create the parallel agent orchestrator for a project."""
    import sys
    root = Path(__file__).parent.parent.parent
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    from parallel_agents import get_orchestrator as _get_orchestrator

    project_name = validate_project_name(project_name)
    project_dir = _get_project_path(project_name)
    ...
```

🤖 Prompt for AI Agents
In @server/routers/parallel_agents.py around lines 49 - 62, The import of
parallel_agents in get_orchestrator is fragile because it occurs before ensuring
the repository root is on sys.path; add the same sys.path management used in
_get_project_path: compute root as Path(__file__).parent.parent.parent, insert
str(root) into sys.path if missing, then import parallel_agents.get_orchestrator
as _get_orchestrator; keep the rest of the function (calls to
validate_project_name, _get_project_path, and returning
_get_orchestrator(project_name, project_dir, ROOT_DIR)) unchanged.
worktree.py
Outdated
```python
def _setup_shared_resources(self, worktree_path: Path) -> None:
    """
    Set up shared resources in the worktree.

    The features.db is shared across all worktrees via symlink.
    This is critical for atomic feature claiming via feature_claim_next().

    Raises:
        RuntimeError: If features.db cannot be shared with the worktree.
    """
    # Create symlink to shared features.db
    features_db = self.project_dir / "features.db"
    worktree_db = worktree_path / "features.db"

    if features_db.exists() and not worktree_db.exists():
        try:
            # Try symlink first (preferred)
            worktree_db.symlink_to(features_db)
            logger.debug(f"Created symlink to features.db in {worktree_path}")
        except OSError as e:
            # Symlink failed - this breaks shared DB coordination
            # The MCP server uses PROJECT_DIR env var which points to worktree,
            # so we need the features.db to be accessible there
            logger.warning(
                f"Symlink creation failed ({e}). "
                f"Ensure MCP server is configured to use {features_db} directly."
            )
            # Write a marker file to indicate the absolute DB path
            db_path_file = worktree_path / ".features_db_path"
            db_path_file.write_text(str(features_db))
            logger.info(f"Wrote features.db path to {db_path_file}")
```
🧩 Analysis chain

🏁 Script executed:

```bash
# Search for .features_db_path usage
rg -n '\.features_db_path' --type py
```

Repository: leonvanzyl/autocoder

Length of output: 146

🏁 Script executed:

```bash
# Also check for any references to features.db path handling in parallel_agent_runner.py
rg -n 'features.db' --type py | head -20
```

Repository: leonvanzyl/autocoder

Length of output: 1635

🏁 Script executed:

```bash
# Check the MCP server initialization files
fd -e py -name '*mcp*' -o -name '*server*' | head -20
```

Repository: leonvanzyl/autocoder

Length of output: 294

🏁 Script executed:

```bash
# Search for any code that reads from marker files or reads .features_db_path
rg -n 'read_text|open.*\.features_db_path' --type py
```

Repository: leonvanzyl/autocoder

Length of output: 1378

🏁 Script executed:

```bash
# Look at the features.py router to see how it accesses features.db
cat -n server/routers/features.py | head -50
```

Repository: leonvanzyl/autocoder

Length of output: 1617

🏁 Script executed:

```bash
# Search for parallel_agent_runner.py
fd parallel_agent_runner
```

Repository: leonvanzyl/autocoder

Length of output: 88

🏁 Script executed:

```bash
# Check if there's any code that reads marker files from worktree_path
rg -n 'worktree_path.*read|read.*marker|\.features_db_path' --type py
```

Repository: leonvanzyl/autocoder

Length of output: 146

🏁 Script executed:

```bash
# Check parallel_agent_runner.py to see if it handles the marker file
cat -n parallel_agent_runner.py | head -100
```

Repository: leonvanzyl/autocoder

Length of output: 3240

🏁 Script executed:

```bash
# Search for how parallel_agent_runner.py sets up the database
rg -n 'features.db|PROJECT_DIR|worktree' parallel_agent_runner.py -A 3 -B 1
```

Repository: leonvanzyl/autocoder

Length of output: 1513
The `.features_db_path` marker file is never read; the fallback is incomplete.

The marker file (lines 177-178) is written when symlink creation fails, but no code in the repository reads it. The database coordination currently works only because `parallel_agent_runner.py` sets `PROJECT_DIR` to the main project directory (line 99), allowing the router and agent to access the shared `features.db`. If the symlink fails, the written marker file provides no actual recovery path, since no MCP server, runner, or application code is configured to read it. Either remove the unused marker file or implement the reading mechanism in the code that needs to handle the fallback case.
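If the reading mechanism were implemented instead, it could be as small as this hypothetical helper (nothing like it exists in the repo today, which is the gap this comment points out):

```python
from pathlib import Path


def resolve_features_db(worktree_dir: Path) -> Path:
    """Hypothetical fallback: honor the .features_db_path marker when a
    symlinked features.db is absent from the worktree."""
    local_db = worktree_dir / "features.db"
    if local_db.exists():
        return local_db
    marker = worktree_dir / ".features_db_path"
    if marker.exists():
        return Path(marker.read_text().strip())
    return local_db  # fall through; caller handles the missing DB
```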
- Use Path.is_relative_to() instead of string matching in worktree cleanup to prevent false positives from substring matches
- Add missing docstrings to improve coverage above 80% threshold:
  - run_parallel() in autonomous_agent_demo.py
  - read_file(), write_file() in server/routers/projects.py
  - ConnectionManager.__init__() in server/websocket.py
  - status getter/setter, pid property in process_manager.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @autonomous_agent_demo.py:
- Around line 163-170: The parallel runner ignores the --max-iterations flag;
modify run_parallel to forward the max_iterations value to
orchestrator.start_agents and ensure the orchestrator and its agent-launching
code accept and pass that parameter into each agent's run_autonomous_agent call
(or change orchestrator.start_agents signature to accept max_iterations and
propagate it when creating/starting agents), so parallel agents honor the same
iteration limit as the single-agent path.
- Around line 173-180: The healthcheck loop treats False as both crashed and
finished, so change the loop to query per-agent statuses (e.g., add or use
orchestrator.get_agent_statuses() or similar) and distinguish terminal
"finished" states from "crashed"/"error" states before printing the completion
message; specifically, when running == 0 call orchestrator.get_agent_statuses()
(or introduce that method if it doesn't exist) and if all agents report a
terminal success state print "All agents have finished", otherwise log an error
or "Agents crashed" with the list of failed agent IDs/names.
In @worktree.py:
- Around line 113-118: The fallback empty commit call using
self._run_git("commit", "--allow-empty", "-m", "Initial commit") is not checked;
capture its return value (e.g., commit_result = self._run_git(...)) and verify
commit_result.returncode before proceeding to run rev-parse; if the commit
failed log a descriptive error with the commit_result output (using
logger.error) and return None (or raise) instead of continuing to call
self._run_git("rev-parse", "HEAD").
🧹 Nitpick comments (4)
server/services/process_manager.py (1)
99-111: Consider logging when notifications are silently dropped.

The `except RuntimeError: pass` block silently discards status change notifications when no event loop is running. While current usage ensures an event loop exists (all status mutations are in async methods), a debug-level log here would aid troubleshooting if this assumption changes in the future.

🔧 Optional: Add debug logging for dropped notifications

```diff
 except RuntimeError:
     # No running event loop
-    pass
+    logger.debug("Status change notification dropped: no running event loop")
```

autonomous_agent_demo.py (2)
101-106: Consider adding input validation for `--num-agents`.

Negative or zero values will silently fall through to single-agent mode due to the `num_agents > 1` check on line 142. While not incorrect, this may confuse users who expect an error for invalid input.

🔧 Optional: Add minimum value validation

```diff
 parser.add_argument(
     "--num-agents",
     type=int,
     default=1,
-    help="Number of parallel agents to run (default: 1, max: 10)",
+    choices=range(1, 11),
+    metavar="N",
+    help="Number of parallel agents to run (1-10, default: 1)",
 )
```

Alternatively, validate after parsing and show a clear error message.
174-180: Consider adding periodic status output during polling.

The 5-second polling loop is silent. For long-running parallel agent sessions, users have no visibility into progress. Consider logging agent count or a brief status message periodically.

🔧 Optional: Add status logging

```diff
 while True:
     health = await orchestrator.healthcheck()
     running = sum(1 for v in health.values() if v)
     if running == 0:
         print("\nAll agents have finished.")
         break
+    # Provide periodic visibility
+    print(f"\r{running}/{len(health)} agents running...", end="", flush=True)
     await asyncio.sleep(5)
```

worktree.py (1)
163-179: Remove the unused marker file code.

Lines 177-178 write a `.features_db_path` marker file that is never read by any code in the repository. Per the earlier review discussion, the system works because `parallel_agent_runner.py` sets `PROJECT_DIR` to the main project directory, allowing direct access to `features.db`. Consider removing the dead marker file code to improve maintainability.

♻️ Proposed cleanup

```diff
 except OSError as e:
     # Symlink failed - this breaks shared DB coordination
     # The MCP server uses PROJECT_DIR env var which points to worktree,
     # so we need the features.db to be accessible there
     logger.warning(
         f"Symlink creation failed ({e}). "
         f"Ensure MCP server is configured to use {features_db} directly."
     )
-    # Write a marker file to indicate the absolute DB path
-    db_path_file = worktree_path / ".features_db_path"
-    db_path_file.write_text(str(features_db))
-    logger.info(f"Wrote features.db path to {db_path_file}")
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- autonomous_agent_demo.py
- server/routers/projects.py
- server/services/process_manager.py
- server/websocket.py
- worktree.py
🧰 Additional context used
🧬 Code graph analysis (1)
autonomous_agent_demo.py (3)
- parallel_agents.py (5)
  - ParallelAgentOrchestrator (37-392)
  - start_agents (101-140)
  - healthcheck (362-379)
  - stop_all_agents (283-291)
  - cleanup (389-392)
- server/services/process_manager.py (1)
  - healthcheck (364-384)
- agent.py (1)
  - run_autonomous_agent (109-239)
🔇 Additional comments (20)
server/websocket.py (1)
54-54: LGTM! Documentation improvement.

The added docstring clearly describes the initialization purpose and improves code documentation.
server/routers/projects.py (2)
279-280: LGTM! Clear documentation added.

The docstring accurately describes the helper function's behavior of returning an empty string when the file is not found or cannot be read.
314-315: LGTM! Clear documentation added.

The docstring accurately describes the helper function's conditional behavior of only writing when content is provided (not `None`).
server/services/process_manager.py (4)
86-89: LGTM!

Clear and concise docstring for the property getter.

91-97: LGTM!

The setter correctly compares old and new status to avoid unnecessary notifications, and the docstring accurately describes the behavior. Since all status mutations occur within async methods on the event loop thread, the lack of locking around the read-compare-write is acceptable.

113-118: LGTM!

Solid defensive programming. The method safely wraps callback execution and prevents callback errors from propagating to the caller. Warning-level logging is appropriate for this use case.

140-143: LGTM!

Clear docstring accurately describes the property's return behavior.
autonomous_agent_demo.py (1)
197-214: LGTM!

Single-agent mode is correctly preserved as an else branch with appropriate error handling that matches the parallel mode pattern.
worktree.py (12)
1-16: LGTM!

Module structure and imports are clean and appropriate.

26-34: LGTM!

Initialization logic is sound. Using `resolve()` ensures absolute paths, and placing worktrees in the parent directory keeps them isolated from the main repository.

36-44: LGTM!

The helper method correctly captures output and allows flexible working directory specification.

46-49: LGTM!

Standard and reliable method to check for Git repository presence.

51-79: LGTM!

Error handling has been properly addressed. The method now correctly returns `False` when `git add` or `git commit` fail, preventing downstream issues with worktree creation.

81-83: LGTM!

Simple and clear path construction.

122-147: LGTM!

Worktree creation logic properly handles the case where the branch already exists, and shared resources are set up correctly.

181-190: LGTM!

The prompts directory setup includes a proper fallback mechanism using `shutil.copytree` when symlinks are not supported.

191-223: LGTM!

Robust worktree removal with appropriate fallback handling when Git commands fail.

225-261: LGTM!

Correct parsing of Git's porcelain output format for worktree listing.

263-312: LGTM!

The merge logic has been properly fixed. The method now correctly checks out the target branch before merging and provides clear guidance when conflicts occur.

314-335: The code at lines 314-335 is compatible with the project's Python version requirements. The `pyproject.toml` targets Python 3.11 (via ruff's `target-version = "py311"`), and `Path.is_relative_to()` has been available since Python 3.9. No Python version compatibility issue exists here.

Likely an incorrect or invalid review comment.
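A quick illustration of why the substring check was worth replacing:

```python
from pathlib import Path

worktrees_dir = Path("/repos/.proj_worktrees")
candidate = Path("/repos/.proj_worktrees_backup/agent-1")

# Substring matching misfires on siblings that share a name prefix:
print(str(worktrees_dir) in str(candidate))     # True  (false positive)

# Path.is_relative_to() checks actual path ancestry:
print(candidate.is_relative_to(worktrees_dir))  # False (correct)
```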
```python
# Wait for all agents to complete (or user interrupt)
while True:
    health = await orchestrator.healthcheck()
    running = sum(1 for v in health.values() if v)
    if running == 0:
        print("\nAll agents have finished.")
        break
    await asyncio.sleep(5)
```
Health check doesn't distinguish completion from crash.
The loop exits when running == 0, but this occurs both when agents complete successfully and when they crash. The message "All agents have finished" is misleading if agents crashed.
From the healthcheck() implementation, False is returned for both crashed and stopped agents. Consider checking actual agent statuses before printing the completion message.
🔧 Suggested improvement
```diff
 # Wait for all agents to complete (or user interrupt)
 while True:
     health = await orchestrator.healthcheck()
     running = sum(1 for v in health.values() if v)
     if running == 0:
-        print("\nAll agents have finished.")
+        # Check if any agents crashed
+        statuses = orchestrator.get_agent_statuses()
+        crashed = [aid for aid, s in statuses.items() if s == "crashed"]
+        if crashed:
+            print(f"\nAgents finished. Crashed: {crashed}")
+        else:
+            print("\nAll agents completed successfully.")
         break
     await asyncio.sleep(5)
```

This requires exposing agent statuses from the orchestrator (e.g., a `get_agent_statuses()` method).
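Such a method could be a one-liner over state the orchestrator already tracks (a sketch; `get_agent_statuses` does not exist in the PR yet):

```python
def get_agent_statuses(self) -> dict[str, str]:
    """Expose each agent's status string ("running", "crashed", ...)
    so callers can tell a clean finish from a crash."""
    return {agent_id: info.status for agent_id, info in self.agents.items()}
```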
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @autonomous_agent_demo.py around lines 173 - 180, The healthcheck loop treats
False as both crashed and finished, so change the loop to query per-agent
statuses (e.g., add or use orchestrator.get_agent_statuses() or similar) and
distinguish terminal "finished" states from "crashed"/"error" states before
printing the completion message; specifically, when running == 0 call
orchestrator.get_agent_statuses() (or introduce that method if it doesn't exist)
and if all agents report a terminal success state print "All agents have
finished", otherwise log an error or "Agents crashed" with the list of failed
agent IDs/names.
- Add max_iterations parameter to parallel agent system (was ignored)
- Improve health check to distinguish completion vs crash
- Check fallback commit return code in worktree creation
- Clarify symlink fallback behavior (env var is the actual solution)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (2)
worktree.py (1)
26-34: Consider validating parent directory accessibility.

The worktrees directory is created in the parent of `project_dir` (line 34), which could fail if the parent is read-only or if `project_dir` is at the filesystem root. While uncommon in typical usage, this assumption could cause unexpected failures.

💡 Suggested improvement

Add a validation check or fallback location:

```diff
 def __init__(self, project_dir: Path):
     """
     Initialize the worktree manager.

     Args:
         project_dir: The main project directory (must be a git repo)
     """
     self.project_dir = project_dir.resolve()
+    # Validate parent is accessible
+    if not project_dir.parent.exists() or not os.access(project_dir.parent, os.W_OK):
+        raise ValueError(f"Parent directory not writable: {project_dir.parent}")
     self.worktrees_dir = self.project_dir.parent / f".{self.project_dir.name}_worktrees"
```

parallel_agents.py (1)
189-217: Consider using asyncio subprocess for non-blocking process creation.

The current implementation uses `subprocess.Popen` (line 190), which blocks the event loop during process creation. While process creation is typically fast, using `asyncio.create_subprocess_exec` would be more consistent with async/await patterns and prevent any potential blocking.

This would require updating the `AgentInfo.process` type (line 30) and adjusting `_stream_output` accordingly.

⚡ Suggested improvement

```diff
 try:
-    process = subprocess.Popen(
-        cmd,
-        stdout=subprocess.PIPE,
-        stderr=subprocess.STDOUT,
-        cwd=str(worktree_path),
-    )
+    process = await asyncio.create_subprocess_exec(
+        *cmd,
+        stdout=asyncio.subprocess.PIPE,
+        stderr=asyncio.subprocess.STDOUT,
+        cwd=str(worktree_path),
+    )
```

Note: This also requires updating the `AgentInfo` dataclass and `_stream_output` method to work with `asyncio.subprocess.Process` instead of `subprocess.Popen`.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- autonomous_agent_demo.py
- parallel_agents.py
- server/routers/parallel_agents.py
- server/schemas.py
- worktree.py
🧰 Additional context used
🧬 Code graph analysis (2)
server/schemas.py (1)
- server/services/process_manager.py (3)
  - status (87-89)
  - status (92-97)
  - pid (141-143)

parallel_agents.py (3)

- worktree.py (4)
  - ensure_git_repo (51-79)
  - create_worktree (85-150)
  - merge_worktree_changes (268-317)
  - cleanup_all_worktrees (319-340)
- server/services/process_manager.py (2)
  - pid (141-143)
  - resume (341-362)
- server/routers/parallel_agents.py (2)
  - merge_all_worktrees (203-213)
  - get_orchestrator (49-62)
```python
def _setup_shared_resources(self, worktree_path: Path) -> None:
    """
    Set up shared resources in the worktree.

    The features.db is shared across all worktrees via symlink.
    This is critical for atomic feature claiming via feature_claim_next().

    Raises:
        RuntimeError: If features.db cannot be shared with the worktree.
    """
    # Create symlink to shared features.db
    features_db = self.project_dir / "features.db"
    worktree_db = worktree_path / "features.db"

    if features_db.exists() and not worktree_db.exists():
        try:
            # Try symlink first (preferred)
            worktree_db.symlink_to(features_db)
            logger.debug(f"Created symlink to features.db in {worktree_path}")
        except OSError as e:
            # Symlink failed (common on Windows without admin privileges)
            # This is OK because parallel_agent_runner.py sets PROJECT_DIR
            # environment variable to the main project directory, and the
            # MCP server uses that env var to locate features.db
            logger.info(
                f"Symlink creation failed ({e}), but this is OK - "
                f"MCP server uses PROJECT_DIR env var to access {features_db}"
            )
            # Write a marker file for debugging purposes only
            # (not read by MCP server, just useful for troubleshooting)
            db_path_file = worktree_path / ".features_db_path"
            db_path_file.write_text(str(features_db))
            logger.debug(f"Wrote features.db path marker to {db_path_file}")

    # Copy prompts directory if it exists
    prompts_dir = self.project_dir / "prompts"
    worktree_prompts = worktree_path / "prompts"
    if prompts_dir.exists() and not worktree_prompts.exists():
        try:
            worktree_prompts.symlink_to(prompts_dir)
        except OSError:
            import shutil
            shutil.copytree(prompts_dir, worktree_prompts)
```
Update docstring to match actual behavior.
The docstring (lines 159-160) states the function raises RuntimeError if features.db cannot be shared, but the implementation never raises this exception. Instead, it logs an informational message and writes a marker file when symlink creation fails.
Either update the docstring to reflect the actual behavior or implement the documented exception.
🔧 Suggested fix

```diff
 def _setup_shared_resources(self, worktree_path: Path) -> None:
     """
     Set up shared resources in the worktree.

     The features.db is shared across all worktrees via symlink.
     This is critical for atomic feature claiming via feature_claim_next().
-
-    Raises:
-        RuntimeError: If features.db cannot be shared with the worktree.
+
+    If symlink creation fails (e.g., on Windows without admin privileges),
+    the MCP server uses the PROJECT_DIR environment variable to access
+    the shared features.db from the main project directory.
     """
```
Need to address dynamically assigned frontend and backend ports for each agent. Each agent needs its own port numbers or it will not see its changes. Also, each worktree needs a copy of the database. I'll post some suggestions in a minute.
Added dynamic port allocation to enable parallel agents to run simultaneously without port conflicts. Each agent environment now gets its own dedicated frontend and backend ports based on the port allocation scheme below.

Port Allocation Scheme

How It Works

Changes
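The concrete numbers from this suggestion are not reproduced above, but the general shape of such a scheme is a deterministic per-agent offset from base ports; a sketch with assumed values:

```python
# Base ports are placeholders, not the values from the suggestion above.
BASE_FRONTEND_PORT = 3000
BASE_BACKEND_PORT = 8000


def allocate_ports(agent_index: int) -> tuple[int, int]:
    """Give agent N its own (frontend, backend) port pair so parallel
    agents never contend for the same sockets."""
    return (BASE_FRONTEND_PORT + agent_index, BASE_BACKEND_PORT + agent_index)
```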
The other issue I saw with this is related to the database files and worktrees. When parallel agents are started, each agent's worktree is missing the backend database.

Root Cause

A git worktree contains only tracked files. It does not copy the backend application database, which is an untracked file.

Impact

Agents in fresh worktrees start without the application's existing database.

Fix

Added a step that copies the backend database into each agent's worktree during setup.
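A sketch of what such a copy step can look like during worktree setup (the actual database filename is not preserved above, so `db_name` here is a placeholder):

```python
import shutil
from pathlib import Path


def copy_untracked_db(project_dir: Path, worktree_path: Path,
                      db_name: str = "app.db") -> None:
    """Copy the untracked backend database into a fresh worktree.

    Git worktrees only materialize tracked files, so gitignored
    databases must be copied explicitly.
    """
    src = project_dir / db_name
    dest = worktree_path / db_name
    if src.exists() and not dest.exists():
        shutil.copy2(src, dest)  # preserves timestamps/permissions
```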
start_ui.sh and start_ui.bat were using system Python directly, causing ModuleNotFoundError for dotenv. Now they create and activate the venv before running, matching the pattern in start.sh.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jpbester Yes, this works on the UI side as well! The PR includes full UI support:

New Components:

UI Features:

You can enable parallel mode in the UI before starting the agent, then control all agents from the dashboard.
- Add parallel agent mode instructions to coding prompt template
- Update prompt to use feature_claim_next(agent_id) for atomic claiming
- Add worktree-safe auto-merge instructions after each feature
- Increase max agents from 3 to 10 (backend and UI)
- Add colors for agents 6-10 in FeatureCard and ParallelAgentControl
- Fix React component identity bug causing double-click on Parallel button

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Around line 115-118: The fenced code block that starts with the line
"Atomically claim the next feature (prevents race conditions with other agents)"
lacks a language specifier; update its opening triple-backtick to include a
language (e.g., change ``` to ```text, or to ```bash if you intend it as a shell
command) so the code block becomes fenced with a language identifier for proper
syntax highlighting and consistency in the template.
In @parallel_agents.py:
- Around line 189-195: Replace the blocking subprocess.Popen call with
asyncio.create_subprocess_exec to avoid blocking the event loop: call
asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.STDOUT, cwd=str(worktree_path)) and await its creation
to get the async process object (replacing the current process variable). After
creation, perform a short startup verification (e.g., await asyncio.sleep(0) or
attempt one non-blocking read of process.stdout or check process.returncode) and
raise/log an error if the process exits immediately so failures during startup
are detected. Ensure all subsequent interactions use the returned asyncio
subprocess methods (communicate, wait, stdout.readline) instead of sync
subprocess APIs.
In @ui/src/components/ParallelAgentControl.tsx:
- Line 56: Add a persistent yolo_mode boolean to the parallel-agent status shape
and read it from server status rather than a local state: update the schema type
ParallelAgentInfo (server/schemas.py) to include yolo_mode: bool and ensure the
/status endpoint response populates that field from the
ParallelAgentStartRequest handling, add the same property to the TypeScript
ParallelAgentInfo type (ui/src/lib/types.ts), then in ParallelAgentControl.tsx
remove/stop using the local yoloEnabled state (yoloEnabled / setYoloEnabled) and
change the YOLO indicator to read from status.agents[0].yolo_mode (or a shared
yolo flag on the ParallelAgentsStatus object) so the UI reflects server-reported
YOLO mode across remounts.
🧹 Nitpick comments (4)
parallel_agents.py (2)
300-350: Pause/resume risks well documented, but consider testing thoroughly.

The extensive docstring warnings about SIGSTOP/SIGCONT limitations with asyncio are excellent: they clearly communicate the risks to users (frozen event loop, burst callbacks on resume, potential async operation failures).
Since you've chosen to keep the psutil suspend/resume approach with warnings rather than implementing stop/start semantics, ensure thorough cross-platform testing with realistic workloads (ongoing API requests, database queries, file I/O) to validate behavior under suspension.
Optional: Consider stop/start as a more robust alternative
If pause/resume proves unreliable in production, the stop/start pattern would provide cleaner state transitions without asyncio event loop complications. The trade-off is losing the ability to "pause" mid-execution, but you gain predictable startup/shutdown semantics.
388-395: Consider concurrent merges for better performance with many agents.

The method sequentially merges each agent's worktree, which could be slow when running 10 agents. Since `WorktreeManager.merge_worktree_changes` runs git commands (which are I/O-bound), consider running merges concurrently:

♻️ Concurrent merge pattern

```diff
 async def merge_all_worktrees(self) -> dict[str, bool]:
     """Merge changes from all agent worktrees back to main."""
-    results = {}
-    for agent_id in self.agents:
-        success = self.worktree_manager.merge_worktree_changes(agent_id)
-        results[agent_id] = success
-    return results
+    loop = asyncio.get_running_loop()
+
+    async def merge_one(agent_id: str) -> tuple[str, bool]:
+        success = await loop.run_in_executor(
+            None,
+            self.worktree_manager.merge_worktree_changes,
+            agent_id
+        )
+        return agent_id, success
+
+    merge_results = await asyncio.gather(
+        *[merge_one(aid) for aid in self.agents],
+        return_exceptions=True
+    )
+
+    results = {}
+    for item in merge_results:
+        if isinstance(item, Exception):
+            logger.error(f"Merge failed: {item}")
+        else:
+            agent_id, success = item
+            results[agent_id] = success
+
+    return results
```

ui/src/components/ParallelAgentControl.tsx (2)
131-131: Improve responsive grid layout.

The agent status grid uses `grid-cols-5 sm:grid-cols-5`, which applies 5 columns at both mobile and desktop breakpoints. On smaller screens, 5 columns may be cramped for the agent badges.

📱 Suggested responsive improvement

```diff
-<div className="grid grid-cols-5 sm:grid-cols-5 gap-2">
+<div className="grid grid-cols-3 sm:grid-cols-5 gap-2">
```

This provides 3 columns on mobile and 5 on larger screens for better readability.
236-247: Consider adding text label to Cleanup button for consistency.

The Stop and Merge buttons include both icon and text labels (lines 221, 234), but the Cleanup button (line 246) has only an icon. While the `title` attribute provides a tooltip, a visible text label improves accessibility and consistency.

♿ Suggested improvement

```diff
 <button
   onClick={handleCleanup}
   disabled={isLoading}
-  className="neo-btn neo-btn-secondary text-sm py-2 px-3 flex items-center gap-1"
+  className="neo-btn neo-btn-secondary text-sm py-2 px-3 flex items-center gap-1 flex-1"
   title="Stop All & Cleanup Worktrees"
 >
   {cleanupAgents.isPending ? (
     <Loader2 size={14} className="animate-spin" />
   ) : (
     <Trash2 size={14} />
   )}
+  <span>Cleanup</span>
 </button>
```

This makes all three action buttons visually consistent and improves accessibility.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- .claude/templates/coding_prompt.template.md
- parallel_agents.py
- ui/src/App.tsx
- ui/src/components/FeatureCard.tsx
- ui/src/components/ParallelAgentControl.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- ui/src/App.tsx
- ui/src/components/FeatureCard.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
parallel_agents.py (3)
- worktree.py (4)
  - ensure_git_repo (51-79)
  - create_worktree (85-150)
  - merge_worktree_changes (268-317)
  - cleanup_all_worktrees (319-340)
- server/services/process_manager.py (2)
  - pid (141-143)
  - resume (341-362)
- server/routers/parallel_agents.py (2)
  - merge_all_worktrees (203-213)
  - get_orchestrator (49-62)
🪛 markdownlint-cli2 (0.18.1)
.claude/templates/coding_prompt.template.md
115-115: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (7)
parallel_agents.py (1)
128-142: Concurrent startup implementation looks good!

The agents now start concurrently using `asyncio.gather()` with proper exception handling. This addresses the previous review feedback and will significantly improve startup time when launching multiple agents.

.claude/templates/coding_prompt.template.md (3)
6-19: Clear parallel agent mode documentation.

The introduction effectively explains the parallel agent concept and key requirements. The emphasis on atomic claiming to prevent race conditions is particularly important for users to understand.
113-122: Atomic claiming instructions are clear.

The distinction between parallel and single-agent modes is well explained, and noting that `feature_claim_next` eliminates the need for a separate `feature_mark_in_progress` call prevents potential confusion.
324-325: Verify git-common-dir command works across worktree configurations.

The command `cd "$(git rev-parse --git-common-dir)/.."` assumes a specific directory structure to reach the main worktree. This may not work correctly in all worktree setups (e.g., bare repositories, worktrees created in non-standard locations).

Consider testing this command in various scenarios or simplifying to a more robust approach:

```bash
#!/bin/bash
# Test the main branch detection in different worktree scenarios

# Create test repo with worktrees in different configurations
mkdir -p /tmp/test-worktree-detection
cd /tmp/test-worktree-detection

# Scenario 1: Standard worktree
git init main-repo
cd main-repo
git commit --allow-empty -m "Initial"
git worktree add ../worktree-1 -b test-branch

# Test the command from worktree
cd ../worktree-1
echo "Testing from standard worktree:"
MAIN_BRANCH=$(cd "$(git rev-parse --git-common-dir)/.." && git branch --show-current 2>/dev/null || echo "main")
echo "Result: $MAIN_BRANCH"

# Scenario 2: Nested worktree
cd ../main-repo
mkdir -p nested
git worktree add nested/worktree-2 -b test-branch-2
cd nested/worktree-2
echo -e "\nTesting from nested worktree:"
MAIN_BRANCH=$(cd "$(git rev-parse --git-common-dir)/.." && git branch --show-current 2>/dev/null || echo "main")
echo "Result: $MAIN_BRANCH"

# Cleanup
cd /tmp
rm -rf test-worktree-detection
```

Alternative: Use `git rev-parse --show-toplevel` from the main worktree or query the main worktree's current branch directly via `git worktree list`.

ui/src/components/ParallelAgentControl.tsx (3)
1-49: LGTM: Imports, constants, and color helper are well-structured.

The `AGENT_COLORS` array correctly supports up to 10 agents (matching the max from PR objectives), and `getAgentColor` safely extracts the agent number with proper bounds checking via modulo.

51-109: LGTM: State management and handlers are well-structured.

The conditional status fetching (only when `isParallelMode` is true) is efficient, and the aggregated `isLoading` state correctly prevents concurrent mutations.

252-291: LGTM: Loading state and badge component are well-implemented.

The `AgentStatusBadge` provides clear visual feedback with color-coded borders, status dots, and conditional animation for running agents. The tooltip includes helpful debug information (worktree path).
```
# Atomically claim the next feature (prevents race conditions with other agents)
Use the feature_claim_next tool with agent_id=$AGENT_ID
```
Add language specifier to code block for consistency.
The fenced code block is missing a language identifier, which reduces syntax highlighting and readability.
📝 Proposed fix
````diff
-```
+```text
 # Atomically claim the next feature (prevents race conditions with other agents)
 Use the feature_claim_next tool with agent_id=$AGENT_ID
 ```
````

Or use `bash` if these are meant to be shell commands, though "Use the feature_claim_next tool" suggests these are instructions rather than executable commands.
🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

115-115: Fenced code blocks should have a language specified

(MD040, fenced-code-language)
🤖 Prompt for AI Agents

In @.claude/templates/coding_prompt.template.md around lines 115 - 118, The fenced code block that starts with the line "Atomically claim the next feature (prevents race conditions with other agents)" lacks a language specifier; update its opening triple-backtick to include a language (e.g., change ``` to ```text, or to ```bash if you intend it as a shell command) so the code block becomes fenced with a language identifier for proper syntax highlighting and consistency in the template.
```python
try:
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        cwd=str(worktree_path),
    )
```
Blocking subprocess creation in async function.
Using subprocess.Popen blocks the event loop during process creation. The previous review recommended switching to asyncio.create_subprocess_exec for truly async operation and adding startup verification.
🤖 Prompt for AI Agents
In @parallel_agents.py around lines 189 - 195, Replace the blocking
subprocess.Popen call with asyncio.create_subprocess_exec to avoid blocking the
event loop: call asyncio.create_subprocess_exec(*cmd,
stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.STDOUT,
cwd=str(worktree_path)) and await its creation to get the async process object
(replacing the current process variable). After creation, perform a short
startup verification (e.g., await asyncio.sleep(0) or attempt one non-blocking
read of process.stdout or check process.returncode) and raise/log an error if
the process exits immediately so failures during startup are detected. Ensure
all subsequent interactions use the returned asyncio subprocess methods
(communicate, wait, stdout.readline) instead of sync subprocess APIs.
```tsx
  onModeChange,
}: ParallelAgentControlProps) {
  const [numAgents, setNumAgents] = useState(2)
  const [yoloEnabled, setYoloEnabled] = useState(false)
```
🧩 Analysis chain
🏁 Script executed:

```bash
# Check if ParallelAgentInfo type or API response includes yolo_mode tracking
rg -n --type=ts -C3 'interface ParallelAgentInfo|type ParallelAgentInfo'
rg -n --type=py -C3 'class ParallelAgentInfo|ParallelAgentInfo\('
```

Repository: leonvanzyl/autocoder

Length of output: 1350

🏁 Script executed:

```bash
# Find the ParallelAgentControl.tsx file and read it
fd ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 106

🏁 Script executed:

```bash
# Read the full ParallelAgentInfo interface definition in types.ts
sed -n '100,130p' ui/src/lib/types.ts
```

Repository: leonvanzyl/autocoder

Length of output: 660

🏁 Script executed:

```bash
# Read the full ParallelAgentInfo class definition in server/schemas.py
sed -n '132,160p' server/schemas.py
```

Repository: leonvanzyl/autocoder

Length of output: 983

🏁 Script executed:

```bash
# Read ParallelAgentControl.tsx - first get file size
wc -l ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 110

🏁 Script executed:

```bash
# Read the component structure and line 56 area
sed -n '1,80p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 2140

🏁 Script executed:

```bash
# Read lines 150-180 to see what's at 163-171
sed -n '150,180p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 1225

🏁 Script executed:

```bash
# Read lines 190-210 to see the YOLO indicator at 200-205
sed -n '190,210p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 969

🏁 Script executed:

```bash
# Search for handleStart function and how yoloEnabled is used
sed -n '80,150p' ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 2224

🏁 Script executed:

```bash
# Search for all uses of yoloEnabled in the component
grep -n "yoloEnabled\|handleStart" ui/src/components/ParallelAgentControl.tsx
```

Repository: leonvanzyl/autocoder

Length of output: 513

🏁 Script executed:

```bash
# Check the useStartParallelAgents hook to see what parameters it accepts
fd useProjects -type f
```

Repository: leonvanzyl/autocoder

Length of output: 234

🏁 Script executed:

```bash
# Find the useProjects hook file
fd "useProjects" ui/src/hooks/
```

Repository: leonvanzyl/autocoder

Length of output: 91

🏁 Script executed:

```bash
# Search for the API endpoints that handle parallel agents
grep -rn "yoloMode\|yolo_mode" ui/src/ server/
```

Repository: leonvanzyl/autocoder

Length of output: 3145

🏁 Script executed:

```bash
# Check ParallelAgentStart and ParallelAgentStartResponse schemas
sed -n '100,145p' server/schemas.py
```

Repository: leonvanzyl/autocoder

Length of output: 1591

🏁 Script executed:

```bash
# Check how parallel agents endpoint returns the status
sed -n '60,110p' server/routers/parallel_agents.py
```

Repository: leonvanzyl/autocoder

Length of output: 1715

🏁 Script executed:

```bash
# Check the AgentStatus type in types.ts to see its yolo_mode field
grep -B5 -A5 'yolo_mode: boolean' ui/src/lib/types.ts
```

Repository: leonvanzyl/autocoder

Length of output: 278
Add yolo_mode to parallel agent status tracking.
The yoloEnabled state is local (line 56, defaults to false). When agents are started with YOLO mode enabled and the component unmounts/remounts (e.g., navigation or parent re-render), yoloEnabled resets to false, causing the YOLO indicator (lines 200-205) to disappear even though agents are still running in YOLO mode.
Unlike single agents (AgentStatus includes yolo_mode), ParallelAgentInfo lacks a yolo_mode field despite the server accepting and processing yolo_mode in ParallelAgentStartRequest. The YOLO indicator should read from the server status response instead of local state. Add yolo_mode: bool to ParallelAgentInfo in both server/schemas.py and ui/src/lib/types.ts, include it in the /status endpoint response, and update the YOLO indicator to use status.agents[0].yolo_mode or track a shared YOLO flag in ParallelAgentsStatus.
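On the Python side, the schema change being asked for is roughly this sketch (surrounding fields assumed, not copied from the file):

```python
from pydantic import BaseModel


class ParallelAgentInfo(BaseModel):
    # Existing fields elided; only the suggested addition is spelled out.
    agent_id: str
    status: str
    # Persist YOLO state server-side so the UI re-derives it from /status
    # after a remount instead of relying on local component state.
    yolo_mode: bool = False
```

The TypeScript `ParallelAgentInfo` in `ui/src/lib/types.ts` would mirror the same field.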
🤖 Prompt for AI Agents
In @ui/src/components/ParallelAgentControl.tsx at line 56, Add a persistent
yolo_mode boolean to the parallel-agent status shape and read it from server
status rather than a local state: update the schema type ParallelAgentInfo
(server/schemas.py) to include yolo_mode: bool and ensure the /status endpoint
response populates that field from the ParallelAgentStartRequest handling, add
the same property to the TypeScript ParallelAgentInfo type
(ui/src/lib/types.ts), then in ParallelAgentControl.tsx remove/stop using the
local yoloEnabled state (yoloEnabled / setYoloEnabled) and change the YOLO
indicator to read from status.agents[0].yolo_mode (or a shared yolo flag on the
ParallelAgentsStatus object) so the UI reflects server-reported YOLO mode across
remounts.
…arallel agents

- Fix assigned_agent_id not being returned in features API response
- Add pause/resume endpoints for parallel agents (/parallel-agents/pause, /resume)
- MCP server auto-claims features when AGENT_ID env var is set
- UI shows which agent is working on each feature with colored badges
- ParallelAgentControl panel now includes status indicator, YOLO toggle, and pause/resume/stop controls

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mcp_server/feature_mcp.py (1)
337-356: TOCTOU race condition in agent assignment check.

The conflict check (lines 346-350) reads `feature.assigned_agent_id` at line 337, then updates it at lines 353-354 without row locking. If two agents call this simultaneously with different `agent_id` values on the same feature, both could pass the check and overwrite each other's assignment.

🔒 Proposed fix using row locking

```diff
 session = get_session()
 try:
-    feature = session.query(Feature).filter(Feature.id == feature_id).first()
+    feature = session.query(Feature).filter(Feature.id == feature_id).with_for_update().first()

     if feature is None:
         return json.dumps({"error": f"Feature with ID {feature_id} not found"})
```
🤖 Fix all issues with AI agents
In @mcp_server/feature_mcp.py:
- Around line 185-190: The auto-claim block sets feature.in_progress and
assigned_agent_id without acquiring a row lock, allowing concurrent agents to
read and claim the same Feature; modify the query used to fetch the Feature (the
`query` that returns `feature`) to call `with_for_update()` before `.first()` so
the DB row is locked for update (e.g., use
`query.order_by(...).with_for_update().first()`), then proceed to set
`feature.in_progress`, `feature.assigned_agent_id`, `session.commit()` and
`session.refresh(feature)` as before.
🧹 Nitpick comments (1)
mcp_server/feature_mcp.py (1)
465-479: Consider adding row locking for consistency.

While the ownership verification (lines 471-474) is correct, it's vulnerable to the same TOCTOU race condition as `feature_mark_in_progress`. For consistency and to prevent unexpected behavior in high-concurrency scenarios, consider adding `with_for_update()` to the query at line 465.

🔒 Proposed enhancement

```diff
 session = get_session()
 try:
-    feature = session.query(Feature).filter(Feature.id == feature_id).first()
+    feature = session.query(Feature).filter(Feature.id == feature_id).with_for_update().first()

     if feature is None:
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- mcp_server/feature_mcp.py
- server/routers/features.py
- server/routers/parallel_agents.py
- ui/src/components/ParallelAgentControl.tsx
- ui/src/hooks/useProjects.ts
- ui/src/lib/api.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- ui/src/hooks/useProjects.ts
🧰 Additional context used
🧬 Code graph analysis (3)
server/routers/parallel_agents.py (5)
- server/schemas.py (4)
  - ParallelAgentStartRequest (128-132)
  - ParallelAgentsStatus (144-148)
  - ParallelAgentInfo (135-141)
  - ParallelAgentActionResponse (151-155)
- server/routers/features.py (2)
  - _get_project_path (28-36)
  - validate_project_name (57-64)
- registry.py (1)
  - get_project_path (195-213)
- parallel_agents.py (10)
  - get_orchestrator (406-417)
  - healthcheck (369-386)
  - start_agents (101-142)
  - stop_all_agents (290-298)
  - pause_agent (300-325)
  - resume_agent (327-349)
  - start_agent (144-217)
  - stop_agent (251-288)
  - merge_all_worktrees (388-394)
  - cleanup (396-399)
- server/services/process_manager.py (1)
  - pid (141-143)

ui/src/lib/api.ts (1)

- server/schemas.py (2)
  - ParallelAgentsStatus (144-148)
  - ParallelAgentActionResponse (151-155)

mcp_server/feature_mcp.py (2)

- api/database.py (2)
  - Feature (19-46)
  - to_dict (34-46)
- ui/src/lib/types.ts (1)
  - Feature (60-67)
🔇 Additional comments (7)
server/routers/features.py (1)
93-93: LGTM! Correctly exposes agent assignment in API responses.

The addition of `assigned_agent_id` to the feature response enables the UI to display which agent is working on each feature, supporting the parallel agents feature. The field will be `None` for unassigned features, which FastAPI serializes as `null` automatically.

mcp_server/feature_mcp.py (2)
231-262: LGTM!

The function correctly clears both `in_progress` and `assigned_agent_id` when marking a feature as passing, ensuring clean state transitions.
395-444: Well-implemented atomic feature claiming!

The use of `with_for_update()` (line 425) provides proper row-level locking, and the filter logic correctly ensures a feature is available by checking both `in_progress` and `assigned_agent_id` with AND logic. The exception handling with rollback (lines 439-441) is also appropriate.

server/routers/parallel_agents.py (1)
65-274: LGTM! Endpoint implementations are consistent and well-structured.

All endpoints follow a consistent pattern: get orchestrator, call orchestrator method, return appropriate response model. Error handling is properly delegated to the orchestrator methods, and response models match the defined schemas.
ui/src/components/ParallelAgentControl.tsx (2)
115-309: Well-structured conditional rendering and UI controls!

The component properly handles different states (collapsed, no agents, running agents, paused agents) with appropriate controls for each. The agent count bounds (1-10), loading states, and status indicators are all correctly implemented.
311-383: Subcomponents are clean and focused.

Both `StatusIndicator` and `AgentStatusBadge` are well-implemented with clear responsibilities. The color mapping with `getAgentColor` ensures visual consistency across the UI.

ui/src/lib/api.ts (1)
152-211: LGTM! Parallel agent API functions are well-implemented.

All functions follow the established pattern, properly encode parameters, use correct HTTP methods, and match server-side schemas. The snake_case conversion for request bodies (`num_agents`, `yolo_mode`) correctly aligns with Python backend conventions.
```python
# In parallel mode, automatically claim the feature to prevent race conditions
if agent_id and not feature.in_progress:
    feature.in_progress = True
    feature.assigned_agent_id = agent_id
    session.commit()
    session.refresh(feature)
```
🧩 Analysis chain

🏁 Script executed:

```bash
# First, search for calls to feature_get_next vs feature_claim_next
rg -n 'feature_get_next|feature_claim_next' --type py -C2 -g '!mcp_server/feature_mcp.py'
```

Repository: leonvanzyl/autocoder

Length of output: 1765

🏁 Script executed:

```bash
# Examine the feature_mcp.py file around the auto-claim logic
cat -n mcp_server/feature_mcp.py | sed -n '140,200p'
```

Repository: leonvanzyl/autocoder

Length of output: 2832

🏁 Script executed:

```bash
# Search for with_for_update usage in the file to see locking patterns
rg -n 'with_for_update' mcp_server/feature_mcp.py -B2 -A2
```

Repository: leonvanzyl/autocoder

Length of output: 239

🏁 Script executed:

```bash
# Check if feature_claim_next also uses auto-claiming or has explicit locking
rg -n 'def feature_claim_next' mcp_server/feature_mcp.py -A30
```

Repository: leonvanzyl/autocoder

Length of output: 1418
Add row-level locking to prevent race condition in auto-claim logic.
The auto-claim at lines 185-190 lacks row-level locking. The query at line 180 retrieves the feature without with_for_update(), creating a race condition window: two agents can read the same feature before either commits, causing both to claim it simultaneously.
The docstring (lines 154-155) states this auto-claim "prevents race conditions," but the implementation does not. In contrast, feature_claim_next (line 425) correctly uses with_for_update() before the .first() call to achieve atomicity.
Add with_for_update() to line 180 before the .first() call:
Suggested fix

```python
feature = query.order_by(Feature.priority.asc(), Feature.id.asc()).with_for_update().first()
```

🤖 Prompt for AI Agents
In @mcp_server/feature_mcp.py around lines 185 - 190, The auto-claim block sets
feature.in_progress and assigned_agent_id without acquiring a row lock, allowing
concurrent agents to read and claim the same Feature; modify the query used to
fetch the Feature (the `query` that returns `feature`) to call
`with_for_update()` before `.first()` so the DB row is locked for update (e.g.,
use `query.order_by(...).with_for_update().first()`), then proceed to set
`feature.in_progress`, `feature.assigned_agent_id`, `session.commit()` and
`session.refresh(feature)` as before.
Summary
This PR implements the ability to run multiple Claude agents in parallel to speed up project completion, as requested in issue #5.
Key Features
- `feature_claim_next(agent_id)` MCP tool prevents race conditions
- `--num-agents` CLI flag, REST API for UI control

New Components
- worktree.py
- parallel_agents.py
- parallel_agent_runner.py
- server/routers/parallel_agents.py

Database Changes
- Added `assigned_agent_id` column to the Feature model

Usage
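For example, a plausible invocation (the `--num-agents` flag is the one this PR adds; other arguments follow the existing CLI): `python autonomous_agent_demo.py --num-agents 3`.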
Screenshots
Agent badges appear on feature cards showing which agent (#1, #2, etc.) is working on each feature.
CodeRabbitAI Review Fixes
The following issues from the CodeRabbitAI review have been addressed:
Critical/Major Fixes
- `feature_claim_next`: Changed from OR to AND logic to ensure features must be both `in_progress=False` AND `assigned_agent_id=None` to be claimed
- Agent startup now uses `asyncio.gather()` for concurrent startup, reducing init time
- Fixed unsafe `.status` access on non-existent agents
- `ensure_git_repo` now returns `False` if add/commit fails during initialization

Minor Fixes
- Write helper only writes when content is provided (not `None`)
- Added `*.tsbuildinfo` to `.gitignore` and removed from tracking

Additional Fixes (Second Pass)
- Replaced string matching (`str(dir) in str(path)`) with `Path.is_relative_to()` to prevent false positives from substring matches
- Added missing docstrings:
  - `run_parallel()` in `autonomous_agent_demo.py`
  - `read_file()`, `write_file()` in `server/routers/projects.py`
  - `ConnectionManager.__init__()` in `server/websocket.py`
  - `status` getter/setter, `pid` property in `server/services/process_manager.py`

Additional Fixes (Third Pass)
- `--max-iterations` ignored in parallel mode: Added `max_iterations` parameter to `ParallelAgentOrchestrator.start_agents()` and `start_agent()`, passing it through to `parallel_agent_runner.py`. Also added to API schema `ParallelAgentStartRequest`.
- Symlink fallback: `parallel_agent_runner.py` sets the `PROJECT_DIR` environment variable, which the MCP server uses to locate `features.db`. The marker file is now documented as debugging-only.

Acknowledged but Not Changed
- The `subprocess.Popen` approach works correctly with `run_in_executor` for output streaming. Converting to `asyncio.create_subprocess_exec` would require significant refactoring for minimal benefit.

Test plan
- Ran `python -m py_compile` on all new Python files
- Verified single-agent mode still works (`--num-agents 1` or default)
- Tested `--num-agents 2` on a sample project

Closes #5
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.