Bug Report: Features Stuck in "in_progress" After Server Restart
Summary
Features remain stuck in "in_progress" status after a server restart and cannot be resumed. The UI shows "Resume" buttons, but clicking them fails with an "already running" error even though no agent process is actually running.
Environment
- OS: Ubuntu 24.04 LTS (WSL2)
- Docker: Docker Desktop with WSL2 backend
- Automaker Version: v0.14.0rc (branch)
- Deployment: Production Docker (docker-compose.yml)
Steps to Reproduce
- Start Automaker via Docker: docker compose up -d
- Create a project and add features to the backlog
- Start one or more features (they begin processing with status "in_progress")
- While features are running, restart the server: docker restart automaker-server
  - Or let the computer sleep and wake (causes WebSocket disconnect)
  - Or run docker compose down && docker compose up -d
- Refresh the UI
- Observe features show "Resume" button
- Click "Resume": it fails with an error, or nothing happens
Expected Behavior
After server restart, the server should:
- Detect that features marked "in_progress" have no active agent process
- Mark these features as "interrupted" or "resumable"
- Allow the user to resume them successfully
Actual Behavior
- Features remain with "status": "in_progress" in their state files
- Server reads this stale state and believes features are still running
- Clicking "Resume" triggers error: Error: already running
- Features are effectively stuck and cannot be resumed or restarted without manual intervention
Error Logs
[ERROR] [AutoMode] Resume feature ai-chat-interface error: Error: already running
at AutoModeService.resumeFeature (file:///app/apps/server/dist/services/auto-mode-service.js:1143:14)
at file:///app/apps/server/dist/routes/auto-mode/routes/resume-feature.js:21:18
Root Cause Analysis
Feature state is persisted to disk in .automaker/features/<feature-id>/feature.json:
{
  "id": "ai-chat-interface",
  "title": "AI Chat Interface for Idea Exploration",
  "status": "in_progress",
  "updatedAt": "2026-01-25T12:37:11.720Z"
}
When the server restarts:
- The actual agent process is terminated (container restart kills all child processes)
- The state file is NOT updated (no graceful shutdown handler)
- On startup, server reads the stale state file
- Server assumes the feature is "running" based solely on the state file
- No process liveness check is performed
Workaround
Manually reset stuck features by changing status in the JSON files:
docker exec automaker-server bash -c '
  for f in /projects/<project>/.automaker/features/*/feature.json; do
    if grep -q "in_progress" "$f"; then
      sed -i "s/\"status\": \"in_progress\"/\"status\": \"pending\"/" "$f"
    fi
  done
'
Suggested Fix
Option A: Orphan Detection on Startup (Recommended)
On server startup, scan all projects for features with "in_progress" status and:
- Check if an actual agent process exists for that feature
- If not, mark the feature as "interrupted" (preserving any partial progress)
- Allow user to resume from the interrupted state
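A minimal sketch of what such a startup scan could look like, assuming the on-disk layout described under Root Cause Analysis. The function names and the `isAgentRunning` callback are hypothetical and not taken from the Automaker codebase. The callback stands in for whatever in-memory registry the server uses to track live agents.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Only `id`, `status` and `updatedAt` are used here; other fields in
// feature.json are preserved because the parsed object is written back whole.
interface FeatureState {
  id: string;
  status: string;
  updatedAt?: string;
}

// Returns the parsed state, or null if the file is missing or malformed.
function readFeature(file: string): FeatureState | null {
  try {
    return JSON.parse(fs.readFileSync(file, "utf8")) as FeatureState;
  } catch {
    return null;
  }
}

// Hypothetical startup hook: marks every "in_progress" feature that has no
// live agent as "interrupted" and returns the ids that were changed.
function recoverOrphanedFeatures(
  projectDir: string,
  isAgentRunning: (featureId: string) => boolean,
): string[] {
  const featuresDir = path.join(projectDir, ".automaker", "features");
  if (!fs.existsSync(featuresDir)) return [];

  const recovered: string[] = [];
  for (const entry of fs.readdirSync(featuresDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const file = path.join(featuresDir, entry.name, "feature.json");
    const feature = readFeature(file);
    if (feature === null) continue;
    if (feature.status !== "in_progress" || isAgentRunning(feature.id)) continue;

    feature.status = "interrupted";
    feature.updatedAt = new Date().toISOString();
    fs.writeFileSync(file, JSON.stringify(feature, null, 2));
    recovered.push(feature.id);
  }
  return recovered;
}
```

Right after a restart no agents can be running yet, so the server could call this once per project with `() => false` before accepting requests. The "already running" check in the resume path would also need to treat "interrupted" as resumable.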
Option B: Graceful Shutdown Handler
Register a shutdown handler (SIGTERM, SIGINT) that:
- Marks all "in_progress" features as "interrupted"
- Saves state before exit
Note: Option B alone is insufficient because it doesn't handle crashes, OOM kills, or kill -9.
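As a rough illustration of Option B, the handler could be registered along these lines. `registerShutdownHandler` and the `markAllInterrupted` callback are hypothetical names. The callback is assumed to synchronously rewrite every "in_progress" feature.json to "interrupted". The `exit` parameter exists only so the behavior can be exercised without terminating the process.

```typescript
// Registers SIGTERM/SIGINT listeners that persist state once, then exit.
// Returns the handler so callers can invoke or remove it explicitly.
function registerShutdownHandler(
  markAllInterrupted: () => void,
  exit: (code: number) => void = (code) => process.exit(code),
): (signal: NodeJS.Signals) => void {
  let shuttingDown = false;

  const handler = (signal: NodeJS.Signals): void => {
    if (shuttingDown) return; // ignore repeated signals during shutdown
    shuttingDown = true;
    try {
      markAllInterrupted();
    } finally {
      // Conventional exit codes: 128 + signal number (SIGINT = 2, SIGTERM = 15).
      exit(signal === "SIGINT" ? 130 : 143);
    }
  };

  process.on("SIGTERM", handler);
  process.on("SIGINT", handler);
  return handler;
}
```

Keeping the callback synchronous is deliberate in this sketch: `docker restart` sends SIGTERM and follows up with SIGKILL after a grace period (10 seconds by default), so the state write should finish well within that window.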
Option C: Process Tracking with PIDs
Store the agent process PID in the feature state, then on startup:
- Check if that PID is still running
- If not, mark as interrupted
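The liveness check in Option C could be sketched as follows, using Node's `process.kill(pid, 0)`, which tests for a process's existence without sending a signal. The `agentPid` field and both function names are assumptions for illustration. The feature.json shown in this report does not contain a PID.

```typescript
// Returns true if a process with this PID currently exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it; ESRCH means it is gone.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Decides which status a feature should have on startup.
// `agentPid` is a hypothetical field the server would need to start persisting.
function statusAfterStartup(feature: { status: string; agentPid?: number }): string {
  if (feature.status !== "in_progress") return feature.status;
  const alive = feature.agentPid !== undefined && isProcessAlive(feature.agentPid);
  return alive ? "in_progress" : "interrupted";
}
```

One caveat with this approach: PIDs are reused, and a restarted container starts a fresh PID namespace, so a stored PID may match an unrelated process. Persisting the process start time alongside the PID, or simply treating every "in_progress" feature as orphaned at boot, would avoid such false positives.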
Frequency
This bug occurs every time the server restarts while features are in progress. It's 100% reproducible.
Impact
- Severity: Medium-High
- User Impact: Features become permanently stuck, requiring manual file editing to fix
- Workaround Available: Yes, but requires CLI access and technical knowledge
Related Files
- apps/server/dist/services/auto-mode-service.js: Resume logic with "already running" check
- .automaker/features/<id>/feature.json: Persisted feature state
Reported by: @JasonBroderick
Discovered with assistance from: Claude Code