-
Notifications
You must be signed in to change notification settings - Fork 289
fix: prevent coding agent from skipping refactoring features #52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
- Add dedicated "REFACTORING FEATURES" section to coding prompt with clear instructions on how to handle refactoring tasks - Add refactoring entries to "NEVER skip because" table - Update feature_skip MCP tool docstring to explicitly prohibit skipping for refactoring or "unclear requirements" The agent was skipping refactoring features because: 1. The prompt was entirely focused on "test-driven development" where features are test cases - refactoring doesn't fit this mental model 2. The skip documentation mentioned "unclear requirements" as valid, which refactoring features could be misinterpreted as Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Root cause: Agent was skipping refactoring features because: 1. Initializer only defined "functional" and "style" categories 2. No explicit statement that ALL features are mandatory 3. Agent interpreted refactoring as "not a requirement" Changes: - Add "refactoring" as valid category in initializer prompt with example - Add prominent "ALL FEATURES ARE MANDATORY REQUIREMENTS" section stating that functional, style, AND refactoring features must all be implemented without exception - Add "This is not a functional requirement" to NEVER skip table - Update both regular and YOLO prompts with same guidance The agent now has explicit instructions that: - Every feature in the database is mandatory - Refactoring features are just as important as functional features - "Whatever feature_get_next returns, you implement it. No exceptions." Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The agent was using the original app_spec.txt as an excuse to skip refactoring features (e.g., "the spec says JavaScript, I can't migrate to TypeScript"). Added explicit rules: - "CRITICAL: Refactoring features OVERRIDE the original spec" - "The feature database is the living source of truth" - "The original spec was a starting point" - "NO EXCUSES. If the feature says to refactor, you refactor. Period." Added to NEVER skip table: - "The spec says to use X technology" → do the migration - "This contradicts the original requirements" → feature DB is truth - "This is too big a change" → break into steps and start Examples added showing wrong reasoning like: - "The spec says JavaScript, I can't change the techstack" → WRONG - "This is too big a change" → WRONG Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Agent was refusing to refactor because "the app is currently in JavaScript" even though that's exactly why the refactoring feature exists. Added explicit rules: - "CRITICAL: The CURRENT STATE of the codebase is NOT an excuse" - "The app is currently in JavaScript" → That's WHY you're refactoring - "The whole point of refactoring is to change the current state" - "The current state is the PROBLEM, not an excuse" Added to NEVER skip table: - "The app is currently in JavaScript" → change it to TypeScript - "The codebase currently uses X" → migrate it Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR adds a mandatory-feature policy across AI coding templates requiring implementation of all features (functional, style, refactoring), expands refactoring workflow and verification guidance, tightens skip semantics to external blockers only, and updates documentation and initializer payloads to include "refactoring" as a category. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. 📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used📓 Path-based instructions (1).claude/templates/**/*.template.md📄 CodeRabbit inference engine (CLAUDE.md)
Files:
🧠 Learnings (1)📚 Learning: 2026-01-10T08:23:04.012ZApplied to files:
🪛 LanguageToolCLAUDE.md[style] ~149-~149: Consider replacing ‘only’ with a different word to let your writing stand out. (ONLY_EXCLUSIVELY_STYLE) 🔇 Additional comments (6)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Around line 112-126: Update the user-facing documentation (README and any user
guide pages) to explicitly state the precedence rules described in the template:
that after initialization the feature database becomes authoritative,
refactoring features override app_spec.txt, and the current codebase is not a
constraint (include the exact phrasing from the template such as "Refactoring
features OVERRIDE the original spec" and concrete examples like "migrate to
TypeScript" superseding "use JavaScript"); add a short "Behavior & Precedence"
section explaining these points, the workflow impact (features take priority),
and a couple of examples so users aren’t surprised when a feature contradicts
app_spec.txt, and update any user-facing templates (e.g.,
coding_prompt.template.md references) to point readers to that README section.
- Around line 84-93: Update the "ALL FEATURES ARE MANDATORY REQUIREMENTS"
section to remove the absolute "no exceptions" wording and explicitly reference
the feature_skip tool as the mechanism to flag legitimate external blockers
(e.g., missing API credentials, unavailable services, hardware limits) while
keeping the mandate to implement returned items from feature_get_next; also add
brief guidance to surface architectural concerns by (a) describing a simple
conflict-detection note for contradictory features (e.g., conflicting refactor
directives), (b) adding a validation/clarification step to flag ambiguous or
poorly specified feature descriptions, and (c) recommending a flagging system
for circular dependencies or genuinely uncompletable items so agents use
feature_skip when appropriate.
🧹 Nitpick comments (1)
.claude/templates/coding_prompt.template.md (1)
194-201: Consider consolidating similar table entries.Entries at lines 200-201 are very similar:
- Line 200: "The app is currently in JavaScript" → "That's WHY you're refactoring - change it to TypeScript"
- Line 201: "The codebase currently uses X" → "That's the problem you're solving - migrate it"
Entry at line 201 appears to be a generalized version of line 200. You could simplify by keeping only the general case, or differentiate them more clearly (e.g., line 200 for language changes, line 201 for library/framework changes).
♻️ Optional consolidation
| "Feature X needs to be done first" | Skip | Build feature X as part of this feature | | "This is a refactoring feature" | Skip | Implement the refactoring, verify with build/lint/tests | | "Refactoring requirements are vague" | Skip | Interpret the intent, implement, verify code compiles | | "This is not a functional requirement" | Skip | ALL features are requirements - implement it | | "The spec says to use X technology" | Skip | Refactoring features override the spec - do the migration | | "This contradicts the original requirements" | Skip | Feature database is the living truth - implement it | | "This is too big a change" | Skip | Break it into steps and start - no change is too big | -| "The app is currently in JavaScript" | Skip | That's WHY you're refactoring - change it to TypeScript | -| "The codebase currently uses X" | Skip | That's the problem you're solving - migrate it | +| "The app currently uses X (language/library/framework)" | Skip | That's the problem you're solving - migrate to Y as specified |However, keeping both entries may be intentional for emphasis through repetition (common in prompt engineering), so this consolidation is optional.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.claude/templates/coding_prompt.template.md.claude/templates/coding_prompt_yolo.template.md.claude/templates/initializer_prompt.template.mdmcp_server/feature_mcp.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use SQLite with SQLAlchemy ORM for database access and feature management
Files:
mcp_server/feature_mcp.py
mcp_server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MCP (Model Context Protocol) servers for agent tools: feature_mcp.py for feature management, with tools: feature_get_stats, feature_get_next, feature_get_for_regression, feature_mark_passing, feature_skip, feature_create_bulk
Files:
mcp_server/feature_mcp.py
.claude/templates/**/*.template.md
📄 CodeRabbit inference engine (CLAUDE.md)
Prompt templates should follow a fallback chain: project-specific in {project_dir}/prompts/{name}.md, then base template in .claude/templates/{name}.template.md
Files:
.claude/templates/coding_prompt_yolo.template.md.claude/templates/coding_prompt.template.md.claude/templates/initializer_prompt.template.md
🧠 Learnings (1)
📚 Learning: 2026-01-10T08:23:04.012Z
Learnt from: CR
Repo: leonvanzyl/autocoder PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-10T08:23:04.012Z
Learning: Applies to mcp_server/**/*.py : Use MCP (Model Context Protocol) servers for agent tools: feature_mcp.py for feature management, with tools: feature_get_stats, feature_get_next, feature_get_for_regression, feature_mark_passing, feature_skip, feature_create_bulk
Applied to files:
mcp_server/feature_mcp.py
🔇 Additional comments (6)
mcp_server/feature_mcp.py (1)
249-258: LGTM! Clear guidance on skip criteria.The updated docstring effectively clarifies when skipping is appropriate and explicitly prohibits skipping for refactoring features. The distinction between valid external blockers and invalid internal reasons (including refactoring) aligns well with the PR objectives.
.claude/templates/initializer_prompt.template.md (2)
52-62: Good addition of refactoring category example.The new refactoring example follows the same structure as the functional and style examples and provides clear guidance for initializer agents. The generic verification steps ("Verify code compiles and tests pass") are appropriate for a template.
78-78: LGTM! Categories list updated correctly.The addition of "refactoring" to the categories list is consistent with the new example above and aligns with the PR objectives.
.claude/templates/coding_prompt_yolo.template.md (2)
64-100: Excellent clarification on mandatory features and refactoring.The new sections effectively address the core issue of agents skipping refactoring features. Key strengths:
- Clear hierarchy: Explicitly states all three categories (functional, style, refactoring) are equally mandatory
- Concrete examples: TypeScript migration, GraphQL refactor, Zustand migration, and Tailwind refactor provide clear guidance
- Addresses common misconception: The "current state is not an excuse" section (lines 85-90) directly counters the problematic reasoning pattern
- Override clarity: Explicitly states refactoring features override the original spec (lines 79-84)
The strong, directive language ("CRITICAL", "NEVER skip", "NO EXCUSES") is appropriate given the PR's objective to fix systematic skipping behavior.
137-144: Comprehensive coverage of refactoring skip scenarios.The eight new table entries effectively enumerate common excuses for skipping refactoring features and provide clear corrective actions. Notable entries:
- Lines 137-139: Directly address the misconception that refactoring is optional
- Lines 143-144: Counter the "current state" reasoning pattern (consistent with lines 85-90)
- Line 140: Clarifies that refactoring features override the original spec
- Line 142: Addresses scope concerns with "no change is too big"
The table format provides quick reference for agents, complementing the detailed guidance in the refactoring section above.
.claude/templates/coding_prompt.template.md (1)
127-157: Well-structured refactoring workflow and verification criteria.The workflow (lines 127-136) and verification criteria (lines 137-142) are clear, comprehensive, and actionable. The three examples (lines 143-157) effectively demonstrate the expected behavior across different refactoring scenarios:
- Auth system changes (JWT)
- Tech stack migration (JavaScript → TypeScript)
- Code organization (extract shared utilities)
The emphatic language and repetition are appropriate for prompt engineering, where multiple reinforcements improve AI compliance.
| **CRITICAL: Refactoring features OVERRIDE the original spec.** If a refactoring feature contradicts `app_spec.txt`, the refactoring feature takes precedence. Examples: | ||
| - Spec says "use JavaScript" but feature says "migrate to TypeScript" → **Do the TypeScript migration** | ||
| - Spec says "use REST API" but feature says "refactor to GraphQL" → **Do the GraphQL refactor** | ||
| - Spec says "use Context API" but feature says "migrate to Zustand" → **Do the Zustand migration** | ||
| - Spec says "use CSS modules" but feature says "refactor to Tailwind" → **Do the Tailwind refactor** | ||
|
|
||
| **CRITICAL: The CURRENT STATE of the codebase is NOT an excuse.** If the code is currently in JavaScript but a feature says "migrate to TypeScript", your job is to CHANGE IT: | ||
| - "The app is currently in JavaScript" → **That's WHY you're refactoring - change it to TypeScript** | ||
| - "The codebase uses REST" → **That's WHY you're refactoring - change it to GraphQL** | ||
| - "We're currently using X" → **That's WHY you're refactoring - migrate to Y** | ||
|
|
||
| The whole point of refactoring is to change the current state. The current state is the PROBLEM, not an excuse. | ||
|
|
||
| **The feature database is the living source of truth.** The original spec was a starting point. Refactoring features represent evolved requirements that supersede the original spec. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for documentation about feature precedence and spec overrides
# Find all markdown files and check for mentions of precedence/override
echo "=== Searching for precedence/override documentation ==="
rg "precedence|override|source of truth|feature.*app_spec|spec.*override" .claude/templates/ -C 2
echo -e "\n=== Checking root-level documentation ==="
fd -e md -e txt . --max-depth 1
echo -e "\n=== Searching README and doc files ==="
fd "README|GUIDE|USAGE" -e md -e txt
echo -e "\n=== Checking initializer prompt for feature system documentation ==="
cat -n .claude/templates/initializer_prompt.template.md | head -100Repository: leonvanzyl/autocoder
Length of output: 9645
Document the precedence change in user-facing documentation.
The feature override behavior (refactoring features superseding app_spec.txt) is documented only in internal AI prompts. Users creating projects or feature databases may not discover that adding a "migrate to TypeScript" feature will override a JavaScript spec, or that the feature database becomes the living source of truth after initialization.
This precedence shift should be documented in the README or a user guide explaining:
- After initialization, the feature database becomes authoritative
- Refactoring features override the original spec
- Examples of how this affects workflow
Without this in user-facing docs, users may be surprised when their specifications are contradicted by features they add.
🤖 Prompt for AI Agents
In @.claude/templates/coding_prompt.template.md around lines 112 - 126, Update
the user-facing documentation (README and any user guide pages) to explicitly
state the precedence rules described in the template: that after initialization
the feature database becomes authoritative, refactoring features override
app_spec.txt, and the current codebase is not a constraint (include the exact
phrasing from the template such as "Refactoring features OVERRIDE the original
spec" and concrete examples like "migrate to TypeScript" superseding "use
JavaScript"); add a short "Behavior & Precedence" section explaining these
points, the workflow impact (features take priority), and a couple of examples
so users aren’t surprised when a feature contradicts app_spec.txt, and update
any user-facing templates (e.g., coding_prompt.template.md references) to point
readers to that README section.
1. Remove "no exceptions" absolute wording (lines 84-93)
- Reference feature_skip for legitimate external blockers
- Add guidance for edge cases: conflicting features, ambiguous
requirements, and circular dependencies
2. Add "Feature Behavior & Precedence" section to CLAUDE.md
- Document that feature database becomes authoritative after init
- Explain refactoring features override original spec
- Clarify current codebase state is not a constraint
- Note all categories are mandatory, skipping is for external blockers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Verified open-autocoder --help displays usage information - All commands listed: serve, web, init, start, status, features, provider, projects, service, config - Options documented: -V/--version, -p/--port, -H/--host, -h/--help - Subcommand help verified (projects, service, provider) - Screenshot: feature-52-cli-help-verification.png Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
The coding agent was systematically skipping refactoring features due to several prompt-level issues. This PR fixes all identified causes:
Changes
coding_prompt.template.mdcoding_prompt_yolo.template.mdinitializer_prompt.template.mdmcp_server/feature_mcp.pyKey additions to prompts
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Improvements
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.