Merged
43 changes: 39 additions & 4 deletions .claude/commands/create-spec.md
@@ -95,6 +95,27 @@ Ask the user about their involvement preference:

**For Detailed Mode users**, ask specific tech questions about frontend, backend, database, etc.

### Phase 3b: Database Requirements (MANDATORY)

**Always ask this question regardless of mode:**

> "One foundational question about data storage:
>
> **Does this application need to store user data persistently?**
>
> 1. **Yes, needs a database** - Users create, save, and retrieve data (most apps)
> 2. **No, stateless** - Pure frontend, no data storage needed (calculators, static sites)
> 3. **Not sure** - Let me describe what I need and you decide"

**Branching logic:**

- **If "Yes" or "Not sure"**: Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence.

- **If "No, stateless"**: Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly:
```xml
<database>none - stateless application</database>
```
Comment on lines +98 to +117

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List files in .claude directory structure
find .claude -type f -name "*.md" | head -20

Repository: leonvanzyl/autocoder

Length of output: 607


🏁 Script executed:

#!/bin/bash
# Check if the mentioned files exist and get their sizes
ls -lh .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md 2>&1

Repository: leonvanzyl/autocoder

Length of output: 240


🏁 Script executed:

#!/bin/bash
# Examine the context around lines 98-117 in create-spec.md
sed -n '95,120p' .claude/commands/create-spec.md

Repository: leonvanzyl/autocoder

Length of output: 1151


🏁 Script executed:

#!/bin/bash
# Search for stateless/conditional logic in initializer_prompt.template.md
rg -n "stateless|no database|skip infrastructure|Phase 3b" .claude/templates/initializer_prompt.template.md -C 3

Repository: leonvanzyl/autocoder

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Read initializer_prompt.template.md to see what it contains
wc -l .claude/templates/initializer_prompt.template.md

Repository: leonvanzyl/autocoder

Length of output: 116


🏁 Script executed:

#!/bin/bash
# Read initializer_prompt.template.md contents
cat -n .claude/templates/initializer_prompt.template.md

Repository: leonvanzyl/autocoder

Length of output: 19205


🏁 Script executed:

#!/bin/bash
# Search for infrastructure references in both files
rg -n "infrastructure|Infrastructure" .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md -C 2

Repository: leonvanzyl/autocoder

Length of output: 8294


🏁 Script executed:

#!/bin/bash
# Search broadly for stateless in the entire .claude directory
rg -r "stateless" .claude/

Repository: leonvanzyl/autocoder

Length of output: 1267


Reconcile stateless app handling between spec creation and initializer implementation.

The spec creation process mentions "simplified infrastructure" for stateless apps, but the initializer template doesn't implement this scenario:

1. Undefined simplification (line 114): States that infrastructure will be "simplified (no database persistence tests)" for stateless apps, but doesn't specify:
   - How many infrastructure features should be created (0? Fewer than 5?)
   - What "simplified" features look like
   - How this affects the overall feature count
2. Unclear XML tag placement (lines 115-117): The XML tag `<database>none - stateless application</database>` is shown, but its location in the spec structure is not defined.
3. Missing initializer implementation: The initializer template (lines 39-41, 119-131, 225) has no conditional logic for stateless apps. It always mandates exactly 5 infrastructure features (indices 0-4) for all complexity tiers, regardless of whether the app needs a database. This creates an inconsistency: the spec process promises simplified infrastructure for stateless apps, but the initializer can't deliver it.

Resolve by either:

- Defining what "simplified infrastructure" means (and updating the initializer to handle it), or
- Clarifying that stateless apps still require the 5 infrastructure features but with different test steps

🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 98 - 117, The spec text "Phase
3b: Database Requirements" currently promises a "simplified infrastructure" for
stateless apps but doesn't define it and the initializer always creates exactly
5 infrastructure features (indices 0-4); fix by defining the simplified infra
and adding conditional initializer logic: explicitly state that for
"<database>none - stateless application</database>" the spec will create 0
infrastructure persistence tests (or e.g., 2 lightweight infra checks like build
and health endpoints) and show exactly where the <database> tag sits (under the
spec root/database element), then update the initializer template to branch on
the database requirement flag (e.g., needsDatabase or the <database> value) so
that when the value is "none - stateless application" it generates the
simplified feature set instead of indices 0-4, otherwise generate the full five
persistence tests.
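
The branching that the review asks for could look roughly like the sketch below. It is only an illustration, not the initializer's actual logic: the `<app_spec>` root element, the function names, and the two-item stateless feature set are assumptions made here (the PR does not define what "simplified" infrastructure contains).

```python
import xml.etree.ElementTree as ET

# The five infrastructure feature names listed in create-spec.md.
FULL_INFRASTRUCTURE = [
    "Database connection established",
    "Database schema applied correctly",
    "Data persists across server restart",
    "No mock data patterns in codebase",
    "Backend API queries real database",
]

# Hypothetical simplified set for stateless apps (NOT defined by the PR).
STATELESS_INFRASTRUCTURE = [
    "App builds without errors",
    "Health endpoint responds",
]


def needs_database(spec_xml: str) -> bool:
    """Return False only when the spec's <database> tag starts with 'none'."""
    root = ET.fromstring(spec_xml)
    node = root.find(".//database")
    if node is None or node.text is None:
        # No tag: treat like "Yes" / "Not sure" and keep the database checks.
        return True
    return not node.text.strip().lower().startswith("none")


def infrastructure_features(spec_xml: str) -> list[str]:
    """Pick the infrastructure feature set that matches the spec."""
    return FULL_INFRASTRUCTURE if needs_database(spec_xml) else STATELESS_INFRASTRUCTURE


stateless = "<app_spec><database>none - stateless application</database></app_spec>"
print(infrastructure_features(stateless))
```

Defaulting to the full set when the tag is missing mirrors the "Yes or Not sure" branch, so a spec that forgets the tag still gets the persistence checks.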


## Phase 4: Features (THE MAIN PHASE)

This is where you spend most of your time. Ask questions in plain language that anyone can answer.
@@ -207,12 +228,23 @@ After gathering all features, **you** (the agent) should tally up the testable f

**Typical ranges for reference:**

- **Simple apps** (todo list, calculator, notes): ~20-50 features
- **Medium apps** (blog, task manager with auth): ~100 features
- **Advanced apps** (e-commerce, CRM, full SaaS): ~150-200 features
- **Simple apps** (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure)
- **Medium apps** (blog, task manager with auth): ~105 features (includes 5 infrastructure)
- **Advanced apps** (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure)
Comment on lines +231 to +233

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Extract feature count references from both files

echo "=== create-spec.md feature counts ==="
rg "Simple apps|Medium apps|Advanced apps|Complex apps" .claude/commands/create-spec.md -B 1 -A 1

echo ""
echo "=== initializer_prompt.template.md feature counts ==="
rg "Simple|Medium|Complex|Advanced" .claude/templates/initializer_prompt.template.md | grep -E "apps|features|tests|Simple|Medium|Complex|Advanced" | head -30

Repository: leonvanzyl/autocoder

Length of output: 735


Feature count inconsistency between create-spec.md and initializer_prompt.template.md.

The feature/test counts for corresponding app tiers don't align:

| Tier | create-spec.md | initializer_prompt.template.md |
|------|----------------|--------------------------------|
| Simple | ~25-55 features | ~155 tests |
| Medium | ~105 features | ~255 tests |
| Advanced/Complex | ~155-205 features | ~405+ tests |

Additionally, the tiers use different naming ("Advanced" vs "Complex"), making it unclear whether they represent the same tier. The magnitude differences suggest these may be counting different things or based on different scope definitions. Clarify which counts are authoritative and align both templates accordingly.

🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 231 - 233, The feature/test
counts and tier naming are inconsistent between .claude/commands/create-spec.md
(sections "Simple apps", "Medium apps", "Advanced apps") and
initializer_prompt.template.md (tiers with ~155/255/405+ tests and "Complex"
naming); choose the authoritative source (or define a single scope: "features"
vs "tests"), standardize the tier name ("Advanced" or "Complex"), and update
both files so the numeric ranges and tier labels match exactly — e.g., decide
whether counts represent features or tests, adjust the numbers in create-spec.md
or initializer_prompt.template.md accordingly, and ensure the headings and any
cross-references use the same tier names.


These are just reference points - your actual count should come from the requirements discussed.

**MANDATORY: Infrastructure Features**

If the app requires a database (Phase 3b answer was "Yes" or "Not sure"), you MUST include 5 Infrastructure features (indices 0-4):
1. Database connection established
2. Database schema applied correctly
3. Data persists across server restart
4. No mock data patterns in codebase
5. Backend API queries real database

These features ensure the coding agent implements a real database, not mock data or in-memory storage.

**How to count features:**
For each feature area discussed, estimate the number of discrete, testable behaviors:

@@ -225,17 +257,20 @@ For each feature area discussed, estimate the number of discrete, testable behav

> "Based on what we discussed, here's my feature breakdown:
>
> - **Infrastructure (required)**: 5 features (database setup, persistence verification)
> - [Category 1]: ~X features
> - [Category 2]: ~Y features
> - [Category 3]: ~Z features
> - ...
>
> **Total: ~N features**
> **Total: ~N features** (including 5 infrastructure)
>
> Does this seem right, or should I adjust?"

Let the user confirm or adjust. This becomes your `feature_count` for the spec.

**Important:** The first 5 features (indices 0-4) created by the initializer MUST be the Infrastructure category with no dependencies. All other features depend on these.
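
The tally described above is plain addition, sketched here for concreteness. The category names and numbers are made up, and counting zero infrastructure features for a stateless app is an assumption, since the PR leaves that case undefined.

```python
INFRASTRUCTURE_FEATURES = 5  # indices 0-4, required when the app has a database


def tally_features(category_estimates: dict[str, int], needs_database: bool = True) -> int:
    """Sum per-category estimates and add the infrastructure features."""
    for name, count in category_estimates.items():
        if count < 0:
            raise ValueError(f"negative estimate for {name!r}")
    # Assumption: a stateless app contributes no infrastructure features.
    infrastructure = INFRASTRUCTURE_FEATURES if needs_database else 0
    return infrastructure + sum(category_estimates.values())


# Illustrative estimates only.
estimates = {"Authentication": 8, "Todo CRUD": 12, "Search and filter": 6}
print(tally_features(estimates))  # 5 + 8 + 12 + 6 = 31
```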

## Phase 5: Technical Details (DERIVED OR DISCUSSED)

**For Quick Mode users:**
93 changes: 89 additions & 4 deletions .claude/templates/coding_prompt.template.md
@@ -156,6 +156,9 @@ Use browser automation tools:
- [ ] Deleted the test data - verified it's gone everywhere
- [ ] NO unexplained data appeared (would indicate mock data)
- [ ] Dashboard/counts reflect real numbers after my changes
- [ ] **Ran extended mock data grep (STEP 5.6) - no hits in src/ (excluding tests)**
- [ ] **Verified no globalThis, devStore, or dev-store patterns**
- [ ] **Server restart test passed (STEP 5.7) - data persists across restart**

#### Navigation Verification

@@ -174,10 +177,92 @@ Use browser automation tools:

### STEP 5.6: MOCK DATA DETECTION (Before marking passing)

1. **Search code:** `grep -r "mockData\|fakeData\|TODO\|STUB" --include="*.ts" --include="*.tsx"`
2. **Runtime test:** Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone
3. **Check database:** All displayed data must come from real DB queries
4. If unexplained data appears, it's mock data - fix before marking passing.
**Run ALL these grep checks. Any hits in src/ (excluding test files) require investigation:**

```bash
# Common exclusions for test files
EXCLUDE="--exclude=*.test.* --exclude=*.spec.* --exclude=*__test__* --exclude=*__mocks__*"

# 1. In-memory storage patterns (CRITICAL - catches dev-store)
grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/

# 2. Mock data variables
grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/

# 3. TODO/incomplete markers
grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/

# 4. Development-only conditionals
grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/

# 5. In-memory collections as data stores
grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/ 2>/dev/null
```
Comment on lines +183 to +200

⚠️ Potential issue | 🟠 Major

Quote/array-ize EXCLUDE to prevent glob expansion.
`$EXCLUDE` is expanded unquoted in each grep call, so the shell word-splits it and can glob-expand patterns such as `--exclude=*.test.*` before grep sees them, which would break the grep arguments and defeat the test exclusions. A bash array keeps each pattern literal.

♻️ Suggested fix
```diff
-# Common exclusions for test files
-EXCLUDE="--exclude=*.test.* --exclude=*.spec.* --exclude=*__test__* --exclude=*__mocks__*"
+# Common exclusions for test files
+EXCLUDE=(--exclude='*.test.*' --exclude='*.spec.*' --exclude='*__test__*' --exclude='*__mocks__*')

 # 1. In-memory storage patterns (CRITICAL - catches dev-store)
-grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
-grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/
+grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/

 # 2. Mock data variables
-grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/

 # 3. TODO/incomplete markers
-grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/

 # 4. Development-only conditionals
-grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/
+grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/

 # 5. In-memory collections as data stores
-grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" $EXCLUDE src/ 2>/dev/null
+grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" "${EXCLUDE[@]}" src/ 2>/dev/null
```
🤖 Prompt for AI Agents
In @.claude/templates/coding_prompt.template.md around lines 183 - 200, The
EXCLUDE variable is expanded unquoted and can trigger shell glob expansion (see
EXCLUDE and the subsequent grep invocations); convert EXCLUDE into a shell array
to preserve the literal --exclude patterns (e.g.
EXCLUDE=(--exclude='*.test.*' --exclude='*.spec.*' ...)) and change the grep
calls to use "${EXCLUDE[@]}" so each pattern is passed to grep as its own
literal argument without shell expansion. Do not pass a single quoted "$EXCLUDE"
string, because grep would receive all the options as one argument.


**Rule:** If ANY grep returns results in production code → investigate → FIX before marking passing.

**Runtime verification:**
1. Create unique data (e.g., "TEST_12345") → verify in UI → delete → verify gone
2. Check database directly - all displayed data must come from real DB queries
3. If unexplained data appears, it's mock data - fix before marking passing.
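
For readers who prefer a script to a series of grep calls, the STEP 5.6 scan can be approximated as in the sketch below. It is an illustration under stated assumptions, not part of the PR: it covers only a subset of the patterns, the helper names are invented here, and it skips test files by looking for the markers anywhere in the relative path, whereas grep's `--exclude` matches file base names. It also escapes the parentheses for the regex engine so that `new Map()` is matched literally; in grep's basic regular expressions `\(\)` denotes an empty group, so the original pattern matches any `new Map`.

```python
import re
import tempfile
from pathlib import Path

# Subset of the prohibited patterns from the grep checks.
PROHIBITED = re.compile(
    r"globalThis\.|dev-store|devStore|DevStore|mock-db|mockDb"
    r"|mockData|fakeData|sampleData|dummyData|testData"
    r"|new Map\(\)|new Set\(\)"
)
SOURCE_SUFFIXES = {".ts", ".tsx", ".js"}
TEST_MARKERS = (".test.", ".spec.", "__test__", "__mocks__")


def is_test_file(relative_path: Path) -> bool:
    """True when any test marker appears in the path relative to src/."""
    return any(marker in relative_path.as_posix() for marker in TEST_MARKERS)


def scan(src: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for every prohibited hit."""
    hits = []
    for path in sorted(src.rglob("*")):
        relative = path.relative_to(src)
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES or is_test_file(relative):
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if PROHIBITED.search(line):
                hits.append((relative.as_posix(), number, line.strip()))
    return hits


# Demo on a throwaway directory; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "store.ts").write_text("export const db = globalThis.devStore;\n", encoding="utf-8")
    (src / "store.test.ts").write_text("const mockData = [];\n", encoding="utf-8")
    (src / "clean.ts").write_text("export const answer = 42;\n", encoding="utf-8")
    hits = scan(src)
print(hits)  # [('store.ts', 1, 'export const db = globalThis.devStore;')]
```

As with the grep version, a hit is a prompt to investigate rather than proof of mock data: a `Map` used as a short-lived cache is legitimate.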

### STEP 5.7: SERVER RESTART PERSISTENCE TEST (MANDATORY for data features)

**When required:** Any feature involving CRUD operations or data persistence.

**This test is NON-NEGOTIABLE. It catches in-memory storage implementations that pass all other tests.**

**Steps:**

1. Create unique test data via UI or API (e.g., item named "RESTART_TEST_12345")
2. Verify data appears in UI and API response

3. **STOP the server completely:**
```bash
# Kill by port (safer - only kills the dev server, not VS Code/Claude Code/etc.)
# Unix/macOS:
lsof -ti :${PORT:-3000} | xargs kill -TERM 2>/dev/null || true
sleep 3
lsof -ti :${PORT:-3000} | xargs kill -9 2>/dev/null || true
sleep 2

# Windows alternative (use if lsof not available):
# netstat -ano | findstr :${PORT:-3000} | findstr LISTENING
# taskkill /F /PID <pid_from_above> 2>nul

# Verify server is stopped
if lsof -ti :${PORT:-3000} > /dev/null 2>&1; then
echo "ERROR: Server still running on port ${PORT:-3000}!"
exit 1
fi
```

4. **RESTART the server:**
```bash
./init.sh &
sleep 15 # Allow server to fully start
# Verify server is responding
if ! curl -f http://localhost:${PORT:-3000}/api/health && ! curl -f http://localhost:${PORT:-3000}; then
echo "ERROR: Server failed to start after restart"
exit 1
fi
```

5. **Query for test data - it MUST still exist**
- Via UI: Navigate to data location, verify data appears
- Via API: `curl http://localhost:${PORT:-3000}/api/items` - verify data in response

6. **If data is GONE:** Implementation uses in-memory storage → CRITICAL FAIL
- Run all grep commands from STEP 5.6 to identify the mock pattern
- You MUST fix the in-memory storage implementation before proceeding
- Replace in-memory storage with real database queries

7. **Clean up test data** after successful verification

**Why this test exists:** In-memory stores like `globalThis.devStore` pass all other tests because data persists during a single server run. Only a full server restart reveals this bug. Skipping this step WILL allow dev-store implementations to slip through.

**YOLO Mode Note:** Even in YOLO mode, this verification is MANDATORY for data features. Use curl instead of browser automation.
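
The reasoning behind the restart test can be reproduced in miniature. The sketch below is a toy model, not the project's test harness: the two store classes and `survives_restart` are hypothetical stand-ins, with "restarting the server" modeled as discarding the store object and building a fresh one.

```python
import os
import sqlite3
import tempfile


class InMemoryStore:
    """Stand-in for a globalThis.devStore style backend: state dies with the process."""

    def __init__(self):
        self.items = []

    def add(self, name):
        self.items.append(name)

    def names(self):
        return list(self.items)

    def close(self):
        pass


class SqliteStore:
    """Stand-in for a real database: state lives in a file on disk."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")

    def add(self, name):
        with self.conn:  # commits the insert on success
            self.conn.execute("INSERT INTO items (name) VALUES (?)", (name,))

    def names(self):
        return [row[0] for row in self.conn.execute("SELECT name FROM items")]

    def close(self):
        self.conn.close()


def survives_restart(make_store, marker="RESTART_TEST_12345"):
    """Create data, simulate a full restart, then check the data is still there."""
    store = make_store()
    store.add(marker)
    visible_before = marker in store.names()
    store.close()          # STOP: all in-process state is thrown away
    store = make_store()   # RESTART: a fresh process would rebuild its state
    visible_after = marker in store.names()
    store.close()
    return visible_before and visible_after


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "app.db")
    in_memory_ok = survives_restart(InMemoryStore)
    sqlite_ok = survives_restart(lambda: SqliteStore(db_path))
print(in_memory_ok, sqlite_ok)  # False True
```

Both stores look identical until the restart, which is why the create, read, and delete checks alone cannot tell them apart.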

### STEP 6: UPDATE FEATURE STATUS (CAREFULLY!)

153 changes: 125 additions & 28 deletions .claude/templates/initializer_prompt.template.md
@@ -36,9 +36,9 @@ Use the feature_create_bulk tool to add all features at once. You can create fea

- Feature count must match the `feature_count` specified in app_spec.txt
- Reference tiers for other projects:
- **Simple apps**: ~150 tests
- **Medium apps**: ~250 tests
- **Complex apps**: ~400+ tests
- **Simple apps**: ~165 tests (includes 5 infrastructure)
- **Medium apps**: ~265 tests (includes 5 infrastructure)
- **Advanced apps**: ~405+ tests (includes 5 infrastructure)
- Both "functional" and "style" categories
- Mix of narrow tests (2-5 steps) and comprehensive tests (10+ steps)
- At least 25 tests MUST have 10+ steps each (more for complex apps)
@@ -60,8 +60,9 @@ Dependencies enable **parallel execution** of independent features. When specifi
2. **Can only depend on EARLIER features** (index must be less than current position)
3. **No circular dependencies** allowed
4. **Maximum 20 dependencies** per feature
5. **Foundation features (index 0-9)** should have NO dependencies
6. **60% of features after index 10** should have at least one dependency
5. **Infrastructure features (indices 0-4)** have NO dependencies - they run FIRST
6. **ALL features after index 4** MUST depend on `[0, 1, 2, 3, 4]` (infrastructure)
7. **60% of features after index 10** should have additional dependencies beyond infrastructure
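
The hard rules in the updated list are mechanical enough to check in code. The following is a minimal sketch, assuming features are plain dictionaries with an optional `depends_on_indices` key as in the JSON example; `validate_dependencies` is a hypothetical helper, not something the feature tooling is known to provide. The 60% guideline is a soft target and is not checked.

```python
INFRASTRUCTURE = [0, 1, 2, 3, 4]
MAX_DEPENDENCIES = 20


def validate_dependencies(features: list[dict]) -> list[str]:
    """Return a list of rule violations; an empty list means the graph is valid."""
    errors = []
    for index, feature in enumerate(features):
        deps = feature.get("depends_on_indices", [])
        if len(deps) > MAX_DEPENDENCIES:
            errors.append(f"feature {index}: more than {MAX_DEPENDENCIES} dependencies")
        # Allowing only earlier indices also rules out circular dependencies.
        if any(dep < 0 or dep >= index for dep in deps):
            errors.append(f"feature {index}: may only depend on earlier features")
        if index in INFRASTRUCTURE and deps:
            errors.append(f"feature {index}: infrastructure features have no dependencies")
        if index not in INFRASTRUCTURE and not set(INFRASTRUCTURE) <= set(deps):
            errors.append(f"feature {index}: must depend on {INFRASTRUCTURE}")
    return errors


infra = [{"name": f"Infrastructure {i}"} for i in range(5)]
good = infra + [{"name": "App loads without errors", "depends_on_indices": [0, 1, 2, 3, 4]}]
print(validate_dependencies(good))  # []
```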

### Dependency Types

@@ -82,30 +83,113 @@ Create WIDE dependency graphs, not linear chains:

```json
[
// FOUNDATION TIER (indices 0-2, no dependencies) - run first
{ "name": "App loads without errors", "category": "functional" },
{ "name": "Navigation bar displays", "category": "style" },
{ "name": "Homepage renders correctly", "category": "functional" },

// AUTH TIER (indices 3-5, depend on foundation) - run in parallel
{ "name": "User can register", "depends_on_indices": [0] },
{ "name": "User can login", "depends_on_indices": [0, 3] },
{ "name": "User can logout", "depends_on_indices": [4] },

// CORE CRUD TIER (indices 6-9) - WIDE GRAPH: all 4 depend on login
// All 4 start as soon as login passes!
{ "name": "User can create todo", "depends_on_indices": [4] },
{ "name": "User can view todos", "depends_on_indices": [4] },
{ "name": "User can edit todo", "depends_on_indices": [4, 6] },
{ "name": "User can delete todo", "depends_on_indices": [4, 6] },

// ADVANCED TIER (indices 10-11) - both depend on view, not each other
{ "name": "User can filter todos", "depends_on_indices": [7] },
{ "name": "User can search todos", "depends_on_indices": [7] }
// INFRASTRUCTURE TIER (indices 0-4, no dependencies) - MUST run first
{ "name": "Database connection established", "category": "functional" },
{ "name": "Database schema applied correctly", "category": "functional" },
{ "name": "Data persists across server restart", "category": "functional" },
{ "name": "No mock data patterns in codebase", "category": "functional" },
{ "name": "Backend API queries real database", "category": "functional" },

// FOUNDATION TIER (indices 5-7, depend on infrastructure)
{ "name": "App loads without errors", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },
{ "name": "Navigation bar displays", "category": "style", "depends_on_indices": [0, 1, 2, 3, 4] },
{ "name": "Homepage renders correctly", "category": "functional", "depends_on_indices": [0, 1, 2, 3, 4] },

// AUTH TIER (indices 8-10, depend on foundation + infrastructure)
{ "name": "User can register", "depends_on_indices": [0, 1, 2, 3, 4, 5] },
{ "name": "User can login", "depends_on_indices": [0, 1, 2, 3, 4, 5, 8] },
{ "name": "User can logout", "depends_on_indices": [0, 1, 2, 3, 4, 9] },

// CORE CRUD TIER (indices 11-14) - WIDE GRAPH: all 4 depend on login
{ "name": "User can create todo", "depends_on_indices": [0, 1, 2, 3, 4, 9] },
{ "name": "User can view todos", "depends_on_indices": [0, 1, 2, 3, 4, 9] },
{ "name": "User can edit todo", "depends_on_indices": [0, 1, 2, 3, 4, 9, 11] },
{ "name": "User can delete todo", "depends_on_indices": [0, 1, 2, 3, 4, 9, 11] },

// ADVANCED TIER (indices 15-16) - both depend on view, not each other
{ "name": "User can filter todos", "depends_on_indices": [0, 1, 2, 3, 4, 12] },
{ "name": "User can search todos", "depends_on_indices": [0, 1, 2, 3, 4, 12] }
]
```

**Result:** With 3 parallel agents, this 12-feature project completes in ~5-6 cycles instead of 12 sequential cycles.
**Result:** With 3 parallel agents, this project completes efficiently with proper database validation first.
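
To put a number on "efficiently", the 17-feature graph from the updated example can be run through a toy scheduler. This is a rough model with assumptions chosen here: every feature takes exactly one cycle, and in each cycle the agents pick the lowest-index features whose dependencies have all passed. The real orchestrator may schedule differently.

```python
def count_cycles(dependencies: list[list[int]], agents: int = 3) -> int:
    """Count scheduling cycles when up to `agents` ready features run per cycle."""
    done: set[int] = set()
    cycles = 0
    while len(done) < len(dependencies):
        ready = [
            index
            for index, deps in enumerate(dependencies)
            if index not in done and all(dep in done for dep in deps)
        ]
        if not ready:
            raise ValueError("no runnable feature: check for circular dependencies")
        done.update(ready[:agents])
        cycles += 1
    return cycles


INFRA = [0, 1, 2, 3, 4]
# depends_on_indices for the 17 features of the updated example, in index order.
EXAMPLE = (
    [[] for _ in range(5)]                                             # 0-4 infrastructure
    + [INFRA, INFRA, INFRA]                                            # 5-7 foundation
    + [INFRA + [5], INFRA + [5, 8], INFRA + [9]]                       # 8-10 auth
    + [INFRA + [9], INFRA + [9], INFRA + [9, 11], INFRA + [9, 11]]     # 11-14 core CRUD
    + [INFRA + [12], INFRA + [12]]                                     # 15-16 advanced
)
print(count_cycles(EXAMPLE, agents=3))  # 8
print(count_cycles(EXAMPLE, agents=1))  # 17
```

Under this model the register and login steps form the narrow part of the graph: nothing else can start until both have passed, so those two cycles run a single agent each.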

---

## MANDATORY INFRASTRUCTURE FEATURES (Indices 0-4)

**CRITICAL:** Create these FIRST, before any functional features. These features ensure the application uses a real database, not mock data or in-memory storage.

| Index | Name | Test Steps |
|-------|------|------------|
| 0 | Database connection established | Start server → check logs for DB connection → health endpoint returns DB status |
| 1 | Database schema applied correctly | Connect to DB directly → list tables → verify schema matches spec |
| 2 | Data persists across server restart | Create via API → STOP server completely → START server → query API → data still exists |
| 3 | No mock data patterns in codebase | Run grep for prohibited patterns → must return empty |
| 4 | Backend API queries real database | Check server logs → SQL/DB queries appear for API calls |

**ALL other features MUST depend on indices [0, 1, 2, 3, 4].**

### Infrastructure Feature Descriptions

**Feature 0 - Database connection established:**
```text
Steps:
1. Start the development server
2. Check server logs for database connection message
3. Call health endpoint (e.g., GET /api/health)
4. Verify response includes database status: connected
```

**Feature 1 - Database schema applied correctly:**
```text
Steps:
1. Connect to database directly (sqlite3, psql, etc.)
2. List all tables in the database
3. Verify tables match what's defined in app_spec.txt
4. Verify key columns exist on each table
```
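
For a SQLite database, the schema check in Feature 1 can be scripted along these lines. This is a sketch only: `missing_schema` and the expected table layout are invented for illustration, and other databases would need their own catalog queries.

```python
import sqlite3


def missing_schema(conn: sqlite3.Connection, expected: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return expected columns that are absent; a missing table maps to all its columns."""
    tables = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    missing = {}
    for table, columns in expected.items():
        if table not in tables:
            missing[table] = set(columns)
            continue
        # pragma_table_info is SQLite's table-valued form of PRAGMA table_info.
        present = {
            row[0]
            for row in conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
        }
        if columns - present:
            missing[table] = columns - present
    return missing


# Demo against an in-memory database with a deliberately incomplete schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
expected = {"items": {"id", "name", "created_at"}, "users": {"id", "email"}}
print(missing_schema(conn, expected))
```

An empty result means every expected table and key column exists, which corresponds to steps 3 and 4 above.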

**Feature 2 - Data persists across server restart (CRITICAL):**
```text
Steps:
1. Create unique test data via API (e.g., POST /api/items with name "RESTART_TEST_12345")
2. Verify data appears in API response (GET /api/items)
3. STOP the server completely (kill by port to avoid killing unrelated Node processes):
- Unix/macOS: lsof -ti :$PORT | xargs kill -9 2>/dev/null || true && sleep 5
- Windows: FOR /F "tokens=5" %a IN ('netstat -aon ^| find ":$PORT"') DO taskkill /F /PID %a 2>nul
- Note: Replace $PORT with actual port (e.g., 3000)
4. Verify server is stopped: lsof -ti :$PORT returns nothing (or netstat on Windows)
5. RESTART the server: ./init.sh & sleep 15
6. Query API again: GET /api/items
7. Verify "RESTART_TEST_12345" still exists
8. If data is GONE → CRITICAL FAILURE (in-memory storage detected)
9. Clean up test data
```

**Feature 3 - No mock data patterns in codebase:**
```text
Steps:
1. Run: grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" src/
2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" src/
3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" --include="*.js" src/
4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" src/
5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" src/
6. Run: grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/ 2>/dev/null
7. Run: grep -E "json-server|miragejs|msw" package.json
8. ALL grep commands must return empty (exit code 1)
9. If any returns results → investigate and fix before passing
```
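
Step 7 greps `package.json` as text, which also matches script names or descriptions that happen to contain those strings. A stricter variant, sketched below under the assumption that only declared dependencies matter, parses the manifest and compares exact package names; `mock_backend_dependencies` is a hypothetical helper.

```python
import json

MOCK_BACKENDS = ("json-server", "miragejs", "msw")


def mock_backend_dependencies(package_json_text: str) -> list[str]:
    """List mock-backend packages declared in dependencies or devDependencies."""
    manifest = json.loads(package_json_text)
    declared = set()
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return sorted(name for name in declared if name in MOCK_BACKENDS)


# Illustrative manifest; the version numbers are placeholders.
example = '{"dependencies": {"express": "^4.18.0"}, "devDependencies": {"msw": "^2.0.0"}}'
print(mock_backend_dependencies(example))  # ['msw']
```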

**Feature 4 - Backend API queries real database:**
```text
Steps:
1. Start server with verbose logging
2. Make API call (e.g., GET /api/items)
3. Check server logs
4. Verify SQL query appears (SELECT, INSERT, etc.) or ORM query log
5. If no DB queries in logs → implementation is using mock data
```
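
Step 4 can be partly automated by filtering captured log lines for SQL statements. The sketch below assumes the server logs raw SQL in upper case; ORM-specific log formats would need their own patterns, and both the regular expression and the sample log lines are illustrative.

```python
import re

# Upper-case SQL statement starts; "DELETE /api/..." request lines do not match.
SQL_STATEMENT = re.compile(r"\b(SELECT\s|INSERT\s+INTO\s|UPDATE\s+\S+\s+SET\s|DELETE\s+FROM\s)")


def database_query_lines(log_lines: list[str]) -> list[str]:
    """Keep only the log lines that look like SQL statements."""
    return [line for line in log_lines if SQL_STATEMENT.search(line)]


# Made-up log excerpt for illustration.
logs = [
    "GET /api/items 200 12ms",
    "query: SELECT id, name FROM items ORDER BY id",
    "DELETE /api/items/7 204 3ms",
]
print(database_query_lines(logs))  # ['query: SELECT id, name FROM items ORDER BY id']
```

If an API call produces no matching line, that is the signal described in step 5 to look for mock data.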

---

@@ -115,8 +199,9 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou

### Category Distribution by Complexity Tier

| Category | Simple | Medium | Complex |
| Category | Simple | Medium | Advanced |
| -------------------------------- | ------- | ------- | -------- |
| **0. Infrastructure (REQUIRED)** | 5 | 5 | 5 |
| A. Security & Access Control | 5 | 20 | 40 |
| B. Navigation Integrity | 15 | 25 | 40 |
| C. Real Data Verification | 20 | 30 | 50 |
@@ -137,12 +222,14 @@ The feature_list.json **MUST** include tests from ALL 20 categories. Minimum cou
| R. Concurrency & Race Conditions | 5 | 8 | 15 |
| S. Export/Import | 5 | 6 | 10 |
| T. Performance | 5 | 5 | 10 |
| **TOTAL** | **150** | **250** | **400+** |
| **TOTAL** | **165** | **265** | **405+** |

---

### Category Descriptions

**0. Infrastructure (REQUIRED - Priority 0)** - Database connectivity, schema existence, data persistence across server restart, absence of mock patterns. These features MUST pass before any functional features can begin. All tiers require exactly 5 infrastructure features (indices 0-4).

**A. Security & Access Control** - Test unauthorized access blocking, permission enforcement, session management, role-based access, and data isolation between users.

**B. Navigation Integrity** - Test all buttons, links, menus, breadcrumbs, deep links, back button behavior, 404 handling, and post-login/logout redirects.
@@ -205,6 +292,16 @@ The feature_list.json must include tests that **actively verify real data** and
- `setTimeout` simulating API delays with static data
- Static returns instead of database queries

**Additional prohibited patterns (in-memory stores):**

- `globalThis.` (in-memory storage pattern)
- `dev-store`, `devStore`, `DevStore` (development stores)
- `json-server`, `mirage`, `msw` (mock backends)
- `Map()` or `Set()` used as primary data store
- Environment checks like `if (process.env.NODE_ENV === 'development')` for data routing

**Why this matters:** In-memory stores (like `globalThis.devStore`) will pass simple tests because data persists during a single server run. But data is LOST on server restart, which is unacceptable for production. The Infrastructure features (0-4) specifically test for this by requiring data to survive a full server restart.

---

**CRITICAL INSTRUCTION:**