Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 65 additions & 2 deletions .claude/agents/pr-review-frontend.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,8 +64,71 @@ Do not re-explain rules that are documented in skills. Focus findings on specifi
1. **Review the PR context** — The diff, changed files, and PR metadata are available via your loaded `pr-context` skill
2. Read each file using Read tool
3. Evaluate against skill standards
4. Create Finding objects per `pr-review-output-contract` schema
5. Return raw JSON array (no prose, no code fences)
4. **Check for framework compiler features** (see section below)
5. Create Finding objects per `pr-review-output-contract` schema
6. Return raw JSON array (no prose, no code fences)

## Framework Compiler and Emerging Feature Awareness

Before flagging directives or unusual patterns as invalid, verify they aren't valid framework compiler features.

**Common false positives:**
- `'use memo'` - Valid React Compiler directive (not in React core)
- `'use cache'` - Valid Next.js 15+ directive for cache optimization
- Unusual component patterns that match new framework features

**Validation process:**
1. **Check package.json for compiler plugins:**
- `babel-plugin-react-compiler` → Enables React Compiler directives
- Check React version → React 19+ has Server Components
- Check Next.js version → Next.js 15+ has new directives

2. **Known directives by framework:**

**React (standard):**
- `'use client'` - Mark component as Client Component (React 18+)
- `'use server'` - Mark function for Server Actions (React 19+)
- `'use strict'` - JavaScript strict mode

**React Compiler (requires babel-plugin-react-compiler):**
- `'use memo'` - Compiler optimization hint
- Reference: https://react.dev/learn/react-compiler#annotations

**Next.js:**
- `'use client'` - Client Component boundary
- `'use server'` - Server Actions
- `'use cache'` - Cache function results (Next.js 15+)

3. **Before flagging as invalid:**
```bash
# Check for React Compiler
grep -q "babel-plugin-react-compiler" package.json

# Check framework versions
grep "\"react\":" package.json
grep "\"next\":" package.json
```

4. **When uncertain:**
- Note the unusual pattern with lower confidence
- Suggest verification rather than asserting it's invalid
- Reference framework docs if available

**Example: Handling 'use memo'**
```typescript
// DON'T flag as invalid if babel-plugin-react-compiler is present
function MyComponent() {
'use memo';
// ... component code
}

// Before flagging, check:
// 1. Is babel-plugin-react-compiler in package.json?
// 2. If yes, this is valid - no finding needed
// 3. If no, flag as potentially invalid with MEDIUM confidence
```

*Source: PR #1699 learning - bot incorrectly flagged valid React Compiler directive*

# Tool Policy

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,320 @@
# PR #1699 Learnings: Custom Libraries and Framework Features

**Source:** [PR #1699 - Notify users to enter custom headers if required in chat](https://github.com/inkeep/agents/pull/1699)

**Analysis Date:** 2026-02-06

**Patterns Identified:** 2 HIGH-generalizability patterns for automated PR review

---

## Executive Summary

Human reviewers in PR #1699 caught two classes of issues that automated bots completely missed:

1. **Custom implementation incompleteness**: A custom `jsonSchemaToZod` function was missing critical functionality (handling the `required` field), leading to incorrect validation behavior
2. **Framework compiler feature false positives**: Bot flagged valid React Compiler directive (`'use memo'`) as invalid

These learnings led to improvements in two PR review agents:
- `pr-review-standards`: Added detection for custom implementations vs standard libraries
- `pr-review-frontend`: Added framework compiler feature validation

---

## Pattern 1: Custom Implementations vs Standard Libraries

### What Humans Caught

**Sarah's insight:**
> "I think we should only show the error, block submission if the field(s) that are required in the headers are missing, but I think the chat still works if the non required headers are not set."

Sarah recognized that validation should distinguish between required and optional fields.

**Dima's root cause identification:**
> "yep, our custom `jsonSchemaToZod` incorrectly convert non required json schema fields as required zod schema fields"

Dima identified that the custom conversion function violated JSON Schema specification semantics by ignoring the `required` array.

### Why Bots Missed It

The custom `jsonSchemaToZod` function had:
- ✅ No syntax errors
- ✅ No type errors
- ✅ Logical flow that appeared correct
- ❌ **Missing implementation of the `required` field semantic**

Bots excel at syntactic correctness but don't validate compliance with domain specifications (JSON Schema, OpenAPI, GraphQL, etc.).

### The Solution

The PR switched from custom implementation to standard library:
```typescript
// BEFORE: Custom implementation (incomplete)
function jsonSchemaToZod(schema) {
if (schema.type === 'object' && schema.properties) {
const shape = {};
for (const [key, prop] of Object.entries(schema.properties)) {
shape[key] = jsonSchemaToZod(prop); // All fields become required!
}
return z.object(shape);
}
// ... other types
}

// AFTER: Standard library (complete)
import { convertJsonSchemaToZod } from 'zod-from-json-schema';
// or
import { z } from 'zod';
const schema = z.fromJSONSchema(jsonSchema);
```

Standard libraries handle:
- ✅ `required` vs optional field semantics
- ✅ `additionalProperties` handling
- ✅ Nested schema references (`$ref`)
- ✅ Format validators (email, uri, date-time)
- ✅ `enum`, `const`, `oneOf`, `allOf`, `anyOf`
- ✅ Numeric constraints (min, max, multipleOf)
- ✅ String patterns and length constraints

### Generalizability

**Applies to:**
- Schema conversion: JSON Schema ↔ Zod, OpenAPI ↔ types, GraphQL ↔ TypeScript
- Standard algorithms: JWT parsing, date formatting, URL parsing, base64 encoding
- Protocol implementations: OAuth flows, SAML, OpenID Connect
- Data structure operations: deep cloning, object merging, array utilities

**Detection heuristics:**
1. Function names matching common operations: `parse`, `convert`, `transform`, `validate`, `serialize`
2. Implementations of well-known standards (JSON Schema, JWT, OAuth)
3. TODO/FIXME comments mentioning library alternatives
4. Importing types but not implementation functions from libraries

**Review questions:**
- Does a well-maintained library exist for this functionality?
- Does the custom implementation handle all edge cases?
- Is there a documented reason for the custom approach?

### Example Finding Template

```markdown
**Issue:** Custom implementation of [STANDARD_FUNCTIONALITY]

This function implements [DESCRIBE_FUNCTIONALITY]. Consider using:
- [LIBRARY_NAME_1] ([npm link])
- [LIBRARY_NAME_2] ([npm link])

Custom implementations often miss edge cases like:
- [EDGE_CASE_1]
- [EDGE_CASE_2]
- [EDGE_CASE_3]

Unless there's a specific requirement (performance, bundle size, documented in comments),
prefer battle-tested libraries.

**Severity:** MAJOR (if missing critical functionality), MINOR (if complete but maintenance burden)
**Confidence:** HIGH (if library clearly exists), MEDIUM (if trade-offs are unclear)
```

---

## Pattern 2: Framework Compiler Features

### What Humans Caught

**Bot flagged:**
```typescript
function ChatWidget() {
'use memo'; // ❌ Bot: "Invalid 'use memo' directive"
// ...
}
```

**Human corrected:**
Dima provided link to React Compiler documentation showing `'use memo'` is a valid directive when `babel-plugin-react-compiler` is installed.

### Why Bots Missed It

Bots' training data doesn't include:
- Recent framework features (React Compiler, Next.js 15 directives)
- Compiler plugin-specific syntax
- Framework-specific pragmas and annotations

The bot saw an unusual string statement and flagged it as invalid without checking:
1. Is there a compiler plugin that enables this syntax?
2. Does framework documentation support this?

### The Solution

**Before flagging directives as invalid, check:**

1. **package.json for compiler plugins:**
```bash
grep -q "babel-plugin-react-compiler" package.json
```

2. **Framework versions:**
```bash
grep "\"react\":" package.json # React 19+ has Server Components
grep "\"next\":" package.json # Next.js 15+ has 'use cache'
```

3. **Known directives by framework:**

| Framework | Standard Directives | Compiler Directives | Plugin Required |
|-----------|-------------------|-------------------|-----------------|
| React | `'use client'`, `'use server'`, `'use strict'` | `'use memo'` | babel-plugin-react-compiler |
| Next.js | `'use client'`, `'use server'` | `'use cache'` | Next.js 15+ |
| JavaScript | `'use strict'` | N/A | N/A |

### Generalizability

**Applies to:**
- React Compiler directives (`'use memo'`)
- Next.js cache directives (`'use cache'`)
- Vue macros (`defineModel`, `defineProps` - Vue 3.3+)
- Svelte compiler hints
- Any framework-specific pragma or annotation

**Detection approach:**
1. Identify unusual patterns (standalone string statements, compiler hints)
2. Check package.json for framework compiler plugins
3. Cross-reference against known framework features
4. If uncertain, note with lower confidence rather than asserting invalid

**False positive prevention:**
```typescript
// Process:
// 1. See unusual directive → 'use memo'
// 2. Check package.json → babel-plugin-react-compiler present?
// 3. If YES → Valid, no finding needed
// 4. If NO → Flag with MEDIUM confidence, suggest verification
// 5. If UNCERTAIN → Note pattern with LOW confidence
```

---

## Patterns Evaluated but Not Implemented

### Pattern: Schema Validation Semantic Correctness

**Why valuable:** Catches bugs where validation logic violates specification semantics (JSON Schema `required` field, optional vs nullable, additionalProperties)

**Why not implemented:** Too complex for automated detection without deep spec knowledge. Requires:
- Understanding JSON Schema specification
- Recognizing semantic vs syntactic correctness
- Context about validation library behaviors
- High risk of false positives

**Status:** Valuable insight but requires human judgment. Consider for future AI improvements with spec-aware reasoning.

### Pattern: Validation Behavioral Correctness

**Why valuable:** Catches UX issues where validation blocking behavior doesn't match required/optional semantics

**Why not implemented:** Too context-dependent. Requires:
- Business logic understanding (when to block vs warn)
- User experience reasoning
- Understanding validation's impact on workflows
- Domain knowledge about acceptable partial states

**Status:** Best handled by human reviewers who understand product requirements.

---

## Implementation Changes

### pr-review-standards.md

**Added section:** "Custom Implementations vs Standard Libraries"

**Location:** After "Hard-Coded & Brute-Forced" section

**Content:**
- Detection heuristics for custom implementations
- Review questions to evaluate necessity
- Example finding template
- Reference to PR #1699 learning

### pr-review-frontend.md

**Added section:** "Framework Compiler and Emerging Feature Awareness"

**Location:** In "Workflow" section (step 4)

**Content:**
- Validation process for unusual directives
- Known directives by framework
- package.json checking procedures
- Example handling of `'use memo'`
- Reference to PR #1699 learning

---

## Success Metrics

These improvements should reduce false positives and increase valuable findings:

**Reduce false positives:**
- ✅ Don't flag valid React Compiler directives as invalid
- ✅ Don't flag valid Next.js 15+ directives as invalid
- ✅ Verify framework features before flagging as errors

**Increase valuable findings:**
- ✅ Flag custom implementations of standard functionality
- ✅ Suggest well-maintained libraries as alternatives
- ✅ Catch incomplete implementations missing edge cases

**Measurement approach:**
- Track false positive rate for directive-related findings
- Track acceptance rate for custom-implementation findings
- Compare bot findings to human reviewer insights in future PRs

---

## Related PRs and Context

**PR #1699 Details:**
- **Author:** dimaMachina
- **Merged:** 2026-02-05
- **Files Changed:** 45 files
- **Key Changes:**
- Replaced custom `jsonSchemaToZod` with `zod-from-json-schema` library
- Added custom headers validation in playground
- Fixed validation to respect required/optional field semantics

**Human Reviewers:**
- Sarah (@sarah-inkeep) - Caught validation behavior issue
- Miles (@miles-kt-inkeep) - Confirmed test placement issue
- Dima (@dimaMachina) - Identified root cause and solution

**Key Learning:**
The most valuable human insights were about **semantic correctness** (does this follow the JSON Schema spec?) and **library alternatives** (should we use a standard library?), not syntactic correctness. Bots excel at local code correctness but struggle with domain specification compliance and evaluating implementation choices.

---

## Future Directions

### Near-term (implementable now):
1. ✅ Maintain a database of common standards and their library implementations
2. ✅ Check package.json for framework compiler plugins before flagging unusual syntax
3. ✅ Flag TODO comments that mention library alternatives

### Medium-term (requires enhancement):
1. Build spec-aware validation for common standards (JSON Schema, OpenAPI)
2. Maintain framework feature database updated with each major version
3. Cross-reference function names against npm packages to suggest alternatives

### Long-term (requires AI reasoning improvements):
1. Semantic correctness validation against specifications
2. Behavioral correctness analysis (blocking vs warning, required vs optional)
3. Business context understanding for validation UX decisions

---

## Conclusion

PR #1699 revealed a critical blindspot in automated reviews: **bots focus on syntactic correctness but miss semantic correctness and implementation choices**. By adding checks for custom implementations and framework compiler features, we've addressed two high-generalizability patterns that will improve review quality across many PRs.

The key insight: **Human reviewers reason about specification compliance, library alternatives, and emerging framework features**. Automated reviews should validate these dimensions, not just syntax and types.
Loading