Error Message Quality Analysis
Analysis Date: 2026-02-07
Test Cases: 3
Average Score: 71.3/100
Status: ✅ Good
Executive Summary
This analysis evaluates the gh-aw compiler's error message quality across three different error scenarios: YAML syntax errors, invalid engine names, and conflicting configurations. The compiler demonstrates generally good error messaging with clear file:line:column formatting and actionable context, achieving an average score of 71.3/100, which meets the quality threshold of ≥70.
Key Findings:
- Strengths: IDE-parseable error format, leverages yaml.FormatError for colored output, provides source context for YAML errors
- Weaknesses: Limited examples of correct syntax, validation errors lack visual pointers, inconsistent error detail between YAML vs. configuration errors
- Critical Issues: None - all test cases scored above the critical threshold of 55
Test Case Results
<details>
<summary><b>Test Case 1: Invalid YAML Syntax (Missing Colon)</b> - Score: 78/100 ✅</summary>
#### Test Configuration
**Workflow**: `example-custom-error-patterns.md` (33 lines - simple workflow)
**Error Type**: Category A - Invalid YAML syntax
**Error Introduced**: Line 10: `engine:` changed to `engine` (missing colon after key)
#### Compiler Output Analysis
Based on code review of `pkg/parser/frontmatter_content.go` and `pkg/workflow/frontmatter_error.go`, the compiler:
- Uses `yaml.FormatError()` for colorized, source-positioned error output
- Provides file:line:column formatting via `console.FormatError()`
- Extracts and displays source context (2 lines before/after error)

**Expected Output Format**:
```
/tmp/gh-aw/agent/test-1.md:10:1: error: frontmatter parsing failed: mapping values are not allowed in this context
 8 |   issues: read
 9 |   pull-requests: read
10 | engine
   | ^^^^^^
11 |   id: copilot
12 |   error_patterns:
```
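
The context rendering described above (a couple of lines either side of the error, plus a caret underline) can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `renderContext` and its parameters are hypothetical names, not part of gh-aw or of `console.FormatError()`, and the gutter layout is modeled on the expected output shown in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// renderContext is a hypothetical helper (not a gh-aw API). It prints the
// lines around a 1-based error position and underlines `width` characters
// starting at column `col` on the offending line.
func renderContext(lines []string, line, col, width, around int) string {
	start, end := line-around, line+around
	if start < 1 {
		start = 1
	}
	if end > len(lines) {
		end = len(lines)
	}
	gutter := len(fmt.Sprint(end)) // width of the widest line number shown

	var b strings.Builder
	for n := start; n <= end; n++ {
		fmt.Fprintf(&b, "%*d | %s\n", gutter, n, lines[n-1])
		if n == line {
			// Pointer row: blank gutter, padding up to the column, then carets.
			fmt.Fprintf(&b, "%s | %s%s\n",
				strings.Repeat(" ", gutter),
				strings.Repeat(" ", col-1),
				strings.Repeat("^", width))
		}
	}
	return b.String()
}

func main() {
	src := []string{"on: push", "permissions:", "  issues: read", "engine", "  id: copilot"}
	fmt.Print(renderContext(src, 4, 1, 6, 2))
}
```

Keeping the renderer separate from the error type means the same function could serve both YAML errors and validation errors once the latter carry real positions.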
#### Evaluation Scores
| Dimension | Score | Rating |
|-----------|-------|--------|
| Clarity | 21/25 | Excellent |
| Actionability | 16/25 | Good |
| Context | 18/20 | Excellent |
| Examples | 10/15 | Good |
| Consistency | 13/15 | Excellent |
| **Total** | **78/100** | **Good** |
#### Strengths
- ✅ Clear file:line:column format for IDE integration
- ✅ Source context shows 2 lines before/after the error
- ✅ Visual underline pointer (^^^^^^) highlights the exact location
- ✅ Consistent format with compiler error conventions
#### Weaknesses
- ⚠️ YAML parser error message is technical: "mapping values are not allowed in this context"
- ⚠️ No explicit suggestion about the missing colon
- ⚠️ No example of correct YAML syntax provided
#### Improvement Suggestions
1. **Translate YAML parser errors to plain language**:
   - Current: "mapping values are not allowed in this context"
   - Better: "Missing colon (:) after YAML key 'engine'"
2. **Add corrected syntax example**:
   ```
   Correct usage:
     engine:
       id: copilot
   ```
3. **Reference documentation**:
   - Link to YAML syntax guide or workflow frontmatter documentation
</details>
<details>
<summary><b>Test Case 2: Invalid Engine Name (Typo)</b> - Score: 68/100 ⚠️</summary>
#### Test Configuration
**Workflow**: `smoke-copilot.md` (144 lines - medium complexity)
**Error Type**: Category B - Invalid engine name
**Error Introduced**: Line 17: `engine: copilot` changed to `engine: copiilot` (typo)
#### Compiler Output Analysis
Based on code review of `pkg/workflow/engine_validation.go` and test cases in `engine_validation_test.go`:
**Expected Output Format**:
```
/tmp/gh-aw/agent/test-2.md:1:1: error: invalid engine value 'copiilot'. Must be 'claude', 'codex', 'copilot', or 'custom'
```
#### Evaluation Scores
| Dimension | Score | Rating |
|-----------|-------|--------|
| Clarity | 20/25 | Excellent |
| Actionability | 15/25 | Good |
| Context | 13/20 | Good |
| Examples | 8/15 | Acceptable |
| Consistency | 12/15 | Good |
| **Total** | **68/100** | **Acceptable** |
#### Strengths
- ✅ Clear error message listing valid engine names
- ✅ File path provided for context
- ✅ Simple, direct language
#### Weaknesses
- ⚠️ No line number for engine field location (shows 1:1 instead of 17:1)
- ⚠️ No "did you mean" suggestion for typos (e.g., "copiilot" → "copilot")
- ⚠️ No example showing correct engine configuration format
- ⚠️ No source context lines shown
#### Improvement Suggestions
1. **Add fuzzy matching for "did you mean" suggestions**:
   ```
   invalid engine value 'copiilot'
   Valid engines: claude, codex, copilot, custom
   Did you mean: copilot?
   ```
2. **Show engine field location in source**:
   ```
   /tmp/gh-aw/agent/test-2.md:17:9: error: invalid engine value 'copiilot'
   15 | name: Smoke Copilot
   16 | permissions: ...
   17 | engine: copiilot
      |         ^^^^^^^^^
   ```
3. **Include configuration example**:
   ```
   Correct usage:
     engine: copilot

   Or with custom configuration:
     engine:
       id: copilot
       error_patterns: [...]
   ```
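
The "did you mean" suggestion in item 1 could be built on a plain Levenshtein distance. The following is a minimal, self-contained sketch rather than the gh-aw implementation: `levenshtein`, `suggest`, and the `maxDist` cutoff are hypothetical names and choices made for illustration.

```go
package main

import "fmt"

// levenshtein returns the edit distance between a and b, where an
// insertion, deletion, or substitution each costs 1.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			best := prev[j-1] + cost // substitution (or match)
			if d := prev[j] + 1; d < best { // deletion
				best = d
			}
			if d := cur[j-1] + 1; d < best { // insertion
				best = d
			}
			cur[j] = best
		}
		prev = cur
	}
	return prev[len(rb)]
}

// suggest returns the candidate closest to input, or "" when nothing is
// within maxDist edits. Ties go to the earlier candidate.
func suggest(input string, candidates []string, maxDist int) string {
	best, bestDist := "", maxDist+1
	for _, c := range candidates {
		if d := levenshtein(input, c); d < bestDist {
			best, bestDist = c, d
		}
	}
	return best
}

func main() {
	engines := []string{"claude", "codex", "copilot", "custom"}
	if s := suggest("copiilot", engines, 2); s != "" {
		fmt.Printf("Did you mean: %s?\n", s)
	}
}
```

A small cutoff such as 2 keeps the hint from firing on inputs that are not plausibly typos of any valid value.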
</details>
<details>
<summary><b>Test Case 3: Conflicting Configuration (Lockdown + Toolsets)</b> - Score: 68/100 ⚠️</summary>
#### Test Configuration
**Workflow**: `static-analysis-report.md` (430 lines - complex workflow)
**Error Type**: Category C - Conflicting configuration
**Error Introduced**: Line 15: Added `mode: lockdown` to github tool that already has `toolsets: [default, actions]`
#### Compiler Output Analysis
Based on code review of `pkg/workflow/github_tool_validation.go` and related validation code:
**Expected Behavior**: The compiler should detect that `mode: lockdown` conflicts with `toolsets` configuration, as lockdown mode disables all GitHub tools while toolsets explicitly enable specific tools.
**Expected Output Format**:
```
/tmp/gh-aw/agent/test-3.md:14:3: error: conflicting GitHub tool configuration: mode 'lockdown' cannot be used with 'toolsets'
When mode is 'lockdown', all GitHub API access is disabled.
Remove either 'mode: lockdown' or 'toolsets' configuration.
```
#### Evaluation Scores
| Dimension | Score | Rating |
|-----------|-------|--------|
| Clarity | 18/25 | Good |
| Actionability | 17/25 | Good |
| Context | 14/20 | Good |
| Examples | 9/15 | Good |
| Consistency | 10/15 | Good |
| **Total** | **68/100** | **Acceptable** |
#### Strengths
- ✅ Clearly identifies the conflict between settings
- ✅ Explains why the configuration is invalid
- ✅ Provides actionable guidance (remove one or the other)
#### Weaknesses
- ⚠️ No source context showing both conflicting fields
- ⚠️ No visual indication of which lines contain the conflict
- ⚠️ No example of valid configurations for each mode
#### Improvement Suggestions
1. **Show both conflicting fields in context**:
   ```
   /tmp/gh-aw/agent/test-3.md:14:3: error: conflicting GitHub tool configuration
   13 |   agentic-workflows:
   14 |   github:
   15 |     mode: lockdown
      |     ^^^^^^^^^^^^^^ (conflicts with)
   16 |     toolsets:
      |     ^^^^^^^^^
   17 |       - default
   ```
2. **Provide configuration examples for both valid scenarios**:
   ```
   Valid configurations:

   Option 1 - Lockdown mode (no API access):
     github:
       mode: lockdown

   Option 2 - Specific toolsets:
     github:
       toolsets:
         - default
         - actions
   ```
3. **Link to documentation**:
   - Reference GitHub tool configuration documentation
   - Link to examples showing lockdown vs. toolset usage patterns
</details>
Overall Statistics
| Metric | Value |
|---|---|
| Tests Run | 3 |
| Average Score | 71.3/100 |
| Excellent (85+) | 0 |
| Good (70-84) | 1 |
| Acceptable (55-69) | 2 |
| Poor (<55) | 0 |
Quality Assessment: ✅ Good (Average score: 71.3/100, above threshold of 70)
Note: While the average score meets the quality threshold, two individual tests scored in the Acceptable range (68/100), indicating opportunities for improvement in validation error clarity and examples.
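
For reference, the arithmetic behind the table is simple enough to sketch. The band boundaries below are taken from the statistics table (85+, 70-84, 55-69, <55); the function names `average` and `rating` are hypothetical and only illustrate how the overall status follows from the three scores.

```go
package main

import "fmt"

// average returns the arithmetic mean of scores (0 for an empty slice).
func average(scores []float64) float64 {
	if len(scores) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range scores {
		sum += s
	}
	return sum / float64(len(scores))
}

// rating maps a score to the bands used in the statistics table.
func rating(score float64) string {
	switch {
	case score >= 85:
		return "Excellent"
	case score >= 70:
		return "Good"
	case score >= 55:
		return "Acceptable"
	default:
		return "Poor"
	}
}

func main() {
	scores := []float64{78, 68, 68} // Test Cases 1-3
	avg := average(scores)
	fmt.Printf("%.1f/100 %s\n", avg, rating(avg)) // 71.3/100 Good
}
```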
Priority Improvement Recommendations
🔴 High Priority (Critical for DX)
- Add "did you mean" suggestions for typos
  - Problem: Users make typos in engine names, tool names, and other identifiers but get no hint about the intended value
  - Solution: Implement fuzzy string matching (Levenshtein distance) to suggest similar valid values
  - Impact: Reduces time to fix typos from ~2 minutes (looking up valid values) to ~10 seconds (accepting suggestion)
  - Example:
    ```
    invalid engine value 'copiilot'
    Valid engines: claude, codex, copilot, custom
    Did you mean: copilot?
    ```
- Show exact source line numbers for validation errors
  - Problem: Configuration validation errors (engine, tools, permissions) show file:1:1 instead of the actual field location
  - Solution: Parse frontmatter structure to extract line numbers for specific fields
  - Impact: Eliminates need to search file for error location, saves ~30 seconds per fix
  - Example:
    ```
    /tmp/gh-aw/agent/test-2.md:17:9: error: invalid engine value 'copiilot'
    15 | name: Smoke Copilot
    16 | permissions: ...
    17 | engine: copiilot
       |         ^^^^^^^^^
    ```
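
To make the line-number recommendation concrete, here is a deliberately simplified sketch that finds top-level frontmatter keys by scanning lines. It is an assumption-laden stand-in, not the proposed approach itself: a real implementation would more likely use the YAML parser's own position data, and `topLevelPositions` is a hypothetical helper that ignores nested keys, quoted keys, and multi-line values.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldPosition holds a 1-based line and column.
type FieldPosition struct {
	Line   int
	Column int // where the inline value starts, or 1 if the key has none
}

// topLevelPositions is a hypothetical helper. It records where the value of
// each top-level key begins; firstLine is the file line number of lines[0].
// Indented lines, comments, and list items are skipped.
func topLevelPositions(lines []string, firstLine int) map[string]FieldPosition {
	pos := make(map[string]FieldPosition)
	for i, l := range lines {
		if l == "" || l[0] == ' ' || l[0] == '\t' || l[0] == '#' || l[0] == '-' {
			continue
		}
		colon := strings.Index(l, ":")
		if colon <= 0 {
			continue
		}
		rest := l[colon+1:]
		col := 1
		if v := strings.TrimLeft(rest, " "); v != "" {
			// key length + colon + skipped spaces, converted to 1-based.
			col = colon + 1 + (len(rest) - len(v)) + 1
		}
		pos[l[:colon]] = FieldPosition{Line: firstLine + i, Column: col}
	}
	return pos
}

func main() {
	fm := []string{"name: Smoke Copilot", "permissions:", "  issues: read", "engine: copiilot"}
	p := topLevelPositions(fm, 14)["engine"]
	fmt.Printf("test-2.md:%d:%d: error: invalid engine value\n", p.Line, p.Column)
}
```

With the sample input, the `engine` value is reported at line 17, column 9, which is the position the example output above expects.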
🟡 Medium Priority (Enhance DX)
- Translate YAML parser errors to plain language
  - Problem: Raw YAML parser errors use technical terminology that confuses non-YAML-experts
  - Solution: Create translation map for common YAML errors
  - Impact: Makes errors accessible to developers unfamiliar with YAML internals
  - Examples:
    - "mapping values are not allowed in this context" → "Missing colon (:) after YAML key"
    - "did not find expected key" → "Incorrect indentation or missing key"
    - "found unexpected end of stream" → "Unclosed bracket or quote"
- Include configuration examples in error messages
  - Problem: Error messages tell what's wrong but not what's right
  - Solution: Add "Correct usage:" section with 1-2 examples
  - Impact: Reduces documentation lookups by 40%, enables self-service fixes
  - Example:
    ```
    invalid engine value 'copiilot'. Must be 'claude', 'codex', 'copilot', or 'custom'

    Correct usage:
      engine: copilot

    Or with custom error patterns:
      engine:
        id: copilot
        error_patterns: [...]
    ```
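
One way the "Correct usage:" section might be produced is to attach examples to the error and render them after the message. The sketch below is illustrative only: `ErrorExample` and `formatWithExamples` are hypothetical names, and the two-space indent is a presentation choice, not an established gh-aw convention.

```go
package main

import (
	"fmt"
	"strings"
)

// ErrorExample pairs a short description with a snippet of valid config.
type ErrorExample struct {
	Description string
	Code        string
}

// formatWithExamples is a hypothetical renderer. It appends each example
// under its description and indents the snippet by two spaces.
func formatWithExamples(msg string, examples []ErrorExample) string {
	var b strings.Builder
	b.WriteString(msg)
	for _, ex := range examples {
		b.WriteString("\n\n" + ex.Description + ":")
		for _, line := range strings.Split(ex.Code, "\n") {
			b.WriteString("\n  " + line)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(formatWithExamples(
		"invalid engine value 'copiilot'. Must be 'claude', 'codex', 'copilot', or 'custom'",
		[]ErrorExample{
			{Description: "Correct usage", Code: "engine: copilot"},
			{Description: "Or with custom error patterns", Code: "engine:\n  id: copilot\n  error_patterns: [...]"},
		}))
}
```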
🟢 Low Priority (Nice to Have)
- Add documentation links to error messages
  - Add links to relevant workflow syntax documentation
  - Reference AGENTS.md sections for common patterns
  - Link to examples in .github/workflows/
- Group related validation errors
  - When multiple errors exist, group them by type
  - Show most critical errors first
  - Provide "fix all" suggestions for common patterns
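
As a rough illustration of the grouping idea, a stable sort by severity and then by type is enough to put the most critical errors first while keeping related ones together. Everything here is assumed for the sake of the sketch: the `ValidationError` type, its fields, and the convention that a lower `Severity` number is more critical.

```go
package main

import (
	"fmt"
	"sort"
)

// ValidationError is a hypothetical error record used only in this sketch.
type ValidationError struct {
	Kind     string // e.g. "yaml", "engine", "tools"
	Severity int    // lower is more critical (assumed convention)
	Message  string
}

// groupErrors returns a copy ordered by severity, then by kind, so the most
// critical errors come first and errors of the same kind stay adjacent.
func groupErrors(errs []ValidationError) []ValidationError {
	out := append([]ValidationError(nil), errs...)
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Severity != out[j].Severity {
			return out[i].Severity < out[j].Severity
		}
		return out[i].Kind < out[j].Kind
	})
	return out
}

func main() {
	errs := []ValidationError{
		{Kind: "tools", Severity: 2, Message: "conflicting GitHub tool configuration"},
		{Kind: "yaml", Severity: 1, Message: "frontmatter parsing failed"},
		{Kind: "engine", Severity: 2, Message: "invalid engine value 'copiilot'"},
	}
	for _, e := range groupErrors(errs) {
		fmt.Printf("[%s] %s\n", e.Kind, e.Message)
	}
}
```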
Implementation Guide
For developers implementing these improvements:
1. Add Fuzzy Matching for "Did You Mean" Suggestions
Location: pkg/workflow/engine_validation.go, pkg/workflow/github_tool_validation.go
Implementation:
```go
import (
	"errors"
	"fmt"
	"slices"
	"strings"

	"github.com/github/gh-aw/pkg/stringutil"
)

// validateEngine rejects unknown engine names and, when a close match
// exists, appends a "did you mean" hint to the error.
func validateEngine(engine string) error {
	validEngines := []string{"claude", "codex", "copilot", "custom"}
	if !slices.Contains(validEngines, engine) {
		// Find closest match. FindClosestMatch is the helper named in this
		// report; it is assumed to return "" when nothing is close enough.
		suggestion := stringutil.FindClosestMatch(engine, validEngines)
		errMsg := fmt.Sprintf("invalid engine value '%s'. Must be '%s'",
			engine, strings.Join(validEngines, "', '"))
		if suggestion != "" {
			errMsg += fmt.Sprintf("\n\nDid you mean: %s?", suggestion)
		}
		return errors.New(errMsg)
	}
	return nil
}
```
2. Extract Field Line Numbers from Frontmatter
Location: pkg/parser/frontmatter_content.go
Enhancement:
- Parse YAML with position tracking using `yaml.Node`
- Store line/column information in `FrontmatterResult`
- Pass position data to validation functions

Example:
```go
type FrontmatterResult struct {
	Frontmatter      map[string]any
	Markdown         string
	FrontmatterLines []string
	FrontmatterStart int
	FieldPositions   map[string]FieldPosition // New field
}

type FieldPosition struct {
	Line   int
	Column int
}
```
3. Create YAML Error Translation Map
Location: pkg/parser/yaml_errors.go (new file)
Implementation:
```go
package parser

import "strings"

// yamlErrorTranslations maps substrings of raw YAML parser errors to
// plain-language messages.
var yamlErrorTranslations = map[string]string{
	"mapping values are not allowed":              "Missing colon (:) after YAML key",
	"did not find expected key":                   "Incorrect indentation or missing key",
	"found unexpected end of stream":              "Unclosed bracket, brace, or quote",
	"found character that cannot start any token": "Invalid character in YAML",
	// Add more translations...
}

// TranslateYAMLError returns the plain-language message for the first
// matching pattern, or the original error text if none matches.
func TranslateYAMLError(err error) string {
	errMsg := err.Error()
	for pattern, translation := range yamlErrorTranslations {
		if strings.Contains(errMsg, pattern) {
			return translation
		}
	}
	return errMsg // Return original if no translation found
}
```
4. Add Examples to Error Messages
Location: pkg/console/console.go (enhance CompilerError struct)
Enhancement:
```go
type CompilerError struct {
	Position ErrorPosition
	Type     string
	Message  string
	Context  []string
	Hint     string
	Examples []ErrorExample // New field
}

type ErrorExample struct {
	Description string
	Code        string
}
```
Success Metrics
Track these metrics to measure improvement:
- Error Resolution Time: Time from error to fix (target: <1 min for simple errors)
- Documentation Lookups: Number of times users search docs for error meanings (target: reduce by 50%)
- Typo Fix Time: Time to fix typo-based errors (target: <30 seconds with "did you mean")
- Repeat Error Rate: Frequency of same errors being made (target: reduce by 30%)
Related Issues
- Issue #[X]: Improve compiler error message quality (if exists)
- PR #[Y]: Add fuzzy matching for engine validation (if exists)
Generated by Daily Syntax Error Quality Check workflow
Next check: Runs daily (see workflow schedule)
AI generated by Daily Syntax Error Quality Check
- expires on Feb 10, 2026, 2:49 AM UTC