Closed as not planned
Description
Create comprehensive testing infrastructure for SDK workflows, including unit tests, integration tests, and validation tools.
Part of Epic
#10154 - Copilot SDK Integration for Advanced Agentic Workflows
Testing Requirements
1. Unit Tests
Test SDK engine compilation logic in isolation:
Test Areas:
- Frontmatter parsing for SDK configuration
- Session config generation
- Inline tool compilation
- Event handler generation
- Multi-agent configuration
- Backward compatibility with CLI mode
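As a rough illustration of the backward-compatibility test area, the sketch below resolves the effective engine mode from frontmatter. `EngineConfig` and `ResolveMode` are hypothetical names, and the rule that a missing `mode` falls back to CLI is an assumption, not confirmed behavior of the SDK engine.

```go
package main

import (
	"fmt"
	"strings"
)

// EngineConfig is a hypothetical, simplified view of the `engine` frontmatter block.
type EngineConfig struct {
	ID   string
	Mode string // "", "cli", or "sdk"
}

// ResolveMode returns the effective engine mode.
// Assumption: a workflow that omits `mode` keeps running in CLI mode,
// so existing workflows are unaffected by the SDK engine.
func ResolveMode(cfg EngineConfig) (string, error) {
	switch strings.ToLower(strings.TrimSpace(cfg.Mode)) {
	case "", "cli":
		return "cli", nil
	case "sdk":
		return "sdk", nil
	default:
		return "", fmt.Errorf("unknown engine mode %q", cfg.Mode)
	}
}

func main() {
	configs := []EngineConfig{
		{ID: "copilot"},              // legacy workflow without a mode
		{ID: "copilot", Mode: "sdk"}, // opted in to SDK mode
		{ID: "copilot", Mode: "rpc"}, // unsupported value
	}
	for _, cfg := range configs {
		mode, err := ResolveMode(cfg)
		fmt.Printf("mode=%q resolved=%q err=%v\n", cfg.Mode, mode, err)
	}
}
```

A table-driven unit test could feed the same three cases through `ResolveMode` and assert on the resolved mode and the error.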
Test Files:
pkg/workflow/copilot_sdk_engine_test.go
pkg/workflow/copilot_sdk_session_test.go
pkg/workflow/copilot_sdk_tools_test.go
pkg/workflow/copilot_sdk_events_test.go
2. Integration Tests
Test generated workflows in an actual GitHub Actions environment:
Test Workflows:
pkg/cli/workflows/
├── test-sdk-single-turn.md
├── test-sdk-multi-turn.md
├── test-sdk-custom-tools.md
├── test-sdk-event-handlers.md
├── test-sdk-multi-agent.md
└── test-sdk-migration.md
3. Comparison Tests
Compare SDK vs CLI for equivalent workflows:
Comparison Dimensions:
- Functional correctness
- Performance (latency, token usage)
- Cost (tokens, API calls)
- Reliability (success rate, error handling)
- Observability (logs, metrics)
4. Validation Tools
SDK Compatibility Checker:
gh aw check-sdk-compatibility workflow.md
# Output:
✅ Workflow is compatible with SDK mode
⚠️ Custom bash tools recommended for conversion to inline tools
💡 Consider enabling session persistence for multi-turn
SDK Migration Tool:
gh aw migrate-to-sdk workflow.md
# Output:
✅ Migrated workflow.md to SDK mode
📝 Created workflow.sdk.md
📊 Compatibility report: workflow-migration-report.md
SDK Validator:
gh aw validate-sdk workflow.md
# Validates:
- Session configuration
- Inline tool syntax
- Event handler registration
- Multi-agent setup
- Resource limits
Test Implementation
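A sketch of what the session and resource-limit checks of the SDK validator might look like. `SessionConfig` mirrors the fields used in the examples in this issue; `maxAllowedTurns` and the specific rules are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// SessionConfig mirrors the session fields used elsewhere in this issue.
type SessionConfig struct {
	Persistent bool
	Storage    string
	MaxTurns   int
}

// maxAllowedTurns is a hypothetical resource limit, not a documented value.
const maxAllowedTurns = 50

// ValidateSession collects every problem it finds instead of stopping at the first,
// so a single validator run can report all issues at once.
func ValidateSession(cfg SessionConfig) error {
	var errs []error
	if cfg.MaxTurns <= 0 {
		errs = append(errs, fmt.Errorf("max-turns must be positive, got %d", cfg.MaxTurns))
	}
	if cfg.MaxTurns > maxAllowedTurns {
		errs = append(errs, fmt.Errorf("max-turns %d exceeds limit %d", cfg.MaxTurns, maxAllowedTurns))
	}
	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

func main() {
	for _, cfg := range []SessionConfig{
		{Persistent: true, MaxTurns: 3},
		{MaxTurns: 0},
		{MaxTurns: 1000},
	} {
		fmt.Printf("%+v -> %v\n", cfg, ValidateSession(cfg))
	}
}
```

The same pattern would extend to inline tool syntax, event handler registration, and multi-agent setup, each contributing errors to the joined result.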
Unit Test Example
func TestCopilotSDKEngine_CompileSession(t *testing.T) {
	tests := []struct {
		name   string
		config SessionConfig
		want   string
		err    bool
	}{
		{
			name: "basic session config",
			config: SessionConfig{
				Persistent: true,
				Storage:    "artifacts",
				MaxTurns:   10,
			},
			want: "session-config.json",
			err:  false,
		},
		// More test cases...
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			engine := NewCopilotSDKEngine()
			got, err := engine.CompileSessionConfig(&tt.config)
			if (err != nil) != tt.err {
				t.Fatalf("CompileSessionConfig() error = %v, wantErr %v", err, tt.err)
			}
			// Assumes CompileSessionConfig returns the generated file name as a string.
			if got != tt.want {
				t.Errorf("CompileSessionConfig() = %q, want %q", got, tt.want)
			}
			// More assertions...
		})
	}
}
Integration Test Example
---
# test-sdk-multi-turn.md
engine:
  id: copilot
  mode: sdk
  session:
    persistent: true
    max-turns: 3
tools:
  github:
    allowed: [issue_read]
---
# Multi-Turn Test
Turn 1: What is the issue title?
Turn 2: Summarize the issue description.
Turn 3: Suggest next steps.
Test Validation:
func TestSDKMultiTurn(t *testing.T) {
	// Compile workflow
	workflow := compileWorkflow("test-sdk-multi-turn.md")
	// Run in test environment
	result := runWorkflow(workflow)
	// Validate
	assert.Equal(t, 3, result.Metrics.Turns)
	assert.True(t, result.SessionPersisted)
	assert.NotEmpty(t, result.Outputs)
}
Comparison Test Example
func TestSDKvsCLI_Performance(t *testing.T) {
	prompt := "Analyze code quality"
	// Run with CLI
	cliStart := time.Now()
	cliResult := runCLI(prompt)
	cliDuration := time.Since(cliStart)
	// Run with SDK
	sdkStart := time.Now()
	sdkResult := runSDK(prompt)
	sdkDuration := time.Since(sdkStart)
	// Compare results
	assert.Equal(t, cliResult.Output, sdkResult.Output)
	// Log performance metrics
	t.Logf("CLI: %v, SDK: %v", cliDuration, sdkDuration)
	t.Logf("CLI tokens: %d, SDK tokens: %d", cliResult.Tokens, sdkResult.Tokens)
}
Validation Tools Implementation
Compatibility Checker
type CompatibilityChecker struct {
	workflow *Workflow
}

func (c *CompatibilityChecker) Check() (*CompatibilityReport, error) {
	report := &CompatibilityReport{
		Compatible:  true,
		Warnings:    []string{},
		Suggestions: []string{},
	}
	// Check for CLI-specific features
	if c.workflow.HasBashTools() {
		report.Warnings = append(report.Warnings,
			"Custom bash tools could be converted to inline tools")
		report.Suggestions = append(report.Suggestions,
			"Consider using SDK inline tools for better integration")
	}
	// Check for multi-turn patterns
	if c.workflow.HasMultiTurnPattern() {
		report.Suggestions = append(report.Suggestions,
			"Enable session persistence for multi-turn conversations")
	}
	return report, nil
}
Migration Tool
func MigrateToSDK(workflowPath string) error {
	// Parse existing workflow
	workflow, err := parseWorkflow(workflowPath)
	if err != nil {
		return err
	}
	// Convert frontmatter
	workflow.Engine.Mode = "sdk"
	// Add session config if multi-turn detected
	if detectMultiTurn(workflow) {
		workflow.Engine.Session = &SessionConfig{
			Persistent: true,
			MaxTurns:   10,
		}
	}
	// Convert bash tools to inline tools
	for _, bashTool := range workflow.Tools.Bash {
		inlineTool := convertToInlineTool(bashTool)
		workflow.Tools.Inline = append(workflow.Tools.Inline, inlineTool)
	}
	// Write migrated workflow next to the original: workflow.md -> workflow.sdk.md
	// (requires the "strings" import)
	outPath := strings.TrimSuffix(workflowPath, ".md") + ".sdk.md"
	return writeWorkflow(workflow, outPath)
}
Implementation Tasks
- Create unit test suite for SDK engine
- Add integration tests for SDK workflows
- Implement comparison test framework
- Build compatibility checker tool
- Build migration tool
- Create SDK validator
- Add performance benchmarks
- Create test data generators
- Add CI/CD integration for SDK tests
- Document testing best practices
Success Criteria
- >80% code coverage for SDK engine
- All integration tests passing
- Comparison tests show SDK functional parity
- Validation tools working and documented
- Migration tool successfully converts example workflows
- Performance benchmarks established
- CI/CD running SDK tests automatically
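The coverage criterion could be enforced mechanically. The sketch below parses the `total:` line that `go tool cover -func` prints and compares it against the 80% target; the function names are hypothetical and the sample report is made-up data.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// TotalCoverage extracts the percentage from the "total:" line of
// `go tool cover -func=coverage.out` output.
func TotalCoverage(report string) (float64, error) {
	sc := bufio.NewScanner(strings.NewReader(report))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 3 && fields[0] == "total:" {
			pct := strings.TrimSuffix(fields[len(fields)-1], "%")
			return strconv.ParseFloat(pct, 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, errors.New("no total line found in coverage report")
}

// MeetsThreshold reports whether total coverage is strictly above threshold percent.
func MeetsThreshold(report string, threshold float64) (bool, error) {
	got, err := TotalCoverage(report)
	if err != nil {
		return false, err
	}
	return got > threshold, nil
}

func main() {
	// Sample report with invented paths and numbers, for illustration only.
	report := "example.com/pkg/workflow/copilot_sdk_engine.go:12:\tCompileSessionConfig\t85.7%\n" +
		"total:\t\t\t(statements)\t82.5%\n"
	ok, err := MeetsThreshold(report, 80)
	fmt.Println(ok, err)
}
```

In CI, the report text would come from running `go tool cover -func` on the profile written by `go test -coverprofile`.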
Performance Benchmarks
Track key metrics:
- Compilation time: CLI vs SDK
- Execution latency: single-turn vs multi-turn
- Token efficiency: context retention benefits
- Memory usage: session state overhead
- Cost: per-workflow execution costs
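For the compilation-time metric, Go's standard benchmark harness may be enough. This sketch uses `testing.Benchmark` so it runs as a plain program; `compileWorkflow` is a hypothetical stand-in for the real CLI and SDK compilers, so the numbers it prints say nothing about actual performance.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from discarding the benchmarked result.
var sink string

// compileWorkflow is a placeholder for the real compilation step (assumption).
func compileWorkflow(mode string, turns int) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "engine: copilot\nmode: %s\n", mode)
	for i := 1; i <= turns; i++ {
		fmt.Fprintf(&sb, "turn-%d\n", i)
	}
	return sb.String()
}

// BenchmarkCompileCLI measures the stand-in single-turn CLI compilation.
func BenchmarkCompileCLI(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = compileWorkflow("cli", 1)
	}
}

// BenchmarkCompileSDK measures the stand-in multi-turn SDK compilation.
func BenchmarkCompileSDK(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = compileWorkflow("sdk", 10)
	}
}

func main() {
	benchmarks := []struct {
		name string
		fn   func(*testing.B)
	}{
		{"CompileCLI", BenchmarkCompileCLI},
		{"CompileSDK", BenchmarkCompileSDK},
	}
	for _, bm := range benchmarks {
		res := testing.Benchmark(bm.fn)
		fmt.Printf("%-12s %s %s\n", bm.name, res.String(), res.MemString())
	}
}
```

Placed in a `_test.go` file, the two `Benchmark*` functions would also run under `go test -bench`, which is the more usual setup for a `make benchmark-sdk` target.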
CI/CD Integration
# .github/workflows/test-sdk.yml
name: Test SDK Engine
on: [push, pull_request]
jobs:
  test-sdk:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run SDK unit tests
        # Select SDK tests by name; a file glob would compile only the matched files
        run: go test -v -run SDK ./pkg/workflow/
      - name: Run SDK integration tests
        run: make test-sdk-integration
      - name: Run comparison tests
        run: make test-sdk-vs-cli
      - name: Performance benchmarks
        run: make benchmark-sdk
References
- Current CLI testing infrastructure
- SDK POC testing from #10155 (Design session state management for SDK workflows)
- Integration test patterns
Priority: High - Ensures quality
Estimated Effort: 7-10 days
Dependencies: #10159 (SDK engine implementation)
Skills Required: Go testing, GitHub Actions, test automation