
Bug Report Triage Service

A comprehensive Python service that uses LangChain, OpenAI, and Kafka to automatically read, triage, and create GitHub issues from bug reports. The service implements a multi-agent architecture where different agents handle specific aspects of the bug report processing workflow.

πŸ“š Documentation

Comprehensive documentation is available in the docs/ directory.

πŸ—οΈ Architecture Overview

The AI Pipeline uses a multi-agent architecture with event-driven processing through Apache Kafka:

graph LR
    BUG[Bug Report] --> TRIAGE[Triage Agent]
    TRIAGE --> TICKET[Ticket Creation Agent] 
    TICKET --> GITHUB[GitHub API Agent]
    GITHUB --> ISSUE[GitHub Issue]
    
    COORD[Coordinator Agent] --> TRIAGE
    COORD --> TICKET
    COORD --> GITHUB
    
    style BUG fill:#e8f5e8
    style ISSUE fill:#e1f5fe
    style COORD fill:#f3e5f5

Key Components:

  • Coordinator Agent: Workflow orchestration and monitoring
  • Triage Agent: AI-powered bug analysis and categorization
  • Ticket Creation Agent: GitHub issue formatting and generation
  • GitHub API Agent: Issue creation and API integration
  • Apache Kafka: Reliable message passing and event streaming
  • Redis: State management and request tracking
  • OpenAI GPT-4: Intelligent analysis and content generation

For detailed architecture information, see πŸ“‹ Architecture Documentation.
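The event flow in the diagram above can be sketched as a plain in-process simulation (illustrative only: the real service passes these messages through Kafka, and the hard-coded results here stand in for the LLM and GitHub calls):

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Envelope passed between agents (illustrative only)."""
    request_id: str
    payload: dict


def triage_agent(msg: Message) -> Message:
    # The real agent calls the LLM; here we hard-code a priority.
    return Message(msg.request_id, {**msg.payload, "priority": "high"})


def ticket_creation_agent(msg: Message) -> Message:
    # Format a GitHub issue title from the triage result.
    title = f"[{msg.payload['priority'].upper()}] {msg.payload['title']}"
    return Message(msg.request_id, {**msg.payload, "issue_title": title})


def github_api_agent(msg: Message) -> Message:
    # Mocked issue creation: assign a fake issue number.
    return Message(msg.request_id, {**msg.payload, "issue_number": 1234})


def coordinator(msg: Message) -> Message:
    """Run the three stages in order, as the Kafka topics chain them."""
    for agent in (triage_agent, ticket_creation_agent, github_api_agent):
        msg = agent(msg)
    return msg
```

In the real pipeline the coordinator does not call the agents directly; each stage consumes from one topic and produces to the next, so stages can fail, retry, and scale independently.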

πŸš€ Features

  • βœ… Intelligent Triage: Uses OpenAI to analyze priority, severity, and categorization
  • βœ… Multi-Agent Architecture: Scalable, distributed processing
  • βœ… Kafka Messaging: Reliable inter-agent communication
  • βœ… State Management: Redis-based request tracking with timeouts
  • βœ… GitHub Integration: Automatic issue creation (with mock support)
  • βœ… Comprehensive Logging: Detailed logging and error handling
  • βœ… Status Monitoring: Real-time progress tracking
  • βœ… Graceful Shutdown: Proper cleanup and signal handling

πŸ“‹ Requirements

System Dependencies

  • Python 3.9+ (the CI pipeline tests 3.9 through 3.12)
  • Kafka cluster (default: localhost:9092)
  • Redis server (default: localhost:6379)

API Keys

  • OpenAI API key
  • GitHub API token (optional for production use)

πŸ› οΈ Installation

Option 1: Podman (Recommended)

  1. Clone the repository

    git clone <repository-url>
    cd bug-report-triage-service
  2. Configure environment

    cp .env.example .env
    # Edit .env with your OpenAI API key (required)
  3. Start with Podman

    # Make the script executable (Linux/Mac)
    chmod +x docker-start.sh
    
    # Start infrastructure only (Kafka, Redis, UIs)
    ./docker-start.sh infrastructure
    
    # OR start everything including the service
    ./docker-start.sh full
    
    # OR use interactive mode
    ./docker-start.sh
  4. Access web interfaces

Option 2: Manual Installation

  1. Clone the repository

    git clone <repository-url>
    cd bug-report-triage-service
  2. Install dependencies

    pip install -r requirements.txt
  3. Configure environment

    cp .env.example .env
    # Edit .env with your configuration
  4. Start external services

    # Start Kafka (example with Podman)
    podman run -d --name kafka -p 9092:9092 \
      -e KAFKA_ZOOKEEPER_CONNECT=localhost:2181 \
      -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
      confluentinc/cp-kafka:latest
    
    # Start Redis (example with Podman)
    podman run -d --name redis -p 6379:6379 redis:alpine

βš™οΈ Configuration

Environment Variables (.env)

# Required
OPENAI_API_KEY=your_openai_api_key_here

# Kafka Configuration
KAFKA_BOOTSTRAP_SERVERS=localhost:9092

# GitHub Configuration (for production)
GITHUB_API_TOKEN=your_github_token_here
GITHUB_REPO_OWNER=your_github_username
GITHUB_REPO_NAME=your_repo_name

# Redis Configuration
REDIS_URL=redis://localhost:6379

Topics Configuration

The service automatically creates and uses these Kafka topics:

  • bug-reports
  • triage-results
  • ticket-creation
  • status-updates
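Automatic topic creation could look roughly like this (a hypothetical helper, not the service's actual code; the `admin` object stands in for a Kafka admin client such as kafka-python's `KafkaAdminClient`):

```python
TOPICS = ["bug-reports", "triage-results", "ticket-creation", "status-updates"]


def ensure_topics(admin, topics=TOPICS, partitions=3, replication=1):
    """Create any topic that does not exist yet; return the ones created.

    `admin` only needs list_topics() and create_topic(name, partitions,
    replication) methods, so a real client or a test stub both work.
    """
    existing = set(admin.list_topics())
    created = []
    for name in topics:
        if name not in existing:
            admin.create_topic(name, partitions, replication)
            created.append(name)
    return created
```

Making creation idempotent like this lets every service instance run it safely at startup.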

πŸƒβ€β™‚οΈ Usage

Starting the Service

python bug_report_service.py

Submitting Bug Reports

from bug_report_service import BugReportTriageService
from models import BugReport

# Create service instance
service = BugReportTriageService()

# Create a bug report
bug_report = BugReport(
    id="BUG-001",
    title="Login page crashes on mobile devices",
    description="The login page consistently crashes when accessed from mobile browsers.",
    reporter="user@example.com",
    environment="Mobile browsers (iOS Safari, Android Chrome)",
    steps_to_reproduce="1. Open app on mobile\n2. Navigate to login\n3. App crashes",
    expected_behavior="Login should work normally",
    actual_behavior="Page crashes with JavaScript error"
)

# Submit for processing
request_id = service.submit_bug_report(bug_report)
print(f"Request ID: {request_id}")

Monitoring Progress

# Check specific request status
status = service.get_request_status(request_id)
print(f"Status: {status['status']}")
print(f"Current Step: {status['current_step']}")

# Get all active requests
active_requests = service.get_all_active_requests()
for request in active_requests:
    print(f"Request {request['request_id']}: {request['status']}")

Example Output

When a bug report is processed, you'll see output like:

2024-01-01 12:00:00 - TriageAgent - INFO - Starting bug triage for request abc-123
2024-01-01 12:00:05 - TriageAgent - INFO - Triage completed for bug BUG-001: high/major
2024-01-01 12:00:06 - TicketCreationAgent - INFO - Starting GitHub issue creation for request abc-123
2024-01-01 12:00:10 - GitHubAPIAgent - INFO - GitHub issue created successfully: #1234
2024-01-01 12:00:11 - CoordinatorAgent - INFO - Request abc-123 completed successfully

πŸ“Š Data Models

BugReport

{
    "id": "BUG-001",
    "title": "Login page crashes",
    "description": "Detailed description...",
    "reporter": "user@example.com",
    "environment": "iOS Safari 15.0",
    "steps_to_reproduce": "1. Step one\n2. Step two",
    "expected_behavior": "Should work normally",
    "actual_behavior": "Crashes with error",
    "attachments": ["file1.log", "screenshot.png"],
    "created_at": "2024-01-01T12:00:00Z",
    "metadata": {"key": "value"}
}

TriageResult

{
    "bug_report_id": "BUG-001",
    "priority": "high",           # low, medium, high, critical
    "severity": "major",          # minor, moderate, major, blocker  
    "category": "frontend",
    "labels": ["bug", "mobile", "crash"],
    "assignee_suggestion": "frontend-team",
    "estimated_effort": "medium", # small, medium, large, extra-large
    "triage_notes": "Critical mobile issue..."
}
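The enumerations in the comments above can be enforced with a small validated model (a sketch using stdlib dataclasses; the project's actual models may use Pydantic or another validation library):

```python
from dataclasses import dataclass, field

PRIORITIES = {"low", "medium", "high", "critical"}
SEVERITIES = {"minor", "moderate", "major", "blocker"}
EFFORTS = {"small", "medium", "large", "extra-large"}


@dataclass
class TriageResult:
    bug_report_id: str
    priority: str
    severity: str
    category: str
    labels: list = field(default_factory=list)
    assignee_suggestion: str = ""
    estimated_effort: str = "medium"
    triage_notes: str = ""

    def __post_init__(self):
        # Reject values outside the documented enumerations.
        if self.priority not in PRIORITIES:
            raise ValueError(f"invalid priority: {self.priority!r}")
        if self.severity not in SEVERITIES:
            raise ValueError(f"invalid severity: {self.severity!r}")
        if self.estimated_effort not in EFFORTS:
            raise ValueError(f"invalid effort: {self.estimated_effort!r}")
```

Validating at construction time catches malformed LLM output before it reaches the downstream agents.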

πŸ§ͺ Testing

The project includes comprehensive test coverage with unit tests, integration tests, and automated CI/CD pipelines.

Quick Test Commands

# Run all tests
python run_tests.py --all-tests

# Run only unit tests
python run_tests.py --unit

# Run with coverage
pytest tests/unit/ --cov=. --cov-report=html

# Run full CI pipeline locally
python run_tests.py --full

Test Structure

tests/
β”œβ”€β”€ conftest.py           # Test fixtures and configuration
β”œβ”€β”€ unit/                 # Unit tests
β”‚   β”œβ”€β”€ test_models.py   # Model validation tests
β”‚   └── test_bug_report_service.py  # Service logic tests
└── integration/         # Integration tests
    └── test_end_to_end.py  # End-to-end workflow tests

Test Categories

  • Unit Tests: Test individual components in isolation with mocked dependencies
  • Integration Tests: Test component interactions with real or containerized services
  • End-to-End Tests: Test complete workflows from bug report submission to GitHub issue creation
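A minimal example of the "mocked dependencies" style used by the unit tests (illustrative; the real tests live in tests/unit/, and the `llm` parameter here is a stand-in for the OpenAI client):

```python
from unittest.mock import Mock


def classify_priority(llm, description: str) -> str:
    """Tiny stand-in for TriageAgent logic: ask the LLM, normalize the answer."""
    answer = llm.complete(f"Priority for: {description}")
    return answer.strip().lower()


def test_classify_priority_uses_llm_and_normalizes():
    llm = Mock()
    llm.complete.return_value = "  HIGH\n"
    assert classify_priority(llm, "app crashes on login") == "high"
    llm.complete.assert_called_once()
```

Because the LLM is injected, the test runs instantly and deterministically with no API key or network access.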

Running Tests

Prerequisites

# Install test dependencies
pip install -r requirements-test.txt

Unit Tests

# Run unit tests with coverage
pytest tests/unit/ -v --cov=. --cov-report=html

# Run specific test file
pytest tests/unit/test_models.py -v

# Run specific test
pytest tests/unit/test_models.py::TestBugReport::test_bug_report_creation_valid -v

Integration Tests

# Run integration tests (requires running services)
pytest tests/integration/ -v -m "integration and not slow"

# Run with services (using docker-compose)
./docker-start.sh infrastructure
pytest tests/integration/ -v

Code Quality Checks

The project uses several tools for code quality:

  • Black: Code formatting
  • isort: Import sorting
  • Flake8: Linting
  • MyPy: Type checking
  • Safety: Vulnerability scanning
  • Bandit: Security linting

# Format code
python run_tests.py --format

# Run all quality checks
python run_tests.py --lint

# Run security checks
python run_tests.py --security

Test Runner Script

The run_tests.py script provides convenient commands:

# Show all options
python run_tests.py --help

# Install dependencies
python run_tests.py --install-deps

# Format code
python run_tests.py --format

# Run linting
python run_tests.py --lint

# Run unit tests
python run_tests.py --unit

# Run integration tests  
python run_tests.py --integration

# Run security checks
python run_tests.py --security

# Run complete CI pipeline locally
python run_tests.py --full

Demo and Manual Testing

Run the Demo

python example_usage.py

This will:

  1. Initialize all service components
  2. Show sample bug reports
  3. Simulate the complete workflow
  4. Display example triage results and GitHub issues

Manual Testing

  1. Start the service: python bug_report_service.py
  2. In another terminal, submit test bug reports using the example code
  3. Monitor the logs to see the workflow progression
  4. Check Redis for state persistence
  5. Verify Kafka topics receive messages

CI/CD Pipeline

The project includes a comprehensive GitHub Actions CI/CD pipeline:

Pipeline Jobs

  1. Lint and Format Check: Code formatting and linting validation
  2. Unit Tests: Run on Python 3.9-3.12 with coverage reporting
  3. Integration Tests: Full workflow testing with live services
  4. Podman Tests: Container compatibility testing
  5. Security Scan: Vulnerability and security analysis
  6. Container Build: Build and test container images

Pipeline Triggers

  • Push to main or develop branches
  • Pull requests to main or develop branches

Pipeline Features

  • Multi-Python Version Testing: Tests on Python 3.9, 3.10, 3.11, and 3.12
  • Service Dependencies: Automatic Kafka and Redis service setup
  • Podman Support: Tests container functionality with Podman
  • Coverage Reporting: Code coverage analysis and reporting
  • Security Scanning: Dependency vulnerability and code security checks
  • Artifact Collection: Test reports and security scan results
  • Caching: Dependency caching for faster builds

Status Badges

Add these badges to track CI status:

![CI Status](https://github.com/your-username/AI-Pipeline/workflows/CI%20Pipeline/badge.svg)
![Coverage](https://codecov.io/gh/your-username/AI-Pipeline/branch/main/graph/badge.svg)

πŸ”§ Development

Adding New Agents

  1. Create a new agent class inheriting from BaseAgent:
from typing import Any, Dict

from agents.base_agent import BaseAgent

class CustomAgent(BaseAgent):
    def __init__(self):
        super().__init__("CustomAgent")

    def get_system_prompt(self) -> str:
        return "Your agent's system prompt..."

    def process_message(self, topic: str, message: Dict[str, Any]) -> None:
        # Process incoming messages from the agent's Kafka topics
        pass
  2. Add the agent to bug_report_service.py
  3. Configure a Kafka consumer for the agent
  4. Update the workflow as needed
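Wiring a new agent in could look like this (a hypothetical topic-to-agent dispatch table; the actual registration code in bug_report_service.py may differ):

```python
def build_dispatch(agents):
    """Map each Kafka topic to the agent that consumes it."""
    return {
        "bug-reports": agents["triage"],
        "triage-results": agents["ticket"],
        "ticket-creation": agents["github"],
    }


def dispatch(table, topic, message):
    """Route one consumed message to its registered agent."""
    agent = table.get(topic)
    if agent is None:
        raise KeyError(f"no agent registered for topic {topic!r}")
    agent.process_message(topic, message)
```

Adding a Slack or email agent then amounts to one more entry in the table plus a consumer subscription for its topic.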

Extending Functionality

  • Add new triage criteria: Modify the TriageAgent system prompt
  • Custom GitHub issue format: Update TicketCreationAgent prompt
  • Additional integrations: Create new agents for Slack, email, etc.
  • Enhanced monitoring: Add metrics collection and dashboards

🚨 Error Handling

The service includes comprehensive error handling:

  • Agent failures: Errors are logged and status is updated
  • Kafka connectivity: Automatic retries and graceful degradation
  • LLM failures: Retry logic with exponential backoff
  • Timeouts: Configurable timeouts with automatic cleanup
  • State corruption: Redis fallback and recovery mechanisms
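The LLM retry behaviour described above can be sketched as a small decorator (illustrative; the real retry counts, delays, and exception types are defined by the service):

```python
import time
from functools import wraps


def with_retries(attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry a flaky call with exponential backoff: base, 2*base, 4*base, ..."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise  # out of retries: surface the error to the caller
                    sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator
```

Injecting `sleep` keeps the decorator unit-testable without real delays; a production version would typically also cap the maximum delay and retry only transient error types.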

πŸ“ˆ Scalability

The architecture supports horizontal scaling:

  • Agent scaling: Run multiple instances of each agent type
  • Kafka partitioning: Distribute load across partitions
  • Redis clustering: Scale state management
  • Load balancing: Use Kafka consumer groups for distribution
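The load-balancing idea above rests on keyed partitioning: messages keyed by request ID always land on the same partition, and each partition is consumed by exactly one member of a consumer group. A toy illustration of the keyed-partition part (Kafka's real default partitioner uses murmur2, not CRC32; this only shows the principle):

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a message key to a partition (toy partitioner)."""
    return zlib.crc32(key.encode()) % num_partitions


# All events for one request ID hash to the same partition, preserving their
# order, while different requests spread across partitions and thus across
# the consumers in a group.
```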

πŸ”’ Security Considerations

  • Store API keys securely (environment variables, secrets management)
  • Use GitHub fine-grained tokens with minimal required permissions
  • Implement input validation for bug reports
  • Consider message encryption for sensitive data
  • Regular security updates for dependencies

πŸ“ Logging

Logs are written to:

  • Console: Real-time monitoring during development
  • File: bug_report_service.log for persistent storage
  • Structured format: Timestamp, logger name, level, message

Log levels:

  • INFO: Normal operations, status updates
  • WARNING: Recoverable issues, timeouts
  • ERROR: Failures, exceptions
  • DEBUG: Detailed tracing (disabled by default)
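A sketch of the console-plus-file setup in the structured format described above (illustrative; the service's actual logging configuration may differ):

```python
import logging


def configure_logging(logfile="bug_report_service.log", level=logging.INFO):
    """Send records to both the console and a persistent log file."""
    fmt = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    root = logging.getLogger()
    root.setLevel(level)
    for handler in (logging.StreamHandler(), logging.FileHandler(logfile)):
        handler.setFormatter(fmt)
        root.addHandler(handler)
    return root
```

Each agent then logs through its own named logger (e.g. `logging.getLogger("TriageAgent")`), which yields the timestamp / logger name / level / message lines shown in the example output.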

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • LangChain: For LLM integration framework
  • OpenAI: For GPT-4 API
  • Apache Kafka: For reliable messaging
  • Redis: For state management
  • GitHub: For issue tracking integration

Quick Start Commands

# 1. Setup
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your keys

# 2. Start dependencies (Podman)
podman run -d --name kafka -p 9092:9092 \
  -e KAFKA_ZOOKEEPER_CONNECT=localhost:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  confluentinc/cp-kafka:latest
podman run -d -p 6379:6379 --name redis redis:alpine

# 3. Run demo
python example_usage.py

# 4. Start service
python bug_report_service.py

The service is now ready to intelligently triage your bug reports! πŸŽ‰

About

A sample Python and Kafka agentic AI processing pipeline.
