docs: Add agentic RAG architecture documentation #700
Add environment variables to support SPIFFE workload identity integration for AI agents and services. This enables cryptographic machine identity with configurable migration phases:
- SPIFFE_ENABLED: Toggle SPIFFE integration
- SPIFFE_AUTH_MODE: Migration phases (disabled→optional→preferred→required)
- SPIFFE_ENDPOINT_SOCKET: SPIRE Agent Workload API socket
- SPIFFE_TRUST_DOMAIN: Trust domain for identity hierarchy
- SPIFFE_LEGACY_JWT_WARNING: Track legacy auth usage during migration
- SPIFFE_SVID_TTL_SECONDS: Certificate lifetime configuration
- SPIFFE_JWT_AUDIENCES: Allowed JWT-SVID audiences
Related to: MCP Context Forge integration (PR #684)
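As a rough illustration of how the variables listed above could be read at startup, here is a minimal sketch. The `SpiffeSettings` class and `load_spiffe_settings` function are hypothetical names, not the project's `SPIFFEConfig`; the default socket path, the 3600-second TTL default, and the comma-separated audience format are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

# Migration phases named in the commit message, least to most strict.
AUTH_MODES = ("disabled", "optional", "preferred", "required")


@dataclass(frozen=True)
class SpiffeSettings:
    """Hypothetical settings holder; the project's SPIFFEConfig may differ."""

    enabled: bool
    auth_mode: str
    endpoint_socket: str
    trust_domain: str
    legacy_jwt_warning: bool
    svid_ttl_seconds: int
    jwt_audiences: Tuple[str, ...]


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_spiffe_settings(env: Mapping[str, str] = os.environ) -> SpiffeSettings:
    """Read the SPIFFE_* variables from a mapping (defaults are assumptions)."""
    auth_mode = env.get("SPIFFE_AUTH_MODE", "disabled").strip().lower()
    if auth_mode not in AUTH_MODES:
        raise ValueError(f"SPIFFE_AUTH_MODE must be one of {AUTH_MODES}")

    ttl = int(env.get("SPIFFE_SVID_TTL_SECONDS", "3600"))
    if ttl <= 0:
        raise ValueError("SPIFFE_SVID_TTL_SECONDS must be positive")

    # Assumed format: comma-separated list of audiences.
    raw_audiences = env.get("SPIFFE_JWT_AUDIENCES", "")
    audiences = tuple(a.strip() for a in raw_audiences.split(",") if a.strip())

    return SpiffeSettings(
        enabled=_as_bool(env.get("SPIFFE_ENABLED", "false")),
        auth_mode=auth_mode,
        endpoint_socket=env.get(
            "SPIFFE_ENDPOINT_SOCKET", "unix:///tmp/spire-agent/public/api.sock"
        ),
        trust_domain=env.get("SPIFFE_TRUST_DOMAIN", "rag-modulo.example.com"),
        legacy_jwt_warning=_as_bool(env.get("SPIFFE_LEGACY_JWT_WARNING", "true")),
        svid_ttl_seconds=ttl,
        jwt_audiences=audiences,
    )
```

Passing the environment as a mapping keeps the loader easy to test without mutating `os.environ`.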
This architecture document outlines how to integrate SPIRE (SPIFFE Runtime Environment) into RAG Modulo to provide cryptographic workload identities for AI agents. This enables zero-trust agent authentication and secure agent-to-agent (A2A) communication.
Key architectural decisions:
- JWT-SVIDs for stateless verification (vs X.509 for mTLS)
- Trust domain: spiffe://rag-modulo.example.com
- Integration with IBM MCP Context Forge (PR #684)
- Capability-based access control for agents
- 5-phase implementation plan
Agent types defined:
- search-enricher: MCP tool invocation
- cot-reasoning: Chain of Thought orchestration
- question-decomposer: Query decomposition
- source-attribution: Document source tracking
- entity-extraction: Named entity recognition
- answer-synthesis: Answer generation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit implements the SPIFFE/SPIRE integration for AI agent authentication as designed in docs/architecture/spire-integration-architecture.md.
Key changes:
- Add py-spiffe dependency for SPIFFE JWT-SVID support
- Create core SPIFFE authentication module (spiffe_auth.py) with:
  - SPIFFEConfig for environment-based configuration
  - AgentPrincipal dataclass for authenticated agent identity
  - SPIFFEAuthenticator for JWT-SVID validation
  - AgentType and AgentCapability enums
  - Helper functions for SPIFFE ID parsing and building
- Create Agent data model with SQLAlchemy:
  - Agent model with SPIFFE ID, type, capabilities, status
  - Relationships to User (owner) and Team
  - Status management (active, suspended, revoked)
- Add Agent repository, service, and router layers:
  - Full CRUD operations for agents
  - Agent registration with SPIFFE ID generation
  - Status and capability management
  - JWT-SVID validation endpoint
- Extend AuthenticationMiddleware to detect and validate SPIFFE JWT-SVIDs
- Add SPIRE deployment configuration templates:
  - server.conf, agent.conf for SPIRE configuration
  - docker-compose.spire.yml for local development
  - README.md with deployment instructions
- Add comprehensive unit tests for all SPIFFE components
Reference: PR #695
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
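The commit mentions helper functions for SPIFFE ID parsing and building. A minimal sketch of what such helpers might look like follows. The function names and the `/agent/<type>/<instance-id>` path layout are assumptions for illustration; only the `spiffe://<trust-domain>/<path>` shape comes from the SPIFFE ID format itself.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ParsedSpiffeId:
    trust_domain: str
    agent_type: str
    instance_id: str


def build_spiffe_id(trust_domain: str, agent_type: str, instance_id: str) -> str:
    """Build an ID using an assumed /agent/<type>/<instance-id> path layout."""
    return f"spiffe://{trust_domain}/agent/{agent_type}/{instance_id}"


def parse_spiffe_id(spiffe_id: str, expected_trust_domain: str) -> ParsedSpiffeId:
    """Parse an agent SPIFFE ID and reject any trust domain other than ours."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("not a SPIFFE ID: scheme must be 'spiffe'")
    if parts.netloc != expected_trust_domain:
        raise ValueError(f"unexpected trust domain: {parts.netloc!r}")

    segments = parts.path.strip("/").split("/")
    if len(segments) != 3 or segments[0] != "agent" or not all(segments):
        raise ValueError("path must look like /agent/<type>/<instance-id>")

    return ParsedSpiffeId(
        trust_domain=parts.netloc,
        agent_type=segments[1],
        instance_id=segments[2],
    )
```

Checking the trust domain inside the parser mirrors the later fix that restricts registration to the configured trust domain only.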
Critical fixes:
- Add database migration for agents table (migrations/add_agents_table.sql)
- Fix signature verification security: failed validation now always rejects (prevents fallback bypass attack)
- Fix timezone handling: use UTC consistently for JWT timestamps
Improvements:
- Align env vars with .env.example (SPIFFE_JWT_AUDIENCES, SPIFFE_SVID_TTL_SECONDS)
- Add capability enforcement decorator (require_capabilities)
- Add OpenAPI tags metadata for agents endpoint
- Update and expand unit tests (47 tests passing)
Addresses review comments from PR #695.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…served word
SQLAlchemy's Declarative API reserves the 'metadata' attribute name. Renamed the field to 'agent_metadata' in the model while keeping the database column name as 'metadata' via explicit column name mapping. This also updates the schema to use validation_alias for proper model_validate() from ORM objects.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The test_validate_jwt_svid_valid test was failing because AgentPrincipal requires a trust_domain field which was not being provided. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Critical fixes:
- Fix timezone-naive datetime to use UTC throughout (agent.py, agent_repository.py)
- Change default agent status from ACTIVE to PENDING for approval workflow
- Add RuntimeError when SPIFFE enabled but py-spiffe library missing
- Restrict trust domain to configured value only (security fix)
High priority security fixes:
- Add capability validation per agent type (ALLOWED_CAPABILITIES_BY_TYPE)
- Add authentication requirement to SPIFFE validation endpoint
- Reject user-specified trust domains that don't match server config
Code quality improvements:
- Add OpenAPI tags metadata for agent router documentation
- Fix require_capabilities decorator type hints (ParamSpec, TypeVar)
- Add composite database indexes (owner+status, type+status, team+status)
- Update migration script with new composite indexes
Test updates:
- Update test_register_agent_with_custom_trust_domain to verify rejection
- Fix test_authenticator_creates_principal_with_fallback to mock spiffe module
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
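The per-type capability validation listed above could look roughly like the following sketch. The enum members shown, the capability names, and the contents of the mapping are illustrative assumptions; only the `ALLOWED_CAPABILITIES_BY_TYPE`, `AgentType`, and `AgentCapability` names and the two agent type values come from the commit messages.

```python
from enum import Enum
from typing import Dict, FrozenSet, Iterable


class AgentType(str, Enum):
    SEARCH_ENRICHER = "search-enricher"
    COT_REASONING = "cot-reasoning"


class AgentCapability(str, Enum):
    # Capability names here are hypothetical.
    MCP_TOOL_INVOKE = "mcp:tool:invoke"
    SEARCH_READ = "search:read"
    COT_INVOKE = "cot:invoke"


# Assumed mapping: which capabilities each agent type may request.
ALLOWED_CAPABILITIES_BY_TYPE: Dict[AgentType, FrozenSet[AgentCapability]] = {
    AgentType.SEARCH_ENRICHER: frozenset(
        {AgentCapability.MCP_TOOL_INVOKE, AgentCapability.SEARCH_READ}
    ),
    AgentType.COT_REASONING: frozenset(
        {AgentCapability.SEARCH_READ, AgentCapability.COT_INVOKE}
    ),
}


def validate_capabilities(
    agent_type: AgentType, requested: Iterable[AgentCapability]
) -> FrozenSet[AgentCapability]:
    """Return the requested set, or raise if any capability is not allowed."""
    requested_set = frozenset(requested)
    allowed = ALLOWED_CAPABILITIES_BY_TYPE.get(agent_type, frozenset())
    disallowed = requested_set - allowed
    if disallowed:
        names = sorted(c.value for c in disallowed)
        raise ValueError(f"{agent_type.value} may not request: {names}")
    return requested_set
```

Rejecting the whole request, rather than silently dropping disallowed capabilities, keeps a registration from succeeding with fewer privileges than the caller expects.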
Add comprehensive architecture documentation for the Agentic RAG Platform:
- agentic-ui-architecture.md: React component hierarchy, state management, and API integration for agent features
- backend-architecture-diagram.md: Overall backend architecture with Mermaid diagrams showing service layers and data flow
- mcp-integration-architecture.md: MCP client/server integration strategy, PR comparison (#671 vs #684), and Context Forge integration
- rag-modulo-mcp-server-architecture.md: Exposing RAG capabilities as MCP server with tools (rag_search, rag_ingest, etc.) and resources
- search-agent-hooks-architecture.md: 3-stage agent pipeline (pre-search, post-search, response) with database schema and execution flow
- system-architecture.md: Complete system architecture overview with technology stack and data flows
These documents guide implementation of:
- PR #695 (SPIFFE/SPIRE agent identity)
- PR #671 (MCP Gateway client)
- Issue #697 (Agent execution hooks)
- Issue #698 (MCP Server)
- Issue #699 (Agentic UI)
Closes #696
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
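One way to picture the 3-stage agent pipeline (pre-search, post-search, response) described above is as a set of hooks that run around the search and generation steps. The sketch below is an illustration only; `AgentPipeline`, `SearchContext`, and the hook signature are hypothetical and are not taken from search-agent-hooks-architecture.md.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Stage(Enum):
    PRE_SEARCH = "pre_search"
    POST_SEARCH = "post_search"
    RESPONSE = "response"


@dataclass
class SearchContext:
    """State passed from hook to hook (hypothetical shape)."""

    query: str
    documents: List[str] = field(default_factory=list)
    answer: str = ""


Hook = Callable[[SearchContext], SearchContext]


class AgentPipeline:
    def __init__(self) -> None:
        self._hooks: Dict[Stage, List[Hook]] = {stage: [] for stage in Stage}

    def register(self, stage: Stage, hook: Hook) -> None:
        self._hooks[stage].append(hook)

    def _run_stage(self, stage: Stage, ctx: SearchContext) -> SearchContext:
        # Hooks within a stage run in registration order.
        for hook in self._hooks[stage]:
            ctx = hook(ctx)
        return ctx

    def run(
        self,
        query: str,
        search: Callable[[str], List[str]],
        generate: Callable[[str, List[str]], str],
    ) -> SearchContext:
        ctx = SearchContext(query=query)
        ctx = self._run_stage(Stage.PRE_SEARCH, ctx)   # e.g. query rewriting
        ctx.documents = search(ctx.query)
        ctx = self._run_stage(Stage.POST_SEARCH, ctx)  # e.g. filtering, reranking
        ctx.answer = generate(ctx.query, ctx.documents)
        ctx = self._run_stage(Stage.RESPONSE, ctx)     # e.g. attribution, formatting
        return ctx
```

The `search` and `generate` callables are injected so the hook ordering can be exercised without a real retriever or LLM.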
🚀 Development Environment Options
This repository supports Dev Containers for a consistent development environment.
Option 1: GitHub Codespaces (Recommended)
Create a cloud-based development environment:
Option 2: VS Code Dev Containers (Local)
Use Dev Containers on your local machine:
Option 3: Traditional Local Setup
Set up the development environment manually:
# Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout docs/agentic-architecture-696
# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validate
Available Commands
Once in your development environment:
make help # Show all available commands
make dev-validate # Validate environment setup
make test-atomic # Run atomic tests
make test-unit # Run unit tests
make lint # Run linting
Services Available
When running
This automated message helps reviewers quickly set up the development environment.
Pull Request Review: Agentic RAG Architecture Documentation & SPIFFE/SPIRE Integration
Overview
This PR introduces comprehensive architecture documentation and a complete SPIFFE/SPIRE identity integration for agent workload authentication. The implementation is ambitious, adding ~9,673 lines across 29 files including core infrastructure, database models, services, and extensive documentation.
✅ Strengths
1. Comprehensive Architecture Documentation
2. Security-First Design
3. Production-Ready Code Quality
4. Database Design Excellence
5. Clean Architecture
🔍 Issues & Recommendations
CRITICAL Issues
1. Missing Dependency: py-spiffe
Location:
Issue: The code imports
from spiffe import JwtSource, WorkloadApiClient # type: ignore[import-not-found]
Impact: Runtime ImportError when SPIFFE is enabled
Fix Required:
poetry add py-spiffe
poetry lock
Verification: Check that
2. Migration Script Missing psycopg2 Dependency Check
Location:
Issue: Script imports
import psycopg2
from dotenv import load_dotenv
Impact: Migration will fail if psycopg2 is not installed
Recommendation: Add error handling:
import sys

try:
    import psycopg2
except ImportError:
    print("ERROR: psycopg2 is required. Install with: pip install psycopg2-binary")
    sys.exit(1)
3. Security: Signature Validation Fallback
Location:
Issue: The fallback mode accepts tokens without signature validation in development:
if self.config.fallback_to_jwt:
    logger.warning(
        "SPIRE unavailable, accepting token without signature validation. "
        "This is ONLY safe in development environments."
    )
Concern: While the security note is present, this could be dangerous if accidentally enabled in production.
Recommendation: Add environment check:
if self.config.fallback_to_jwt:
    if os.getenv("ENVIRONMENT", "development") == "production":
        logger.error("SPIRE unavailable in production. Fallback disabled for security.")
        return None
    logger.warning("...")
HIGH Priority Issues
4. Authentication Middleware: Agent vs User Confusion
Location:
Issue: Agent authentication sets
agent_data = {...}
request.state.user = agent_data # For backward compatibility
Concern: This violates the principle of least surprise. Downstream code checking
Recommendation:
5. Race Condition in Agent Registration
Location:
Issue: Agent instance ID uses UUID prefix without checking uniqueness:
agent_instance_id = str(uuid.uuid4())[:8]
Concern: While collision probability is low, there's no database uniqueness check.
Recommendation: Either:
6. Missing Index on last_seen_at
Location:
Issue:
Use Case: Queries like "find inactive agents" will do full table scans.
Recommendation: Add:
CREATE INDEX IF NOT EXISTS idx_agents_last_seen_at ON agents(last_seen_at DESC)
WHERE last_seen_at IS NOT NULL;
7. SPIRE Docker Compose Not Integrated
Location:
Issue: This is a standalone compose file, not integrated with main
Impact: Developers won't know how to run SPIRE with local dev
Recommendation:
MEDIUM Priority Issues
8. Inconsistent Enum Definitions
Location: Multiple files
Issue:
Concern: Potential inconsistency and maintenance burden
Recommendation:
9. Missing Error Handling for JWT Decode
Location:
Issue: JWT decode in
try:
    unverified = jwt.decode(token, options={"verify_signature": False})
    ...
except Exception:
    return False
Concern: Masks all errors, including programming errors
Recommendation: Be specific:
except (jwt.DecodeError, jwt.InvalidTokenError):
    return False
except Exception as e:
    logger.error(f"Unexpected error checking SPIFFE JWT-SVID: {e}")
    return False
10. Repository Error Handling - Lost Context
Location:
Issue: Generic catch-all loses original exception context:
except Exception as e:
    self.db.rollback()
    logger.error(f"Error creating agent: {e!s}")
    raise RepositoryError(f"Failed to create agent: {e!s}") from e
Recommendation: Handle specific exceptions:
except (IntegrityError, SQLAlchemyError) as e:
    self.db.rollback()
    raise RepositoryError(f"Database error creating agent: {e!s}") from e
11. Missing API Documentation
Location:
Issue: Endpoints lack OpenAPI examples in docstrings
Impact: API documentation will be less helpful
Recommendation: Add OpenAPI examples:
@router.post(
    "/register",
    response_model=AgentRegistrationResponse,
    responses={
        201: {"description": "Agent registered successfully"},
        400: {"description": "Invalid request", "model": ErrorResponse},
        409: {"description": "SPIFFE ID already exists"},
    }
)
12. Type Annotation: Self vs AgentRepository
Location:
Issue: Using
def __init__(self: Any, db: Session) -> None:
Recommendation: Remove type hint (not needed) or use proper type:
def __init__(self, db: Session) -> None:
LOW Priority / Style Issues
13. Magic String for Trust Domain
Location: Multiple files
Issue:
Recommendation: Use constant or config default
14. Logging: f-string vs Extra Context
Location:
Issue: Uses f-string instead of structured logging:
logger.warning(f"Rejected trust domain '{request.trust_domain}'...")
Per CLAUDE.md Enhanced Logging guidelines:
logger.warning(
    "Rejected trust domain from user",
    extra={
        "requested_domain": request.trust_domain,
        "configured_domain": self._config.trust_domain,
        "user_id": str(owner_user_id)
    }
)
15. Test Fixture Naming
Location:
Issue: Mock fixtures lack docstrings
Recommendation: Add brief descriptions for clarity
📋 Testing Checklist
Required Before Merge:
Recommended:
🎯 Summary Assessment
Overall Rating: 8.5/10
Verdict: Approve with Required Changes
This is an exceptionally well-designed and implemented feature. The architecture is sound, the code quality is high, and the documentation is outstanding. However, there are critical dependency issues that must be resolved before merge.
Must Fix Before Merge:
Recommended for Follow-Up PR:
📚 Alignment with Repository Standards
✅ Follows CLAUDE.md guidelines:
🚀 Impact Assessment
Positive Impact:
Risk Assessment:
Migration Complexity: Medium
🤝 Recommendations
For Immediate Action:
For Documentation:
For Future Enhancement:
Great work on this comprehensive feature! The architecture is solid and the implementation quality is high. Address the critical dependency issues and this will be ready to merge.
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com
Closing - branch included unrelated files from PR #695. Will recreate from clean main branch.
Summary
Add comprehensive architecture documentation for the Agentic RAG Platform. These documents
establish the design foundation for transforming RAG Modulo into a fully agentic system.
Documents Added
- agentic-ui-architecture.md
- backend-architecture-diagram.md
- mcp-integration-architecture.md
- rag-modulo-mcp-server-architecture.md
- search-agent-hooks-architecture.md
- system-architecture.md
Total: ~3,450 lines of documentation
Architecture Highlights
3-Stage Agent Pipeline (search-agent-hooks-architecture.md)
MCP Integration (mcp-integration-architecture.md)
rag_search, rag_ingest, etc. to Claude Desktop
Agentic UI (agentic-ui-architecture.md)
Implementation Roadmap
These documents guide:
Test Plan
Closes #696
🤖 Generated with Claude Code