diff --git a/mcp-blarify-server/.github/workflows/test.yml b/mcp-blarify-server/.github/workflows/test.yml new file mode 100644 index 00000000..129e8f57 --- /dev/null +++ b/mcp-blarify-server/.github/workflows/test.yml @@ -0,0 +1,76 @@ +name: Test MCP Server + +on: + push: + paths: + - 'mcp-blarify-server/**' + pull_request: + paths: + - 'mcp-blarify-server/**' + +jobs: + test: + runs-on: ubuntu-latest + + services: + neo4j: + image: neo4j:5-community + env: + NEO4J_AUTH: neo4j/testpassword + NEO4J_PLUGINS: '["apoc", "graph-data-science"]' + NEO4J_dbms_security_procedures_unrestricted: apoc.*,gds.* + ports: + - 7474:7474 + - 7687:7687 + options: >- + --health-cmd "curl -f http://localhost:7474 || exit 1" + --health-interval 10s + --health-timeout 5s + --health-retries 5 + + steps: + - uses: actions/checkout@v3 + + - name: Set up Python + uses: actions/setup-python@v4 + with: + python-version: '3.9' + + - name: Install dependencies + working-directory: ./mcp-blarify-server + run: | + python -m pip install --upgrade pip + pip install -r requirements.txt + pip install pytest-cov + + - name: Wait for Neo4j + run: | + timeout 60 bash -c 'until curl -f http://localhost:7474; do sleep 2; done' + + - name: Set up test data + working-directory: ./mcp-blarify-server + run: python tests/setup_test_graph.py + + - name: Run unit tests + working-directory: ./mcp-blarify-server + run: | + python -m pytest tests/test_query_builder.py tests/test_context_builder.py tests/test_llm_processor.py tests/test_server.py -v + + - name: Run integration tests + working-directory: ./mcp-blarify-server + env: + NEO4J_URI: bolt://localhost:7687 + NEO4J_USERNAME: neo4j + NEO4J_PASSWORD: testpassword + run: | + python -m pytest tests/test_integration.py -v + + - name: Generate coverage report + working-directory: ./mcp-blarify-server + run: | + python -m pytest --cov=src --cov-report=xml --cov-report=html + + - name: Upload coverage + uses: codecov/codecov-action@v3 + with: + file: 
./mcp-blarify-server/coverage.xml \ No newline at end of file diff --git a/mcp-blarify-server/README.md b/mcp-blarify-server/README.md new file mode 100644 index 00000000..672995f7 --- /dev/null +++ b/mcp-blarify-server/README.md @@ -0,0 +1,245 @@ +# MCP Blarify Server + +An MCP (Model Context Protocol) server that provides AI coding agents with sophisticated tools to query and analyze Blarify graph databases stored in Neo4j. + +## Overview + +This MCP server sits in front of a Neo4j database containing a Blarify graph representation of a codebase. It provides three main tools that AI agents can use to understand code structure, relationships, and plan changes. + +## Features + +### Tools + +1. **`getContextForFiles`** + - Retrieves comprehensive context for specified files + - Includes classes, functions, dependencies, imports, and documentation + - Traverses the graph to configurable depth + - Returns organized Markdown with LLM assistance + +2. **`getContextForSymbol`** + - Gets detailed context for a specific symbol (class, function, variable) + - Finds definitions, usages, inheritance, and relationships + - Supports fuzzy matching for symbol names + - Shows callers, callees, and references + +3. **`buildPlanForChange`** + - Analyzes codebase to create implementation plans + - Extracts entities from change requests using LLM + - Identifies affected files, dependencies, and tests + - Generates step-by-step implementation guide + +### Capabilities + +- **Intelligent Query Building**: Constructs efficient Cypher queries for complex traversals +- **LLM Integration**: Uses Azure OpenAI to organize results and extract information +- **Context Organization**: Structures graph data into readable Markdown +- **Impact Analysis**: Traces dependencies and affected components +- **Pattern Recognition**: Identifies design patterns and architectural concepts + +## Installation + +1. Clone the repository: +```bash +cd mcp-blarify-server +``` + +2. 
Install dependencies: +```bash +pip install -r requirements.txt +``` + +3. Configure environment variables: +```bash +# Neo4j configuration +export NEO4J_URI="bolt://localhost:7687" +export NEO4J_USERNAME="neo4j" +export NEO4J_PASSWORD="your-password" +export NEO4J_DATABASE="neo4j" + +# Azure OpenAI configuration (optional but recommended) +export AZURE_OPENAI_API_KEY="your-api-key" +export AZURE_OPENAI_ENDPOINT="https://your-instance.openai.azure.com/" +export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4" +export AZURE_OPENAI_API_VERSION="2024-02-15-preview" + +# Optional settings +export MAX_TRAVERSAL_DEPTH="3" +export MAX_CONTEXT_LENGTH="8000" +export ENABLE_QUERY_CACHE="true" +``` + +## Usage + +### Running the Server + +```bash +python -m src.server +``` + +The server runs on stdio transport, making it compatible with any MCP client. + +### Using with Claude Desktop + +Add to your Claude Desktop configuration: + +```json +{ + "mcpServers": { + "blarify": { + "command": "python", + "args": ["-m", "src.server"], + "cwd": "/path/to/mcp-blarify-server", + "env": { + "NEO4J_URI": "bolt://localhost:7687", + "NEO4J_USERNAME": "neo4j", + "NEO4J_PASSWORD": "your-password", + "AZURE_OPENAI_API_KEY": "your-api-key", + "AZURE_OPENAI_ENDPOINT": "https://your-instance.openai.azure.com/" + } + } + } +} +``` + +### Example Queries + +#### Get Context for Files +``` +Use the getContextForFiles tool to show me the context for: +- src/services/auth.py +- src/models/user.py +``` + +#### Find Symbol Information +``` +Use the getContextForSymbol tool to find information about the UserService class +``` + +#### Build Implementation Plan +``` +Use the buildPlanForChange tool to create a plan for: +"Add email verification to the user registration process" +``` + +## Architecture + +``` +mcp-blarify-server/ +├── src/ +│ ├── server.py # Main MCP server +│ ├── config.py # Configuration management +│ ├── tools/ +│ │ ├── context_tools.py # File and symbol context retrieval +│ │ ├── 
planning_tools.py # Change planning functionality +│ │ └── query_builder.py # Cypher query construction +│ └── processors/ +│ ├── graph_traversal.py # Neo4j graph traversal logic +│ ├── context_builder.py # Context organization +│ └── llm_processor.py # LLM integration for formatting +└── tests/ # Test suite +``` + +## Configuration Options + +| Environment Variable | Description | Default | +|---------------------|-------------|---------| +| `NEO4J_URI` | Neo4j connection URI | `bolt://localhost:7687` | +| `NEO4J_USERNAME` | Neo4j username | `neo4j` | +| `NEO4J_PASSWORD` | Neo4j password | `password` | +| `NEO4J_DATABASE` | Neo4j database name | `neo4j` | +| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key | None | +| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint | None | +| `AZURE_OPENAI_DEPLOYMENT_NAME` | Model deployment name | `gpt-4` | +| `MAX_TRAVERSAL_DEPTH` | Maximum graph traversal depth | `3` | +| `MAX_CONTEXT_LENGTH` | Maximum context length in chars | `8000` | +| `ENABLE_QUERY_CACHE` | Enable query result caching | `true` | +| `CACHE_TTL_SECONDS` | Cache time-to-live | `3600` | + +## Development + +### Running Tests + +#### Unit Tests + +Run unit tests without Neo4j: +```bash +pytest tests/test_query_builder.py tests/test_context_builder.py tests/test_llm_processor.py tests/test_server.py -v +``` + +#### Integration Tests + +Run integration tests with real Neo4j: + +1. Start Neo4j: +```bash +docker-compose up -d +``` + +2. Set up test data: +```bash +python tests/setup_test_graph.py +``` + +3. Run integration tests: +```bash +pytest tests/test_integration.py -v +``` + +Or use the convenience script: +```bash +./run_integration_tests.sh +``` + +#### Manual Testing + +Test the server interactively: +```bash +python manual_test.py +``` + +This will: +- Connect to Neo4j +- List available tools +- Test each tool with sample data +- Show example responses + +### Adding New Tools + +1. Create tool arguments model in `server.py` +2. 
Add tool definition in `handle_list_tools()` +3. Implement tool logic in appropriate module +4. Add handler in `handle_call_tool()` + +### Extending Query Patterns + +Add new query builders in `src/tools/query_builder.py` for custom graph traversals. + +## Troubleshooting + +### Connection Issues + +If you see connection errors: +1. Verify Neo4j is running: `neo4j status` +2. Check credentials and URI +3. Ensure the database exists +4. Test connection with `cypher-shell` + +### LLM Processing + +If LLM features aren't working: +1. Verify Azure OpenAI credentials +2. Check API endpoint format +3. Ensure deployment name is correct +4. Monitor API quotas + +### Performance + +For large graphs: +1. Adjust `MAX_TRAVERSAL_DEPTH` +2. Use `MAX_NODES_PER_TYPE` limits +3. Enable query caching +4. Consider adding indexes in Neo4j + +## License + +This project follows the same license as the parent Blarify project. \ No newline at end of file diff --git a/mcp-blarify-server/docker-compose.yml b/mcp-blarify-server/docker-compose.yml new file mode 100644 index 00000000..c2edcfdd --- /dev/null +++ b/mcp-blarify-server/docker-compose.yml @@ -0,0 +1,30 @@ +version: '3.8' + +services: + neo4j: + image: neo4j:5-community + container_name: blarify-neo4j-test + ports: + - "7474:7474" # HTTP + - "7687:7687" # Bolt + environment: + - NEO4J_AUTH=neo4j/testpassword + - NEO4J_PLUGINS=["apoc", "graph-data-science"] + - NEO4J_dbms_security_procedures_unrestricted=apoc.*,gds.* + - NEO4J_dbms_security_procedures_allowlist=apoc.*,gds.* + volumes: + - neo4j_data:/data + - neo4j_logs:/logs + - neo4j_import:/var/lib/neo4j/import + - neo4j_plugins:/plugins + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:7474"] + interval: 10s + timeout: 10s + retries: 5 + +volumes: + neo4j_data: + neo4j_logs: + neo4j_import: + neo4j_plugins: \ No newline at end of file diff --git a/mcp-blarify-server/examples/example_usage.py b/mcp-blarify-server/examples/example_usage.py new file mode 100644 index 
00000000..fb561ffb --- /dev/null +++ b/mcp-blarify-server/examples/example_usage.py @@ -0,0 +1,220 @@ +"""Example usage of MCP Blarify Server tools.""" + +import asyncio +import json +from typing import Dict, Any + + +async def simulate_tool_call(tool_name: str, arguments: Dict[str, Any]): + """Simulate calling an MCP tool.""" + print(f"\n=== Calling {tool_name} ===") + print(f"Arguments: {json.dumps(arguments, indent=2)}") + print("\n--- Example Response ---") + + if tool_name == "getContextForFiles": + return """# Context for Files + +## Directory: /src/services + +### File: user_service.py +**Path**: `/src/services/user_service.py` + +**Description**: Service for managing user data and authentication + +**Contains**: +- CLASSs: UserService +- FUNCTIONs: create_user, get_user, update_user, delete_user + +**Imports**: +- models.user +- utils.validation +- utils.security + +**Imported by**: +- /src/api/routes/users.py +- /src/api/routes/auth.py +- /tests/services/test_user_service.py + +### File: auth_service.py +**Path**: `/src/services/auth_service.py` + +**Description**: Handles authentication and authorization logic + +**Contains**: +- CLASSs: AuthService +- FUNCTIONs: login, logout, verify_token, refresh_token + +**Imports**: +- jwt +- services.user_service +- models.token + +**Imported by**: +- /src/api/routes/auth.py +- /src/middleware/auth_middleware.py +""" + + elif tool_name == "getContextForSymbol": + return """# Symbol: UserService + +**Type**: CLASS +**Location**: `/src/services/user_service.py` + +**Description**: Core service for user management operations including CRUD operations, password management, and user validation. 
+ +**Inherits from**: BaseService +**Inherited by**: ExtendedUserService (in /src/services/extended_user.py) + +**Methods**: +- create_user(user_data: dict) -> User +- get_user(user_id: int) -> Optional[User] +- update_user(user_id: int, user_data: dict) -> User +- delete_user(user_id: int) -> bool +- validate_password(user_id: int, password: str) -> bool +- reset_password(user_id: int, new_password: str) -> bool + +**Called by**: 8 locations +- UserController.create in `/src/api/controllers/user_controller.py` +- UserController.get in `/src/api/controllers/user_controller.py` +- AuthService.login in `/src/services/auth_service.py` +- AuthService.register in `/src/services/auth_service.py` +- UserValidator.validate in `/src/validators/user_validator.py` + +## Related Symbols + +### UserController +**Type**: CLASS +**Location**: `/src/api/controllers/user_controller.py` +Uses UserService for all user operations +""" + + elif tool_name == "buildPlanForChange": + return """# Implementation Plan + +## Change Request +Add email verification to user registration process + +## Impact Analysis +- **Entities Affected**: 4 +- **Total Dependencies**: 12 +- **Files to Modify**: 8 +- **Test Files**: 5 + +## Implementation Steps + +### 1. Prepare Development Environment +- Create feature branch: `feature/email-verification` +- Ensure all tests pass before starting +- Review existing code structure + +### 2. Modify Existing Files + +#### 1. Update `/src/models/user.py` +- Add `email_verified` boolean field (default: False) +- Add `verification_token` string field +- Add `verification_sent_at` timestamp field + +#### 2. Update `/src/services/user_service.py` +- Modify `create_user` to generate verification token +- Add `verify_email(token: str) -> bool` method +- Add `resend_verification(user_id: int) -> bool` method + +#### 3. 
Update `/src/services/email_service.py` +- Add `send_verification_email(user: User, token: str)` method +- Create email template for verification + +#### 4. Update `/src/api/routes/auth.py` +- Add `/verify-email/` endpoint +- Add `/resend-verification` endpoint +- Update registration response to indicate verification needed + +### 3. Create New Components + +#### 1. Create `/src/templates/email/verification.html` +- HTML template for verification email +- Include verification link and instructions + +#### 2. Create `/src/utils/token_generator.py` +- Secure token generation utilities +- Token validation methods + +### 4. Update Tests +Existing test files to update: +- `/tests/services/test_user_service.py` +- `/tests/services/test_email_service.py` +- `/tests/api/test_auth.py` +- `/tests/models/test_user.py` +- `/tests/integration/test_registration_flow.py` + +### 5. Update Documentation +Documentation files to update: +- `/docs/api/authentication.md` +- `/docs/user-guide/registration.md` +- `/README.md` (mention email verification requirement) + +### 6. 
Validation +- Run full test suite +- Test email delivery in development environment +- Verify token expiration logic +- Test edge cases (invalid tokens, expired tokens) + +## Dependencies to Consider +- **UserService**: Add verification logic +- **EmailService**: Ensure SMTP configuration +- **User model**: Database migration required +- **AuthController**: Update registration flow + +## Risk Assessment +- **Breaking Changes**: Existing users won't have email_verified field +- **Performance Impact**: Additional database queries for verification +- **Security**: Ensure tokens are cryptographically secure + +## Rollback Plan +- Keep feature branch separate until fully tested +- Create database migration rollback script +- Document how to disable email verification if needed +""" + + else: + return f"Unknown tool: {tool_name}" + + +async def main(): + """Run example tool calls.""" + print("MCP Blarify Server - Example Tool Usage") + print("=" * 50) + + # Example 1: Get context for files + response = await simulate_tool_call( + "getContextForFiles", + { + "file_paths": [ + "src/services/user_service.py", + "src/services/auth_service.py" + ] + } + ) + print(response) + + # Example 2: Get context for a symbol + response = await simulate_tool_call( + "getContextForSymbol", + { + "symbol_name": "UserService", + "symbol_type": "class" + } + ) + print(response) + + # Example 3: Build a plan for change + response = await simulate_tool_call( + "buildPlanForChange", + { + "change_request": "Add email verification to user registration process" + } + ) + print(response) + + +if __name__ == "__main__": + asyncio.run(main()) \ No newline at end of file diff --git a/mcp-blarify-server/manual_test.py b/mcp-blarify-server/manual_test.py new file mode 100644 index 00000000..358b04bf --- /dev/null +++ b/mcp-blarify-server/manual_test.py @@ -0,0 +1,174 @@ +"""Manual test script to demonstrate MCP server functionality.""" + +import asyncio +import os +import json +from src.server import 
BlarifyMCPServer + + +async def test_mcp_server(): + """Test the MCP server with real Neo4j data.""" + print("MCP Blarify Server - Manual Test") + print("=" * 60) + + # Configure environment + os.environ.update({ + "NEO4J_URI": "bolt://localhost:7687", + "NEO4J_USERNAME": "neo4j", + "NEO4J_PASSWORD": "testpassword", + "NEO4J_DATABASE": "neo4j" + }) + + # Create server + server = BlarifyMCPServer() + + try: + # Connect to Neo4j + print("\n1. Connecting to Neo4j...") + await server._connect_to_neo4j() + print("✓ Connected successfully") + + # List available tools + print("\n2. Available tools:") + tools = await server.server.list_tools() + for tool in tools: + print(f" - {tool.name}: {tool.description}") + + # Test getContextForFiles + print("\n3. Testing getContextForFiles") + print("-" * 40) + result = await server.server.call_tool( + "getContextForFiles", + {"file_paths": ["user_service.py", "auth_service.py"]} + ) + print("Request: Get context for user_service.py and auth_service.py") + print("Response:") + print(result[0].text[:1000] + "..." if len(result[0].text) > 1000 else result[0].text) + + # Test getContextForSymbol + print("\n4. Testing getContextForSymbol") + print("-" * 40) + result = await server.server.call_tool( + "getContextForSymbol", + {"symbol_name": "UserService", "symbol_type": "class"} + ) + print("Request: Get context for UserService class") + print("Response:") + print(result[0].text[:1000] + "..." if len(result[0].text) > 1000 else result[0].text) + + # Test buildPlanForChange + print("\n5. Testing buildPlanForChange") + print("-" * 40) + result = await server.server.call_tool( + "buildPlanForChange", + {"change_request": "Add email verification to user registration process"} + ) + print("Request: Build plan for adding email verification") + print("Response:") + print(result[0].text[:1500] + "..." if len(result[0].text) > 1500 else result[0].text) + + # Test with non-existent file + print("\n6. 
Testing error handling") + print("-" * 40) + result = await server.server.call_tool( + "getContextForFiles", + {"file_paths": ["non_existent_file.py"]} + ) + print("Request: Get context for non-existent file") + print("Response:") + print(result[0].text) + + # Test fuzzy symbol search + print("\n7. Testing fuzzy symbol search") + print("-" * 40) + result = await server.server.call_tool( + "getContextForSymbol", + {"symbol_name": "user"} # Partial match + ) + print("Request: Get context for 'user' (partial match)") + print("Response:") + print(result[0].text[:800] + "..." if len(result[0].text) > 800 else result[0].text) + + print("\n" + "=" * 60) + print("✓ All tests completed successfully!") + + except Exception as e: + print(f"\n✗ Error: {e}") + import traceback + traceback.print_exc() + finally: + if server.driver: + server.driver.close() + print("\n✓ Closed Neo4j connection") + + +async def test_direct_queries(): + """Test direct graph queries to verify data.""" + from neo4j import GraphDatabase + + print("\n\nDirect Graph Queries") + print("=" * 60) + + driver = GraphDatabase.driver( + "bolt://localhost:7687", + auth=("neo4j", "testpassword") + ) + + try: + with driver.session() as session: + # Query 1: Find all classes + print("\n1. All classes in the graph:") + result = session.run("MATCH (c:CLASS) RETURN c.name as name ORDER BY name") + for record in result: + print(f" - {record['name']}") + + # Query 2: UserService details + print("\n2. 
UserService details:") + result = session.run(""" + MATCH (us:CLASS {name: 'UserService'}) + OPTIONAL MATCH (us)-[:HAS_METHOD]->(m) + OPTIONAL MATCH (us)-[:INHERITS_FROM]->(parent) + OPTIONAL MATCH (us)<-[:USES]-(caller) + OPTIONAL MATCH (desc)-[:DESCRIBES]->(us) + RETURN us.name as name, + collect(DISTINCT m.name) as methods, + parent.name as parent, + collect(DISTINCT caller.name) as callers, + desc.description as description + """) + + for record in result: + print(f" Name: {record['name']}") + print(f" Parent: {record['parent']}") + print(f" Methods: {', '.join(record['methods'])}") + print(f" Called by: {', '.join(record['callers'])}") + print(f" Description: {record['description']}") + + # Query 3: File dependencies + print("\n3. File import relationships:") + result = session.run(""" + MATCH (f1:FILE)-[:IMPORTS]->(f2:FILE) + RETURN f1.name as importer, f2.name as imported + ORDER BY importer + """) + + for record in result: + print(f" {record['importer']} → {record['imported']}") + + finally: + driver.close() + + +async def main(): + """Run all tests.""" + await test_mcp_server() + await test_direct_queries() + + +if __name__ == "__main__": + print("Starting manual test...") + print("Make sure Neo4j is running (docker-compose up -d)") + print("Make sure test data is loaded (python tests/setup_test_graph.py)") + input("Press Enter to continue...") + + asyncio.run(main()) \ No newline at end of file diff --git a/mcp-blarify-server/requirements.txt b/mcp-blarify-server/requirements.txt new file mode 100644 index 00000000..4fcff369 --- /dev/null +++ b/mcp-blarify-server/requirements.txt @@ -0,0 +1,8 @@ +mcp>=0.9.0 +neo4j>=5.0.0 +python-dotenv>=1.0.0 +openai>=1.0.0 +pydantic>=2.0.0 +pytest>=7.0.0 +pytest-asyncio>=0.21.0 +pytest-mock>=3.10.0 \ No newline at end of file diff --git a/mcp-blarify-server/run_integration_tests.sh b/mcp-blarify-server/run_integration_tests.sh new file mode 100755 index 00000000..c84256e7 --- /dev/null +++ 
b/mcp-blarify-server/run_integration_tests.sh @@ -0,0 +1,72 @@ +#!/bin/bash + +echo "Starting integration tests for MCP Blarify Server..." +echo "===========================================" + +# Check if docker is running +if ! docker info > /dev/null 2>&1; then + echo "Error: Docker is not running. Please start Docker first." + exit 1 +fi + +# Start Neo4j if not already running +echo "Starting Neo4j container..." +docker-compose up -d + +# Wait for Neo4j to be ready +echo "Waiting for Neo4j to be ready..." +sleep 10 + +# Check Neo4j health +max_attempts=30 +attempt=0 +while [ $attempt -lt $max_attempts ]; do + if curl -f http://localhost:7474 > /dev/null 2>&1; then + echo "Neo4j is ready!" + break + fi + echo "Waiting for Neo4j... (attempt $((attempt+1))/$max_attempts)" + sleep 2 + attempt=$((attempt+1)) +done + +if [ $attempt -eq $max_attempts ]; then + echo "Error: Neo4j failed to start" + docker-compose logs neo4j + exit 1 +fi + +# Set up test data +echo "Setting up test graph data..." +python tests/setup_test_graph.py + +if [ $? -ne 0 ]; then + echo "Error: Failed to set up test graph" + exit 1 +fi + +# Run integration tests +echo "Running integration tests..." +python -m pytest tests/test_integration.py -v + +# Store test result +TEST_RESULT=$? + +# Show logs if tests failed +if [ $TEST_RESULT -ne 0 ]; then + echo "Tests failed! Showing Neo4j logs..." + docker-compose logs --tail=50 neo4j +fi + +# Cleanup (optional - comment out to keep Neo4j running) +# echo "Stopping Neo4j..." +# docker-compose down + +echo "===========================================" +if [ $TEST_RESULT -eq 0 ]; then + echo "Integration tests passed!" +else + echo "Integration tests failed!" 
+fi + +exit $TEST_RESULT \ No newline at end of file diff --git a/mcp-blarify-server/src/__init__.py b/mcp-blarify-server/src/__init__.py new file mode 100644 index 00000000..6574d26b --- /dev/null +++ b/mcp-blarify-server/src/__init__.py @@ -0,0 +1,3 @@ +"""MCP Blarify Server package.""" + +__version__ = "0.1.0" \ No newline at end of file diff --git a/mcp-blarify-server/src/config.py b/mcp-blarify-server/src/config.py new file mode 100644 index 00000000..a2a8a131 --- /dev/null +++ b/mcp-blarify-server/src/config.py @@ -0,0 +1,44 @@ +"""Configuration for MCP Blarify Server.""" + +import os +from typing import Optional +from dotenv import load_dotenv + +load_dotenv() + + +class Config: + """Configuration settings for the MCP server.""" + + # Neo4j settings + NEO4J_URI: str = os.getenv("NEO4J_URI", "bolt://localhost:7687") + NEO4J_USERNAME: str = os.getenv("NEO4J_USERNAME", "neo4j") + NEO4J_PASSWORD: str = os.getenv("NEO4J_PASSWORD", "password") + NEO4J_DATABASE: str = os.getenv("NEO4J_DATABASE", "neo4j") + + # Azure OpenAI settings + AZURE_OPENAI_API_KEY: Optional[str] = os.getenv("AZURE_OPENAI_API_KEY") + AZURE_OPENAI_ENDPOINT: Optional[str] = os.getenv("AZURE_OPENAI_ENDPOINT") + AZURE_OPENAI_DEPLOYMENT_NAME: str = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4") + AZURE_OPENAI_API_VERSION: str = os.getenv("AZURE_OPENAI_API_VERSION", "2024-02-15-preview") + + # Graph traversal settings + MAX_TRAVERSAL_DEPTH: int = int(os.getenv("MAX_TRAVERSAL_DEPTH", "3")) + MAX_NODES_PER_TYPE: int = int(os.getenv("MAX_NODES_PER_TYPE", "50")) + + # Context generation settings + MAX_CONTEXT_LENGTH: int = int(os.getenv("MAX_CONTEXT_LENGTH", "8000")) + INCLUDE_LLM_SUMMARIES: bool = os.getenv("INCLUDE_LLM_SUMMARIES", "true").lower() == "true" + + # Cache settings + ENABLE_QUERY_CACHE: bool = os.getenv("ENABLE_QUERY_CACHE", "true").lower() == "true" + CACHE_TTL_SECONDS: int = int(os.getenv("CACHE_TTL_SECONDS", "3600")) + + @classmethod + def validate(cls) -> None: + """Validate 
required configuration.""" + if not cls.NEO4J_URI: + raise ValueError("NEO4J_URI is required") + + if cls.INCLUDE_LLM_SUMMARIES and not cls.AZURE_OPENAI_API_KEY: + raise ValueError("AZURE_OPENAI_API_KEY is required when INCLUDE_LLM_SUMMARIES is enabled") \ No newline at end of file diff --git a/mcp-blarify-server/src/processors/__init__.py b/mcp-blarify-server/src/processors/__init__.py new file mode 100644 index 00000000..0e9b837f --- /dev/null +++ b/mcp-blarify-server/src/processors/__init__.py @@ -0,0 +1 @@ +"""Processors for graph data manipulation.""" \ No newline at end of file diff --git a/mcp-blarify-server/src/processors/context_builder.py b/mcp-blarify-server/src/processors/context_builder.py new file mode 100644 index 00000000..cd9b7f99 --- /dev/null +++ b/mcp-blarify-server/src/processors/context_builder.py @@ -0,0 +1,248 @@ +"""Context builder for organizing graph data into consumable formats.""" + +import logging +from typing import Dict, Any, List, Optional +from collections import defaultdict + +from ..config import Config + +logger = logging.getLogger(__name__) + + +class ContextBuilder: + """Builds structured context from graph traversal results.""" + + def __init__(self): + """Initialize the context builder.""" + self.max_context_length = Config.MAX_CONTEXT_LENGTH + self.max_nodes_per_type = Config.MAX_NODES_PER_TYPE + + def build_files_context(self, file_contexts: List[Dict[str, Any]]) -> str: + """Build combined context for multiple files.""" + if not file_contexts: + return "# No Files Found\n\nNo files matching the specified paths were found in the graph." 
+ + sections = [] + + # Group files by directory for better organization + files_by_dir = defaultdict(list) + for context in file_contexts: + if file_info := context.get("file"): + path = file_info.get("path", "unknown") + dir_path = "/".join(path.split("/")[:-1]) or "/" + files_by_dir[dir_path].append(context) + + # Build context for each directory + for dir_path, contexts in sorted(files_by_dir.items()): + if dir_path != "/": + sections.append(f"## Directory: {dir_path}\n") + + for context in contexts: + file_section = self._build_single_file_section(context) + sections.append(file_section) + + full_context = "\n".join(sections) + + # Truncate if too long + if len(full_context) > self.max_context_length: + full_context = self._truncate_context(full_context) + + return full_context + + def build_symbol_context(self, symbol_context: Dict[str, Any], related_symbols: List[Dict[str, Any]] = None) -> str: + """Build context for a symbol with optional related symbols.""" + if not symbol_context or not symbol_context.get("symbol"): + return "# Symbol Not Found\n\nThe requested symbol could not be found in the graph." 
+ + sections = [] + + # Main symbol context + main_section = self._build_symbol_section(symbol_context, is_main=True) + sections.append(main_section) + + # Related symbols + if related_symbols: + sections.append("\n## Related Symbols\n") + for related in related_symbols[:5]: # Limit to 5 related symbols + related_section = self._build_symbol_section(related, is_main=False) + sections.append(related_section) + + full_context = "\n".join(sections) + + # Truncate if too long + if len(full_context) > self.max_context_length: + full_context = self._truncate_context(full_context) + + return full_context + + def build_change_plan_context(self, change_request: str, impact_analysis: Dict[str, Any], patterns: Dict[str, Any] = None) -> Dict[str, Any]: + """Build context for change planning.""" + context = { + "change_request": change_request, + "impact_summary": self._summarize_impact(impact_analysis), + "affected_files": self._extract_affected_files(impact_analysis), + "affected_entities": list(impact_analysis.keys()), + "test_files": self._extract_test_files(impact_analysis), + "documentation_files": self._extract_documentation_files(impact_analysis), + "patterns": patterns + } + + return context + + def _build_single_file_section(self, context: Dict[str, Any]) -> str: + """Build markdown section for a single file.""" + file_info = context.get("file", {}) + path = file_info.get("path", "Unknown") + name = path.split("/")[-1] + + section = f"### File: {name}\n" + section += f"**Path**: `{path}`\n" + + # Add description if available + if desc := context.get("description"): + section += f"\n**Description**: {desc.get('description', 'No description available')}\n" + + # Add contents + if contents := context.get("contents"): + section += "\n**Contains**:\n" + grouped = self._group_by_type(contents) + for node_type, nodes in grouped.items(): + if nodes: + section += f"- {node_type}s: {', '.join(n.get('name', 'Unknown') for n in nodes[:10])}\n" + if len(nodes) > 10: + section += 
f" - ... and {len(nodes) - 10} more\n" + + # Add dependencies + if imports := context.get("imports"): + section += "\n**Imports**:\n" + for imp in imports[:10]: + imp_name = imp.get("name", imp.get("path", "Unknown")) + section += f"- {imp_name}\n" + if len(imports) > 10: + section += f"- ... and {len(imports) - 10} more\n" + + # Add usage + if importers := context.get("importers"): + section += "\n**Imported by**:\n" + for imp in importers[:5]: + section += f"- {imp.get('path', 'Unknown')}\n" + if len(importers) > 5: + section += f"- ... and {len(importers) - 5} more\n" + + section += "\n" + return section + + def _build_symbol_section(self, context: Dict[str, Any], is_main: bool = True) -> str: + """Build markdown section for a symbol.""" + symbol = context.get("symbol", {}) + name = symbol.get("name", "Unknown") + labels = symbol.get("_labels", ["Unknown"]) + + if is_main: + section = f"# Symbol: {name}\n\n" + else: + section = f"### {name}\n" + + section += f"**Type**: {', '.join(labels)}\n" + + # Add location + if file_info := context.get("file"): + section += f"**Location**: `{file_info.get('path', 'Unknown')}`\n" + + # Add description + if desc := context.get("description"): + section += f"\n**Description**: {desc.get('description', 'No description available')}\n" + + # Add inheritance + if parents := context.get("parents"): + section += f"\n**Inherits from**: {', '.join(p.get('name', 'Unknown') for p in parents)}\n" + + if children := context.get("children"): + section += f"**Inherited by**: {', '.join(c.get('name', 'Unknown') for c in children[:5])}" + if len(children) > 5: + section += f" ... and {len(children) - 5} more" + section += "\n" + + # Add methods for classes + if methods := context.get("methods"): + section += "\n**Methods**:\n" + for method in methods[:10]: + section += f"- {method.get('name', 'Unknown')}\n" + if len(methods) > 10: + section += f"- ... 
and {len(methods) - 10} more\n" + + # Add usage + if callers := context.get("callers"): + section += f"\n**Called by**: {len(callers)} locations\n" + for caller in callers[:5]: + caller_name = caller.get("name", "Unknown") + if caller_file := caller.get("path"): + section += f"- {caller_name} in `{caller_file}`\n" + else: + section += f"- {caller_name}\n" + + section += "\n" + return section + + def _group_by_type(self, nodes: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]: + """Group nodes by their primary type.""" + grouped = defaultdict(list) + for node in nodes: + labels = node.get("_labels", ["Unknown"]) + primary_type = labels[0] if labels else "Unknown" + grouped[primary_type].append(node) + return dict(grouped) + + def _summarize_impact(self, impact_analysis: Dict[str, Any]) -> Dict[str, Any]: + """Summarize the impact analysis.""" + total_dependents = 0 + total_files = set() + total_tests = set() + + for entity, impact in impact_analysis.items(): + total_dependents += len(impact.get("dependents", [])) + for f in impact.get("containing_files", []): + total_files.add(f.get("path", "")) + for t in impact.get("test_files", []): + total_tests.add(t.get("path", "")) + + return { + "entities_affected": len(impact_analysis), + "total_dependents": total_dependents, + "files_affected": len(total_files), + "test_files_affected": len(total_tests) + } + + def _extract_affected_files(self, impact_analysis: Dict[str, Any]) -> List[str]: + """Extract all affected files from impact analysis.""" + files = set() + for entity, impact in impact_analysis.items(): + for f in impact.get("containing_files", []): + files.add(f.get("path", "")) + return sorted(list(files)) + + def _extract_test_files(self, impact_analysis: Dict[str, Any]) -> List[str]: + """Extract all test files from impact analysis.""" + files = set() + for entity, impact in impact_analysis.items(): + for f in impact.get("test_files", []): + files.add(f.get("path", "")) + return sorted(list(files)) + + 
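As an aside (not part of the diff itself), the impact-analysis helpers above all consume the same nested shape produced by `GraphTraversal.analyze_change_impact`: a dict keyed by entity name, where each value holds lists of node dicts with `path`/`name` properties. A minimal standalone sketch, using invented sample data and a local copy of the summarization logic, illustrates the expected round trip:

```python
from typing import Any, Dict

# Hypothetical impact_analysis payload mirroring what
# GraphTraversal.analyze_change_impact returns (entity and paths invented).
impact_analysis: Dict[str, Dict[str, Any]] = {
    "UserService": {
        "dependents": [{"name": "AuthController"}, {"name": "AdminController"}],
        "containing_files": [{"path": "src/services/user.py"}],
        "test_files": [{"path": "tests/test_user.py"}],
    }
}

# Standalone copy of the _summarize_impact logic, for illustration only.
def summarize(impact: Dict[str, Any]) -> Dict[str, int]:
    files, tests, dependents = set(), set(), 0
    for entity_impact in impact.values():
        dependents += len(entity_impact.get("dependents", []))
        files.update(f.get("path", "") for f in entity_impact.get("containing_files", []))
        tests.update(t.get("path", "") for t in entity_impact.get("test_files", []))
    return {
        "entities_affected": len(impact),
        "total_dependents": dependents,
        "files_affected": len(files),
        "test_files_affected": len(tests),
    }

print(summarize(impact_analysis))
# {'entities_affected': 1, 'total_dependents': 2, 'files_affected': 1, 'test_files_affected': 1}
```

Because every lookup goes through `.get(..., [])`, a partial impact record (say, one with no `test_files`) summarizes to zeros rather than raising.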
def _extract_documentation_files(self, impact_analysis: Dict[str, Any]) -> List[str]: + """Extract all documentation files from impact analysis.""" + files = set() + for entity, impact in impact_analysis.items(): + for d in impact.get("documentation", []): + files.add(d.get("path", "")) + return sorted(list(files)) + + def _truncate_context(self, context: str) -> str: + """Truncate context to maximum length.""" + if len(context) <= self.max_context_length: + return context + + # Truncate and add message + truncated = context[:self.max_context_length - 100] + truncated += "\n\n... (context truncated due to length limit)" + return truncated \ No newline at end of file diff --git a/mcp-blarify-server/src/processors/graph_traversal.py b/mcp-blarify-server/src/processors/graph_traversal.py new file mode 100644 index 00000000..833938fd --- /dev/null +++ b/mcp-blarify-server/src/processors/graph_traversal.py @@ -0,0 +1,169 @@ +"""Graph traversal logic for extracting context from Neo4j.""" + +from typing import List, Dict, Any, Optional, Set +from neo4j import GraphDatabase, Driver +import logging + +from ..config import Config +from ..tools.query_builder import QueryBuilder + +logger = logging.getLogger(__name__) + + +class GraphTraversal: + """Handles graph traversal operations.""" + + def __init__(self, driver: Driver): + """Initialize with Neo4j driver.""" + self.driver = driver + self.query_builder = QueryBuilder() + + def find_files(self, file_paths: List[str]) -> List[Dict[str, Any]]: + """Find FILE nodes matching the given paths.""" + query = self.query_builder.find_files_query(file_paths) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = session.run(query) + files = [] + for record in result: + node = record["n"] + files.append(self._node_to_dict(node)) + return files + + def get_file_context(self, file_path: str) -> Dict[str, Any]: + """Get comprehensive context for a file.""" + query = self.query_builder.get_file_context_query( 
+ file_path, + Config.MAX_TRAVERSAL_DEPTH + ) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = session.run(query, file_path=file_path) + record = result.single() + + if not record: + return {} + + context = { + "file": self._node_to_dict(record["file"]), + "contents": [self._node_to_dict(n) for n in record["contents"] if n], + "imports": [self._node_to_dict(n) for n in record["imports"] if n], + "importers": [self._node_to_dict(n) for n in record["importers"] if n], + "documentation": [self._node_to_dict(n) for n in record["documentation"] if n], + "description": self._node_to_dict(record["file_desc"]) if record["file_desc"] else None, + "content_descriptions": [self._node_to_dict(n) for n in record["content_descriptions"] if n], + "filesystem_node": self._node_to_dict(record["filesystem_node"]) if record["filesystem_node"] else None + } + + return context + + def find_symbol(self, symbol_name: str, symbol_type: Optional[str] = None) -> List[Dict[str, Any]]: + """Find nodes matching a symbol name.""" + query = self.query_builder.find_symbol_query(symbol_name, symbol_type) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = session.run(query, symbol_name=symbol_name) + symbols = [] + for record in result: + node = record["n"] + match_type = record["match_type"] + symbol_dict = self._node_to_dict(node) + symbol_dict["match_type"] = match_type + symbols.append(symbol_dict) + return symbols + + def get_symbol_context(self, symbol_node_id: int) -> Dict[str, Any]: + """Get comprehensive context for a symbol.""" + query = self.query_builder.get_symbol_context_query(str(symbol_node_id)) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = session.run(query, symbol_id=symbol_node_id) + record = result.single() + + if not record: + return {} + + context = { + "symbol": self._node_to_dict(record["symbol"]), + "file": self._node_to_dict(record["file"]) if record["file"] else None, 
+ "parents": [self._node_to_dict(n) for n in record["parents"] if n], + "interfaces": [self._node_to_dict(n) for n in record["interfaces"] if n], + "children": [self._node_to_dict(n) for n in record["children"] if n], + "methods": [self._node_to_dict(n) for n in record["methods"] if n], + "attributes": [self._node_to_dict(n) for n in record["attributes"] if n], + "callers": [self._node_to_dict(n) for n in record["callers"] if n], + "callees": [self._node_to_dict(n) for n in record["callees"] if n], + "referencers": [self._node_to_dict(n) for n in record["referencers"] if n], + "documentation": [self._node_to_dict(n) for n in record["documentation"] if n], + "description": self._node_to_dict(record["description"]) if record["description"] else None, + "concepts": [self._node_to_dict(n) for n in record["concepts"] if n] + } + + return context + + def analyze_change_impact(self, entity_names: List[str]) -> Dict[str, Any]: + """Analyze the impact of changing specific entities.""" + query = self.query_builder.analyze_change_impact_query(entity_names) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = session.run(query) + + impacts = {} + for record in result: + target = self._node_to_dict(record["target"]) + target_name = target.get("name", target.get("path", "unknown")) + + impacts[target_name] = { + "target": target, + "dependents": [self._node_to_dict(n) for n in record["dependents"] if n], + "containing_files": [self._node_to_dict(n) for n in record["containing_files"] if n], + "test_files": [self._node_to_dict(n) for n in record["test_files"] if n], + "documentation": [self._node_to_dict(n) for n in record["documentation"] if n] + } + + return impacts + + def find_patterns(self, concept_name: str) -> Dict[str, Any]: + """Find code implementing specific patterns or concepts.""" + query = self.query_builder.find_related_patterns_query(concept_name) + + with self.driver.session(database=Config.NEO4J_DATABASE) as session: + result = 
session.run(query, concept_name=concept_name) + + patterns = [] + for record in result: + pattern = { + "concept": self._node_to_dict(record["concept"]), + "implementers": [self._node_to_dict(n) for n in record["implementers"] if n], + "documentation": [self._node_to_dict(n) for n in record["documentation"] if n] + } + patterns.append(pattern) + + return {"patterns": patterns} + + def _node_to_dict(self, node) -> Optional[Dict[str, Any]]: + """Convert a Neo4j node to a dictionary.""" + if not node: + return None + + # Get all properties + props = dict(node) + + # Add node metadata + props["_id"] = node.id + props["_labels"] = list(node.labels) + + return props + + def get_extended_context(self, start_nodes: List[Dict[str, Any]], depth: int = 2) -> Dict[str, Any]: + """Get extended context by traversing from start nodes.""" + # This would implement a more sophisticated traversal + # For now, we'll use the specific context methods + extended_context = { + "nodes": start_nodes, + "relationships": [], + "depth": depth + } + + # TODO: Implement generic traversal logic + return extended_context \ No newline at end of file diff --git a/mcp-blarify-server/src/processors/llm_processor.py b/mcp-blarify-server/src/processors/llm_processor.py new file mode 100644 index 00000000..6461342d --- /dev/null +++ b/mcp-blarify-server/src/processors/llm_processor.py @@ -0,0 +1,317 @@ +"""LLM processor for organizing and structuring graph data.""" + +import json +import logging +from typing import Dict, Any, List, Optional +from openai import AzureOpenAI + +from ..config import Config + +logger = logging.getLogger(__name__) + + +class LLMProcessor: + """Processes graph data using LLM to create structured output.""" + + def __init__(self): + """Initialize the LLM processor.""" + self.enabled = Config.AZURE_OPENAI_API_KEY is not None + + if self.enabled: + self.client = AzureOpenAI( + api_key=Config.AZURE_OPENAI_API_KEY, + api_version=Config.AZURE_OPENAI_API_VERSION, + 
azure_endpoint=Config.AZURE_OPENAI_ENDPOINT + ) + self.deployment_name = Config.AZURE_OPENAI_DEPLOYMENT_NAME + else: + logger.warning("LLM processing disabled - no Azure OpenAI API key provided") + self.client = None + + def organize_file_context(self, context: Dict[str, Any]) -> str: + """Organize file context into structured Markdown.""" + if not context or not context.get("file"): + return "# File Not Found\n\nThe requested file could not be found in the graph." + + file_info = context["file"] + file_path = file_info.get("path", "Unknown") + + # Build markdown without LLM if disabled + if not self.enabled: + return self._build_file_context_markdown(context) + + # Use LLM to organize the context + prompt = self._create_file_context_prompt(context) + + try: + response = self.client.chat.completions.create( + model=self.deployment_name, + messages=[ + {"role": "system", "content": "You are a code analysis assistant. Organize the provided graph data into clear, structured Markdown that helps developers understand the code context."}, + {"role": "user", "content": prompt} + ], + temperature=0.3, + max_tokens=2000 + ) + + return response.choices[0].message.content.strip() + + except Exception as e: + logger.error(f"LLM processing failed: {e}") + return self._build_file_context_markdown(context) + + def organize_symbol_context(self, context: Dict[str, Any]) -> str: + """Organize symbol context into structured Markdown.""" + if not context or not context.get("symbol"): + return "# Symbol Not Found\n\nThe requested symbol could not be found in the graph." + + # Build markdown without LLM if disabled + if not self.enabled: + return self._build_symbol_context_markdown(context) + + # Use LLM to organize the context + prompt = self._create_symbol_context_prompt(context) + + try: + response = self.client.chat.completions.create( + model=self.deployment_name, + messages=[ + {"role": "system", "content": "You are a code analysis assistant. 
Organize the provided symbol information into clear, structured Markdown that helps developers understand how the symbol is used."}, + {"role": "user", "content": prompt} + ], + temperature=0.3, + max_tokens=2000 + ) + + return response.choices[0].message.content.strip() + + except Exception as e: + logger.error(f"LLM processing failed: {e}") + return self._build_symbol_context_markdown(context) + + def create_implementation_plan(self, change_request: str, impact_analysis: Dict[str, Any]) -> str: + """Create an implementation plan based on change request and impact analysis.""" + if not self.enabled: + return self._build_basic_implementation_plan(change_request, impact_analysis) + + prompt = self._create_implementation_plan_prompt(change_request, impact_analysis) + + try: + response = self.client.chat.completions.create( + model=self.deployment_name, + messages=[ + {"role": "system", "content": "You are a software architect. Create detailed implementation plans that consider dependencies, testing, and documentation."}, + {"role": "user", "content": prompt} + ], + temperature=0.4, + max_tokens=3000 + ) + + return response.choices[0].message.content.strip() + + except Exception as e: + logger.error(f"LLM processing failed: {e}") + return self._build_basic_implementation_plan(change_request, impact_analysis) + + def extract_entities_from_request(self, change_request: str) -> List[str]: + """Extract entity names from a change request.""" + if not self.enabled: + # Basic extraction without LLM + return self._extract_entities_basic(change_request) + + prompt = f"""Extract all code entities (classes, functions, modules, files) mentioned in this change request. +Return them as a JSON array of strings. + +Change request: {change_request} + +Return only the JSON array, no other text.""" + + try: + response = self.client.chat.completions.create( + model=self.deployment_name, + messages=[ + {"role": "system", "content": "You are a code parser. 
Extract entity names from text."}, + {"role": "user", "content": prompt} + ], + temperature=0.1, + max_tokens=500 + ) + + content = response.choices[0].message.content.strip() + # Try to parse JSON + if content.startswith("```json"): + content = content[7:] + if content.endswith("```"): + content = content[:-3] + + entities = json.loads(content.strip()) + return entities if isinstance(entities, list) else [] + + except Exception as e: + logger.error(f"Entity extraction failed: {e}") + return self._extract_entities_basic(change_request) + + def _create_file_context_prompt(self, context: Dict[str, Any]) -> str: + """Create prompt for file context organization.""" + return f"""Organize this file context into clear Markdown: + +File: {context['file'].get('path', 'Unknown')} + +Contents (classes/functions): {json.dumps(context.get('contents', []), indent=2)} + +Imports: {json.dumps(context.get('imports', []), indent=2)} + +Imported by: {json.dumps(context.get('importers', []), indent=2)} + +Documentation: {json.dumps(context.get('documentation', []), indent=2)} + +Description: {json.dumps(context.get('description', {}), indent=2)} + +Create a well-structured Markdown document with: +1. File overview +2. What it contains (classes, functions) +3. Dependencies (what it imports) +4. Usage (what imports it) +5. Related documentation +6. 
Any available descriptions + +Use clear headers and formatting.""" + + def _create_symbol_context_prompt(self, context: Dict[str, Any]) -> str: + """Create prompt for symbol context organization.""" + symbol = context['symbol'] + return f"""Organize this symbol context into clear Markdown: + +Symbol: {symbol.get('name', 'Unknown')} (Type: {', '.join(symbol.get('_labels', []))}) + +File: {json.dumps(context.get('file', {}), indent=2)} + +Inheritance: +- Parents: {json.dumps(context.get('parents', []), indent=2)} +- Children: {json.dumps(context.get('children', []), indent=2)} +- Interfaces: {json.dumps(context.get('interfaces', []), indent=2)} + +Members: +- Methods: {json.dumps(context.get('methods', []), indent=2)} +- Attributes: {json.dumps(context.get('attributes', []), indent=2)} + +Usage: +- Called by: {json.dumps(context.get('callers', []), indent=2)} +- Calls: {json.dumps(context.get('callees', []), indent=2)} +- Referenced by: {json.dumps(context.get('referencers', []), indent=2)} + +Documentation: {json.dumps(context.get('documentation', []), indent=2)} + +Description: {json.dumps(context.get('description', {}), indent=2)} + +Create a well-structured Markdown document that explains: +1. What the symbol is and where it's defined +2. Its inheritance/implementation hierarchy +3. Its members (for classes) +4. How it's used in the codebase +5. Related documentation""" + + def _create_implementation_plan_prompt(self, change_request: str, impact_analysis: Dict[str, Any]) -> str: + """Create prompt for implementation plan generation.""" + return f"""Create an implementation plan for this change request: + +Change Request: {change_request} + +Impact Analysis: +{json.dumps(impact_analysis, indent=2)} + +Create a detailed implementation plan that includes: +1. Summary of the change +2. Impact analysis (affected components) +3. Step-by-step implementation plan +4. Files to create/modify/delete +5. Required tests +6. Documentation updates +7. 
Dependencies to consider + +Format as clear Markdown with numbered steps and specific file paths.""" + + def _build_file_context_markdown(self, context: Dict[str, Any]) -> str: + """Build file context markdown without LLM.""" + file_info = context["file"] + md = f"# Context for File: {file_info.get('path', 'Unknown')}\n\n" + + if desc := context.get("description"): + md += f"## Overview\n{desc.get('description', 'No description available')}\n\n" + + if contents := context.get("contents"): + md += "## Contains\n" + for item in contents: + item_type = item.get("_labels", ["Unknown"])[0] + md += f"- **{item_type}**: {item.get('name', 'Unknown')}\n" + md += "\n" + + if imports := context.get("imports"): + md += "## Dependencies\n" + for imp in imports: + md += f"- {imp.get('name', imp.get('path', 'Unknown'))}\n" + md += "\n" + + if importers := context.get("importers"): + md += "## Used By\n" + for imp in importers: + md += f"- {imp.get('path', 'Unknown')}\n" + md += "\n" + + return md + + def _build_symbol_context_markdown(self, context: Dict[str, Any]) -> str: + """Build symbol context markdown without LLM.""" + symbol = context["symbol"] + symbol_type = ", ".join(symbol.get("_labels", ["Unknown"])) + + md = f"# Context for Symbol: {symbol.get('name', 'Unknown')}\n\n" + md += f"**Type**: {symbol_type}\n" + + if file_info := context.get("file"): + md += f"**Location**: {file_info.get('path', 'Unknown')}\n" + + if desc := context.get("description"): + md += f"\n## Description\n{desc.get('description', 'No description available')}\n" + + # Add other sections... 
+ return md + + def _build_basic_implementation_plan(self, change_request: str, impact_analysis: Dict[str, Any]) -> str: + """Build basic implementation plan without LLM.""" + md = f"# Implementation Plan\n\n" + md += f"## Change Request\n{change_request}\n\n" + md += f"## Impact Analysis\n" + + for entity, impact in impact_analysis.items(): + md += f"\n### {entity}\n" + if deps := impact.get("dependents"): + md += f"- **Dependents**: {len(deps)} items\n" + if files := impact.get("containing_files"): + md += f"- **Files**: {', '.join(f.get('path', 'Unknown') for f in files)}\n" + + return md + + def _extract_entities_basic(self, change_request: str) -> List[str]: + """Basic entity extraction without LLM.""" + # Simple extraction based on common patterns + import re + + # Look for quoted strings, CamelCase, snake_case + patterns = [ + r'"([^"]+)"', # Quoted strings + r'`([^`]+)`', # Backtick strings + r'\b([A-Z][a-zA-Z0-9]+)\b', # CamelCase + r'\b([a-z_][a-z0-9_]+)\b' # snake_case + ] + + entities = set() + for pattern in patterns: + matches = re.findall(pattern, change_request) + entities.update(matches) + + # Filter out common words + common_words = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for'} + entities = [e for e in entities if e.lower() not in common_words and len(e) > 2] + + return list(entities)[:20] # Limit to 20 entities \ No newline at end of file diff --git a/mcp-blarify-server/src/server.py b/mcp-blarify-server/src/server.py new file mode 100644 index 00000000..2d78a5a3 --- /dev/null +++ b/mcp-blarify-server/src/server.py @@ -0,0 +1,187 @@ +"""MCP Server for Blarify Neo4j Graph.""" + +import asyncio +import logging +from typing import Dict, Any, List, Optional +import sys +import os + +# Add parent directory to path for imports +sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) + +from mcp.server import Server, NotificationOptions +from mcp.server.models import InitializationOptions +from mcp.types 
import Tool, TextContent +from neo4j import GraphDatabase +from pydantic import BaseModel, Field + +from src.config import Config +from src.tools.context_tools import ContextTools +from src.tools.planning_tools import PlanningTools + +# Configure logging +logging.basicConfig( + level=logging.INFO, + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' +) +logger = logging.getLogger(__name__) + + +class GetContextForFilesArgs(BaseModel): + """Arguments for getContextForFiles tool.""" + file_paths: List[str] = Field( + description="List of file paths to get context for" + ) + + +class GetContextForSymbolArgs(BaseModel): + """Arguments for getContextForSymbol tool.""" + symbol_name: str = Field( + description="Name of the symbol (class, function, variable) to find" + ) + symbol_type: Optional[str] = Field( + default=None, + description="Optional type hint: 'class', 'function', 'method', etc." + ) + + +class BuildPlanForChangeArgs(BaseModel): + """Arguments for buildPlanForChange tool.""" + change_request: str = Field( + description="Description of the change to be implemented" + ) + + +class BlarifyMCPServer: + """MCP Server for Blarify graph queries.""" + + def __init__(self): + """Initialize the MCP server.""" + self.server = Server("mcp-blarify") + self.driver = None + self.context_tools = None + self.planning_tools = None + + # Validate configuration + try: + Config.validate() + except ValueError as e: + logger.error(f"Configuration error: {e}") + raise + + # Register handlers + self._register_handlers() + + def _register_handlers(self): + """Register MCP protocol handlers.""" + + @self.server.list_tools() + async def handle_list_tools() -> List[Tool]: + """List available tools.""" + return [ + Tool( + name="getContextForFiles", + description="Retrieve comprehensive context for specified files including classes, functions, dependencies, and documentation", + inputSchema=GetContextForFilesArgs.model_json_schema() + ), + Tool( + name="getContextForSymbol", 
+ description="Get detailed context for a specific symbol (class, function, variable) including definition, usage, and relationships", + inputSchema=GetContextForSymbolArgs.model_json_schema() + ), + Tool( + name="buildPlanForChange", + description="Analyze the codebase and create a detailed implementation plan for a change request", + inputSchema=BuildPlanForChangeArgs.model_json_schema() + ) + ] + + @self.server.call_tool() + async def handle_call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]: + """Handle tool calls.""" + try: + # Ensure connection is established + if not self.driver: + await self._connect_to_neo4j() + + if name == "getContextForFiles": + args = GetContextForFilesArgs(**arguments) + result = await self.context_tools.get_context_for_files(args.file_paths) + + elif name == "getContextForSymbol": + args = GetContextForSymbolArgs(**arguments) + result = await self.context_tools.get_context_for_symbol( + args.symbol_name, + args.symbol_type + ) + + elif name == "buildPlanForChange": + args = BuildPlanForChangeArgs(**arguments) + result = await self.planning_tools.build_plan_for_change(args.change_request) + + else: + result = f"Unknown tool: {name}" + + return [TextContent(type="text", text=result)] + + except Exception as e: + logger.error(f"Error handling tool {name}: {e}") + error_msg = f"Error executing {name}: {str(e)}" + return [TextContent(type="text", text=error_msg)] + + async def _connect_to_neo4j(self): + """Establish connection to Neo4j.""" + try: + logger.info(f"Connecting to Neo4j at {Config.NEO4J_URI}") + self.driver = GraphDatabase.driver( + Config.NEO4J_URI, + auth=(Config.NEO4J_USERNAME, Config.NEO4J_PASSWORD) + ) + + # Verify connection + self.driver.verify_connectivity() + logger.info("Successfully connected to Neo4j") + + # Initialize tools + self.context_tools = ContextTools(self.driver) + self.planning_tools = PlanningTools(self.driver) + + except Exception as e: + logger.error(f"Failed to connect to Neo4j: 
{e}") + raise + + async def run(self): + """Run the MCP server over the stdio transport.""" + # Local import keeps the transport wiring in one place; stdio_server() + # is the MCP SDK's stdio transport context manager. + from mcp.server.stdio import stdio_server + + try: + # Connect to Neo4j + await self._connect_to_neo4j() + + # Run the server: stdio_server() yields the (read, write) stream + # pair that Server.run() is awaited with, alongside the + # initialization options (which require server capabilities). + async with stdio_server() as (read_stream, write_stream): + logger.info("Blarify MCP Server started") + await self.server.run( + read_stream, + write_stream, + InitializationOptions( + server_name="Blarify MCP Server", + server_version="0.1.0", + capabilities=self.server.get_capabilities( + notification_options=NotificationOptions(), + experimental_capabilities={} + ) + ) + ) + + except Exception as e: + logger.error(f"Server error: {e}") + raise + finally: + # Cleanup + if self.driver: + self.driver.close() + logger.info("Closed Neo4j connection") + + +async def main(): + """Main entry point.""" + server = BlarifyMCPServer() + await server.run() + + +if __name__ == "__main__": + asyncio.run(main()) \ No newline at end of file diff --git a/mcp-blarify-server/src/tools/__init__.py b/mcp-blarify-server/src/tools/__init__.py new file mode 100644 index 00000000..32c49aa9 --- /dev/null +++ b/mcp-blarify-server/src/tools/__init__.py @@ -0,0 +1 @@ +"""MCP tools for Blarify graph queries.""" \ No newline at end of file diff --git a/mcp-blarify-server/src/tools/context_tools.py b/mcp-blarify-server/src/tools/context_tools.py new file mode 100644 index 00000000..95eeb3b4 --- /dev/null +++ b/mcp-blarify-server/src/tools/context_tools.py @@ -0,0 +1,144 @@ +"""MCP tools for context retrieval.""" + +import logging +from typing import List, Dict, Any, Optional +from neo4j import Driver + +from ..processors.graph_traversal import GraphTraversal +from ..processors.context_builder import ContextBuilder +from ..processors.llm_processor import LLMProcessor + +logger = logging.getLogger(__name__) + + +class ContextTools: + """MCP tools for retrieving context from the graph.""" + + def __init__(self, driver: Driver): + """Initialize context tools.""" + self.driver = driver + self.graph_traversal = GraphTraversal(driver) + self.context_builder = ContextBuilder() + self.llm_processor = LLMProcessor() + + async def get_context_for_files(self,
file_paths: List[str]) -> str: + """ + Retrieve comprehensive context for a list of files. + + Args: + file_paths: List of file paths to get context for + + Returns: + Markdown-formatted context for the files + """ + logger.info(f"Getting context for files: {file_paths}") + + try: + # Find matching files in the graph + file_nodes = self.graph_traversal.find_files(file_paths) + + if not file_nodes: + return f"# Files Not Found\n\nNo files matching these paths were found:\n" + \ + "\n".join(f"- {path}" for path in file_paths) + + # Get context for each file + file_contexts = [] + for file_node in file_nodes: + path = file_node.get("path", "") + context = self.graph_traversal.get_file_context(path) + if context: + file_contexts.append(context) + + # Build structured context + if self.llm_processor.enabled: + # Use LLM to organize the contexts + combined_context = [] + for context in file_contexts: + organized = self.llm_processor.organize_file_context(context) + combined_context.append(organized) + + result = "# Context for Files\n\n" + "\n---\n\n".join(combined_context) + else: + # Use context builder without LLM + result = self.context_builder.build_files_context(file_contexts) + + return result + + except Exception as e: + logger.error(f"Error getting file context: {e}") + return f"# Error\n\nFailed to retrieve context for files: {str(e)}" + + async def get_context_for_symbol(self, symbol_name: str, symbol_type: Optional[str] = None) -> str: + """ + Retrieve context for a specific symbol. + + Args: + symbol_name: Name of the symbol to find + symbol_type: Optional type hint (class, function, etc.) + + Returns: + Markdown-formatted context for the symbol + """ + logger.info(f"Getting context for symbol: {symbol_name} (type: {symbol_type})") + + try: + # Find matching symbols + symbols = self.graph_traversal.find_symbol(symbol_name, symbol_type) + + if not symbols: + return f"# Symbol Not Found\n\nNo symbol named '{symbol_name}' was found in the graph." 
+ + # Get the best match (exact match preferred) + best_match = symbols[0] + symbol_id = best_match.get("_id") + + # Get comprehensive context + context = self.graph_traversal.get_symbol_context(symbol_id) + + if not context: + return f"# Symbol Found but No Context\n\nSymbol '{symbol_name}' exists but has no associated context." + + # Get context for related symbols if any + related_contexts = [] + if len(symbols) > 1: + for symbol in symbols[1:4]: # Get up to 3 more related symbols + related_id = symbol.get("_id") + related_context = self.graph_traversal.get_symbol_context(related_id) + if related_context: + related_contexts.append(related_context) + + # Build structured context + if self.llm_processor.enabled: + # Use LLM to organize the context + main_context = self.llm_processor.organize_symbol_context(context) + + if related_contexts: + related_parts = [] + for rc in related_contexts: + related_parts.append(self.llm_processor.organize_symbol_context(rc)) + + result = main_context + "\n\n---\n\n## Other Matches\n\n" + "\n---\n\n".join(related_parts) + else: + result = main_context + else: + # Use context builder without LLM + result = self.context_builder.build_symbol_context(context, related_contexts) + + return result + + except Exception as e: + logger.error(f"Error getting symbol context: {e}") + return f"# Error\n\nFailed to retrieve context for symbol '{symbol_name}': {str(e)}" + + def _format_file_list(self, files: List[Dict[str, Any]]) -> str: + """Format a list of files for display.""" + if not files: + return "None" + + file_paths = [f.get("path", "Unknown") for f in files[:10]] + result = "\n".join(f"- `{path}`" for path in file_paths) + + if len(files) > 10: + result += f"\n- ... 
and {len(files) - 10} more files" + + return result \ No newline at end of file diff --git a/mcp-blarify-server/src/tools/planning_tools.py b/mcp-blarify-server/src/tools/planning_tools.py new file mode 100644 index 00000000..21eed8de --- /dev/null +++ b/mcp-blarify-server/src/tools/planning_tools.py @@ -0,0 +1,239 @@ +"""MCP tools for change planning.""" + +import logging +from typing import Dict, Any, List +from neo4j import Driver + +from ..processors.graph_traversal import GraphTraversal +from ..processors.context_builder import ContextBuilder +from ..processors.llm_processor import LLMProcessor + +logger = logging.getLogger(__name__) + + +class PlanningTools: + """MCP tools for planning code changes.""" + + def __init__(self, driver: Driver): + """Initialize planning tools.""" + self.driver = driver + self.graph_traversal = GraphTraversal(driver) + self.context_builder = ContextBuilder() + self.llm_processor = LLMProcessor() + + async def build_plan_for_change(self, change_request: str) -> str: + """ + Build an implementation plan for a change request. 
+ + Args: + change_request: Description of the desired change + + Returns: + Markdown-formatted implementation plan + """ + logger.info(f"Building plan for change: {change_request[:100]}...") + + try: + # Extract entities from the change request + entities = self.llm_processor.extract_entities_from_request(change_request) + logger.info(f"Extracted entities: {entities}") + + if not entities: + # If no entities extracted, return a basic plan template + return self._create_basic_plan_template(change_request) + + # Analyze impact for each entity + impact_analysis = self.graph_traversal.analyze_change_impact(entities) + + # Look for related patterns/concepts + patterns = self._find_related_patterns(change_request) + + # Build change context + change_context = self.context_builder.build_change_plan_context( + change_request, + impact_analysis, + patterns + ) + + # Generate implementation plan + if self.llm_processor.enabled: + plan = self.llm_processor.create_implementation_plan( + change_request, + impact_analysis + ) + else: + plan = self._create_detailed_plan(change_context) + + return plan + + except Exception as e: + logger.error(f"Error building change plan: {e}") + return f"# Error\n\nFailed to build implementation plan: {str(e)}" + + def _find_related_patterns(self, change_request: str) -> Dict[str, Any]: + """Find patterns or concepts related to the change request.""" + try: + # Extract potential pattern/concept keywords + keywords = ["pattern", "architecture", "design", "approach", "strategy"] + + patterns = {} + for keyword in keywords: + if keyword.lower() in change_request.lower(): + # Search for related concepts + pattern_results = self.graph_traversal.find_patterns(keyword) + if pattern_results.get("patterns"): + patterns[keyword] = pattern_results["patterns"] + + return patterns + + except Exception as e: + logger.error(f"Error finding patterns: {e}") + return {} + + def _create_basic_plan_template(self, change_request: str) -> str: + """Create a basic 
plan template when no entities are found.""" + return f"""# Implementation Plan + +## Change Request +{change_request} + +## Analysis +Unable to automatically identify specific code entities from the change request. +Please provide more specific details about: +- Which files or modules need to be modified +- Which classes or functions are involved +- What new components need to be created + +## General Implementation Steps + +### 1. Requirements Analysis +- Review the change request in detail +- Identify all affected components +- Define acceptance criteria + +### 2. Design +- Plan the technical approach +- Identify dependencies +- Design new components if needed + +### 3. Implementation +- Create/modify necessary files +- Implement core functionality +- Handle edge cases + +### 4. Testing +- Write unit tests +- Update integration tests +- Perform manual testing + +### 5. Documentation +- Update code documentation +- Update user documentation +- Document any API changes + +## Next Steps +To create a more detailed plan, please provide specific information about the code elements involved in this change. +""" + + def _create_detailed_plan(self, context: Dict[str, Any]) -> str: + """Create a detailed plan from the context.""" + change_request = context.get("change_request", "") + impact_summary = context.get("impact_summary", {}) + affected_files = context.get("affected_files", []) + test_files = context.get("test_files", []) + doc_files = context.get("documentation_files", []) + + plan = f"""# Implementation Plan + +## Change Request +{change_request} + +## Impact Analysis +- **Entities Affected**: {impact_summary.get('entities_affected', 0)} +- **Total Dependencies**: {impact_summary.get('total_dependents', 0)} +- **Files to Modify**: {impact_summary.get('files_affected', 0)} +- **Test Files**: {impact_summary.get('test_files_affected', 0)} + +## Implementation Steps + +### 1. 
Prepare Development Environment +- Create feature branch +- Ensure all tests pass before starting +- Review existing code structure + +### 2. Modify Existing Files +""" + + # Add files to modify + for i, file_path in enumerate(affected_files[:10], 1): + plan += f"\n#### {i}. Update `{file_path}`" + plan += f"\n- Review current implementation" + plan += f"\n- Apply necessary changes" + plan += f"\n- Ensure backward compatibility\n" + + if len(affected_files) > 10: + plan += f"\n... and {len(affected_files) - 10} more files\n" + + plan += """ +### 3. Create New Components (if needed) +- Identify any new files or modules required +- Follow existing project structure and patterns +- Add appropriate documentation + +### 4. Update Tests +""" + + if test_files: + plan += "Existing test files to update:\n" + for test_file in test_files[:5]: + plan += f"- `{test_file}`\n" + if len(test_files) > 5: + plan += f"- ... and {len(test_files) - 5} more test files\n" + else: + plan += "- Create new test files as needed\n" + plan += "- Ensure comprehensive test coverage\n" + + plan += """ +### 5. Update Documentation +""" + + if doc_files: + plan += "Documentation files to update:\n" + for doc_file in doc_files[:5]: + plan += f"- `{doc_file}`\n" + if len(doc_files) > 5: + plan += f"- ... and {len(doc_files) - 5} more documentation files\n" + else: + plan += "- Update relevant documentation\n" + plan += "- Add examples if applicable\n" + + plan += """ +### 6. Validation +- Run full test suite +- Perform integration testing +- Code review + +## Dependencies to Consider +""" + + # Add specific entities and their dependencies + entities = context.get("affected_entities", []) + for entity in entities[:5]: + plan += f"- **{entity}**: Check all usages and dependencies\n" + + if len(entities) > 5: + plan += f"- ... 
and {len(entities) - 5} more entities\n" + + plan += """ +## Risk Assessment +- **Breaking Changes**: Review all public APIs +- **Performance Impact**: Profile critical paths +- **Security**: Review any security implications + +## Rollback Plan +- Keep feature branch separate until fully tested +- Document any migration steps +- Prepare rollback instructions if needed +""" + + return plan \ No newline at end of file diff --git a/mcp-blarify-server/src/tools/query_builder.py b/mcp-blarify-server/src/tools/query_builder.py new file mode 100644 index 00000000..6fb019f3 --- /dev/null +++ b/mcp-blarify-server/src/tools/query_builder.py @@ -0,0 +1,202 @@ +"""Cypher query builder for graph traversals.""" + +from typing import List, Dict, Any, Optional + + +class QueryBuilder: + """Builds Cypher queries for various graph operations.""" + + @staticmethod + def find_files_query(file_paths: List[str]) -> str: + """Build query to find FILE nodes by path.""" + # Handle both absolute and relative paths + path_conditions = [] + for path in file_paths: + # Match on path ending for flexibility + path_conditions.append(f"n.path ENDS WITH '{path}'") + + where_clause = " OR ".join(path_conditions) + + return f""" + MATCH (n:FILE) + WHERE {where_clause} + RETURN n + """ + + @staticmethod + def get_file_context_query(file_path: str, max_depth: int = 2) -> str: + """Build query to get comprehensive context for a file.""" + return f""" + // Find the file node + MATCH (file:FILE) + WHERE file.path ENDS WITH $file_path + + // Get direct contents (classes, functions) + OPTIONAL MATCH (file)-[:CONTAINS]->(content) + WHERE content:CLASS OR content:FUNCTION + + // Get imports and dependencies + OPTIONAL MATCH (file)-[:IMPORTS]->(imported) + + // Get files that import this file + OPTIONAL MATCH (importer:FILE)-[:IMPORTS]->(file) + + // Get documentation nodes + OPTIONAL MATCH (doc:DOCUMENTATION_FILE)-[:DOCUMENTS]->(file) + + // Get LLM descriptions if available + OPTIONAL MATCH 
(file)<-[:DESCRIBES]-(file_desc:DESCRIPTION)
+        OPTIONAL MATCH (content)<-[:DESCRIBES]-(content_desc:DESCRIPTION)
+
+        // Get related filesystem nodes
+        OPTIONAL MATCH (file)-[:HAS_FILE]->(fs:FILESYSTEM_FILE)
+
+        RETURN file,
+            COLLECT(DISTINCT content) as contents,
+            COLLECT(DISTINCT imported) as imports,
+            COLLECT(DISTINCT importer) as importers,
+            COLLECT(DISTINCT doc) as documentation,
+            file_desc,
+            COLLECT(DISTINCT content_desc) as content_descriptions,
+            fs as filesystem_node
+        """
+
+    @staticmethod
+    def find_symbol_query(symbol_name: str, symbol_type: Optional[str] = None) -> str:
+        """Build query to find nodes matching a symbol name."""
+        type_filter = ""
+        if symbol_type:
+            type_filter = f"AND (n:{symbol_type.upper()})"
+
+        # The UNION branches are wrapped in a CALL subquery because Cypher
+        # does not allow ORDER BY or LIMIT directly after a UNION.
+        return f"""
+        CALL {{
+            // Exact match
+            MATCH (n)
+            WHERE n.name = $symbol_name {type_filter}
+            RETURN n, 'exact' as match_type
+
+            UNION
+
+            // Case-insensitive match
+            MATCH (n)
+            WHERE toLower(n.name) = toLower($symbol_name) {type_filter}
+            RETURN n, 'case_insensitive' as match_type
+
+            UNION
+
+            // Contains match (for partial matches)
+            MATCH (n)
+            WHERE toLower(n.name) CONTAINS toLower($symbol_name) {type_filter}
+            RETURN n, 'partial' as match_type
+        }}
+        RETURN n, match_type
+        ORDER BY
+            CASE match_type
+                WHEN 'exact' THEN 0
+                WHEN 'case_insensitive' THEN 1
+                ELSE 2
+            END,
+            size(n.name)
+        LIMIT 10
+        """
+
+    @staticmethod
+    def get_symbol_context_query(symbol_id: str) -> str:
+        """Build query to get comprehensive context for a symbol."""
+        # Plain (non-f) string: the literal Cypher map {name: symbol.name}
+        # below would otherwise be parsed as an f-string replacement field.
+        return """
+        // Find the symbol node
+        MATCH (symbol)
+        WHERE id(symbol) = $symbol_id
+
+        // Get containing file
+        OPTIONAL MATCH (file:FILE)-[:CONTAINS]->(symbol)
+
+        // Get inheritance/implementation relationships
+        OPTIONAL MATCH (symbol)-[:INHERITS_FROM]->(parent)
+        OPTIONAL MATCH (symbol)-[:IMPLEMENTS]->(interface)
+        OPTIONAL MATCH (child)-[:INHERITS_FROM]->(symbol)
+
+        // Get method/attribute relationships for classes
+        OPTIONAL MATCH (symbol)-[:HAS_METHOD]->(method:FUNCTION)
+        OPTIONAL MATCH (symbol)-[:HAS_ATTRIBUTE]->(attr)
+
+        // Get usage relationships
+        OPTIONAL MATCH (caller)-[:CALLS]->(symbol)
+        WHERE caller:FUNCTION OR caller:METHOD
+        OPTIONAL MATCH (symbol)-[:CALLS]->(callee)
+
+        // Get reference relationships
+        OPTIONAL MATCH (referencer)-[:REFERENCES]->(symbol)
+
+        // Get documentation (doc_entity is a fresh variable; reusing doc
+        // here would silently constrain the DOCUMENTATION_FILE match above)
+        OPTIONAL MATCH (doc:DOCUMENTATION_FILE)-[:DOCUMENTS]->(symbol)
+        OPTIONAL MATCH (doc_entity:DOCUMENTED_ENTITY {name: symbol.name})
+
+        // Get LLM description
+        OPTIONAL MATCH (symbol)<-[:DESCRIBES]-(description:DESCRIPTION)
+
+        // Get related concepts
+        OPTIONAL MATCH (symbol)-[:IMPLEMENTS_CONCEPT]->(concept:CONCEPT)
+
+        RETURN symbol,
+            file,
+            COLLECT(DISTINCT parent) as parents,
+            COLLECT(DISTINCT interface) as interfaces,
+            COLLECT(DISTINCT child) as children,
+            COLLECT(DISTINCT method) as methods,
+            COLLECT(DISTINCT attr) as attributes,
+            COLLECT(DISTINCT caller) as callers,
+            COLLECT(DISTINCT callee) as callees,
+            COLLECT(DISTINCT referencer) as referencers,
+            COLLECT(DISTINCT doc) + COLLECT(DISTINCT doc_entity) as documentation,
+            description,
+            COLLECT(DISTINCT concept) as concepts
+        """
+
+    @staticmethod
+    def analyze_change_impact_query(entity_names: List[str]) -> str:
+        """Build query to analyze impact of changing specific entities."""
+        name_list = ", ".join([f"'{name}'" for name in entity_names])
+
+        return f"""
+        // Find all nodes matching the entity names
+        MATCH (target)
+        WHERE target.name IN [{name_list}]
+
+        // Find direct dependencies
+        OPTIONAL MATCH (target)<-[:CALLS|REFERENCES|IMPORTS|INHERITS_FROM|IMPLEMENTS]-(dependent)
+
+        // Find containing files
+        OPTIONAL MATCH (file:FILE)-[:CONTAINS]->(target)
+
+        // Find test files
+        OPTIONAL MATCH (test:FILE)-[:TESTS]->(target)
+        WHERE test.path CONTAINS 'test'
+
+        // Find related documentation
+        OPTIONAL MATCH (doc:DOCUMENTATION_FILE)-[:DOCUMENTS]->(target)
+
+        // Aggregate results
+        RETURN target,
+            COLLECT(DISTINCT dependent) as dependents,
+            COLLECT(DISTINCT file) as containing_files,
+            COLLECT(DISTINCT test) as test_files, +
COLLECT(DISTINCT doc) as documentation + """ + + @staticmethod + def find_related_patterns_query(concept_name: str) -> str: + """Build query to find code implementing specific patterns/concepts.""" + return f""" + // Find concept nodes + MATCH (concept:CONCEPT) + WHERE concept.name CONTAINS $concept_name + + // Find implementing code + OPTIONAL MATCH (implementer)-[:IMPLEMENTS_CONCEPT]->(concept) + + // Find related documentation + OPTIONAL MATCH (doc:DOCUMENTATION_FILE)-[:CONTAINS_CONCEPT]->(concept) + + RETURN concept, + COLLECT(DISTINCT implementer) as implementers, + COLLECT(DISTINCT doc) as documentation + """ \ No newline at end of file diff --git a/mcp-blarify-server/tests/__init__.py b/mcp-blarify-server/tests/__init__.py new file mode 100644 index 00000000..9dd6b54a --- /dev/null +++ b/mcp-blarify-server/tests/__init__.py @@ -0,0 +1 @@ +"""Tests for MCP Blarify Server.""" \ No newline at end of file diff --git a/mcp-blarify-server/tests/setup_test_graph.py b/mcp-blarify-server/tests/setup_test_graph.py new file mode 100644 index 00000000..beb24df0 --- /dev/null +++ b/mcp-blarify-server/tests/setup_test_graph.py @@ -0,0 +1,397 @@ +"""Set up a test Blarify graph in Neo4j for integration testing.""" + +import logging +from neo4j import GraphDatabase +import time + +logging.basicConfig(level=logging.INFO) +logger = logging.getLogger(__name__) + + +class TestGraphSetup: + """Sets up a test graph that mimics Blarify output.""" + + def __init__(self, uri="bolt://localhost:7687", user="neo4j", password="testpassword"): + self.driver = GraphDatabase.driver(uri, auth=(user, password)) + logger.info(f"Connected to Neo4j at {uri}") + + def close(self): + self.driver.close() + + def clear_database(self): + """Clear all nodes and relationships.""" + with self.driver.session() as session: + session.run("MATCH (n) DETACH DELETE n") + logger.info("Cleared database") + + def create_test_graph(self): + """Create a test graph with various node types and relationships.""" + 
with self.driver.session() as session: + # Create folder structure + session.run(""" + CREATE (root:FOLDER {path: 'file:///project', name: 'project'}) + CREATE (src:FOLDER {path: 'file:///project/src', name: 'src'}) + CREATE (services:FOLDER {path: 'file:///project/src/services', name: 'services'}) + CREATE (controllers:FOLDER {path: 'file:///project/src/controllers', name: 'controllers'}) + CREATE (models:FOLDER {path: 'file:///project/src/models', name: 'models'}) + CREATE (tests:FOLDER {path: 'file:///project/tests', name: 'tests'}) + CREATE (docs:FOLDER {path: 'file:///project/docs', name: 'docs'}) + + CREATE (root)-[:CONTAINS]->(src) + CREATE (root)-[:CONTAINS]->(tests) + CREATE (root)-[:CONTAINS]->(docs) + CREATE (src)-[:CONTAINS]->(services) + CREATE (src)-[:CONTAINS]->(controllers) + CREATE (src)-[:CONTAINS]->(models) + """) + logger.info("Created folder structure") + + # Create files and code nodes + session.run(""" + // User model file + CREATE (user_model_file:FILE { + path: 'file:///project/src/models/user.py', + name: 'user.py', + extension: '.py' + }) + CREATE (user_class:CLASS { + name: 'User', + path: 'file:///project/src/models/user.py', + line: 10 + }) + CREATE (user_init:METHOD:FUNCTION { + name: '__init__', + path: 'file:///project/src/models/user.py', + line: 15 + }) + CREATE (user_validate:METHOD:FUNCTION { + name: 'validate', + path: 'file:///project/src/models/user.py', + line: 25 + }) + + CREATE (user_model_file)-[:CONTAINS]->(user_class) + CREATE (user_class)-[:HAS_METHOD]->(user_init) + CREATE (user_class)-[:HAS_METHOD]->(user_validate) + + // User service file + CREATE (user_service_file:FILE { + path: 'file:///project/src/services/user_service.py', + name: 'user_service.py', + extension: '.py' + }) + CREATE (base_service:CLASS { + name: 'BaseService', + path: 'file:///project/src/services/base.py', + line: 5 + }) + CREATE (user_service:CLASS { + name: 'UserService', + path: 'file:///project/src/services/user_service.py', + line: 20 + }) 
+ CREATE (create_user:METHOD:FUNCTION { + name: 'create_user', + path: 'file:///project/src/services/user_service.py', + line: 30 + }) + CREATE (get_user:METHOD:FUNCTION { + name: 'get_user', + path: 'file:///project/src/services/user_service.py', + line: 45 + }) + CREATE (update_user:METHOD:FUNCTION { + name: 'update_user', + path: 'file:///project/src/services/user_service.py', + line: 60 + }) + + CREATE (user_service_file)-[:CONTAINS]->(user_service) + CREATE (user_service)-[:INHERITS_FROM]->(base_service) + CREATE (user_service)-[:HAS_METHOD]->(create_user) + CREATE (user_service)-[:HAS_METHOD]->(get_user) + CREATE (user_service)-[:HAS_METHOD]->(update_user) + + // Auth service file + CREATE (auth_service_file:FILE { + path: 'file:///project/src/services/auth_service.py', + name: 'auth_service.py', + extension: '.py' + }) + CREATE (auth_service:CLASS { + name: 'AuthService', + path: 'file:///project/src/services/auth_service.py', + line: 15 + }) + CREATE (login:METHOD:FUNCTION { + name: 'login', + path: 'file:///project/src/services/auth_service.py', + line: 25 + }) + CREATE (verify_token:METHOD:FUNCTION { + name: 'verify_token', + path: 'file:///project/src/services/auth_service.py', + line: 40 + }) + + CREATE (auth_service_file)-[:CONTAINS]->(auth_service) + CREATE (auth_service)-[:HAS_METHOD]->(login) + CREATE (auth_service)-[:HAS_METHOD]->(verify_token) + + // User controller file + CREATE (user_controller_file:FILE { + path: 'file:///project/src/controllers/user_controller.py', + name: 'user_controller.py', + extension: '.py' + }) + CREATE (user_controller:CLASS { + name: 'UserController', + path: 'file:///project/src/controllers/user_controller.py', + line: 10 + }) + CREATE (handle_create:METHOD:FUNCTION { + name: 'handle_create', + path: 'file:///project/src/controllers/user_controller.py', + line: 20 + }) + CREATE (handle_get:METHOD:FUNCTION { + name: 'handle_get', + path: 'file:///project/src/controllers/user_controller.py', + line: 35 + }) + + CREATE 
(user_controller_file)-[:CONTAINS]->(user_controller)
+                CREATE (user_controller)-[:HAS_METHOD]->(handle_create)
+                CREATE (user_controller)-[:HAS_METHOD]->(handle_get)
+            """)
+            logger.info("Created code nodes")
+
+            # Create relationships between code elements.  Cypher requires a
+            # WITH clause between an updating clause (CREATE) and a later
+            # MATCH, and 'create' is a reserved keyword, so the create_user
+            # method is bound as create_fn.
+            session.run("""
+                MATCH (us:CLASS {name: 'UserService'})
+                MATCH (uc:CLASS {name: 'User'})
+                MATCH (ctrl:CLASS {name: 'UserController'})
+                MATCH (auth:CLASS {name: 'AuthService'})
+
+                // UserService uses User model
+                CREATE (us)-[:USES]->(uc)
+
+                // UserController uses UserService
+                CREATE (ctrl)-[:USES]->(us)
+
+                // AuthService uses UserService
+                CREATE (auth)-[:USES]->(us)
+
+                // Method calls
+                WITH *
+                MATCH (create_fn:FUNCTION {name: 'create_user'})
+                MATCH (validate:FUNCTION {name: 'validate'})
+                CREATE (create_fn)-[:CALLS]->(validate)
+
+                WITH *
+                MATCH (login:FUNCTION {name: 'login'})
+                MATCH (get:FUNCTION {name: 'get_user'})
+                CREATE (login)-[:CALLS]->(get)
+
+                WITH *
+                MATCH (handle:FUNCTION {name: 'handle_create'})
+                CREATE (handle)-[:CALLS]->(create_fn)
+            """)
+            logger.info("Created code relationships")
+
+            # Create import relationships
+            session.run("""
+                MATCH (usf:FILE {name: 'user_service.py'})
+                MATCH (umf:FILE {name: 'user.py'})
+                CREATE (usf)-[:IMPORTS]->(umf)
+
+                WITH *
+                MATCH (asf:FILE {name: 'auth_service.py'})
+                CREATE (asf)-[:IMPORTS]->(usf)
+
+                WITH *
+                MATCH (ucf:FILE {name: 'user_controller.py'})
+                CREATE (ucf)-[:IMPORTS]->(usf)
+            """)
+            logger.info("Created import relationships")
+
+            # Create LLM description nodes
+            session.run("""
+                MATCH (us:CLASS {name: 'UserService'})
+                CREATE (us_desc:DESCRIPTION {
+                    description: 'Core service for user management operations. Handles CRUD operations for users with validation and security checks.',
+                    node_id: id(us)
+                })
+                CREATE (us_desc)-[:DESCRIBES]->(us)
+
+                WITH *
+                MATCH (auth:CLASS {name: 'AuthService'})
+                CREATE (auth_desc:DESCRIPTION {
+                    description: 'Authentication service handling user login, token generation, and session management.',
+                    node_id: id(auth)
+                })
+                CREATE (auth_desc)-[:DESCRIBES]->(auth)
+            """)
+            logger.info("Created LLM descriptions")
+
+            # Create documentation nodes
+            session.run("""
+                CREATE (readme:DOCUMENTATION_FILE {
+                    path: 'file:///project/README.md',
+                    name: 'README.md',
+                    doc_type: 'md'
+                })
+                CREATE (api_doc:DOCUMENTATION_FILE {
+                    path: 'file:///project/docs/api.md',
+                    name: 'api.md',
+                    doc_type: 'md'
+                })
+
+                // Documentation concepts
+                CREATE (auth_concept:CONCEPT {
+                    name: 'JWT Authentication',
+                    description: 'JSON Web Token based authentication system'
+                })
+                CREATE (rest_concept:CONCEPT {
+                    name: 'REST API',
+                    description: 'RESTful API design patterns'
+                })
+
+                CREATE (readme)-[:CONTAINS_CONCEPT]->(auth_concept)
+                CREATE (api_doc)-[:CONTAINS_CONCEPT]->(rest_concept)
+
+                // Link documentation to code
+                WITH *
+                MATCH (auth:CLASS {name: 'AuthService'})
+                CREATE (auth_concept)-[:DOCUMENTS]->(auth)
+
+                WITH *
+                MATCH (ctrl:CLASS {name: 'UserController'})
+                CREATE (rest_concept)-[:DOCUMENTS]->(ctrl)
+            """)
+            logger.info("Created documentation nodes")
+
+            # Create filesystem nodes
+            session.run("""
+                CREATE (fs_src:FILESYSTEM_DIRECTORY {
+                    path: '/project/src',
+                    name: 'src',
+                    size: 4096
+                })
+                CREATE (fs_services:FILESYSTEM_DIRECTORY {
+                    path: '/project/src/services',
+                    name: 'services',
+                    size: 4096
+                })
+                CREATE (fs_user_service:FILESYSTEM_FILE {
+                    path: '/project/src/services/user_service.py',
+                    name: 'user_service.py',
+                    size: 2048,
+                    extension: 'py'
+                })
+
+                CREATE (fs_src)-[:HAS_CHILD]->(fs_services)
+                CREATE (fs_services)-[:HAS_CHILD]->(fs_user_service)
+
+                // Link filesystem to code
+                WITH *
+                MATCH (usf:FILE {name: 'user_service.py'})
+                CREATE (fs_user_service)-[:REPRESENTS]->(usf)
+            """)
+            logger.info("Created filesystem nodes")
+
+            # Create test files
+            session.run("""
+                CREATE (test_file:FILE {
+                    path: 'file:///project/tests/test_user_service.py',
+                    name: 'test_user_service.py',
+                    extension: '.py'
+                })
+
+                WITH *
+                MATCH (us:CLASS {name: 'UserService'})
+                CREATE (test_file)-[:TESTS]->(us)
+            """)
+            logger.info("Created test relationships")
+
+            # Add file relationships to folders
+            session.run("""
+                MATCH (models:FOLDER {name: 'models'})
+                MATCH (umf:FILE {name: 'user.py'})
+                CREATE (models)-[:CONTAINS]->(umf)
+
+                WITH *
+                MATCH (services:FOLDER {name: 'services'})
+                MATCH (usf:FILE {name: 'user_service.py'})
+                MATCH (asf:FILE {name: 'auth_service.py'})
+                CREATE (services)-[:CONTAINS]->(usf)
+                CREATE (services)-[:CONTAINS]->(asf)
+
+                WITH *
+                MATCH (controllers:FOLDER {name: 'controllers'})
+                MATCH (ucf:FILE {name: 'user_controller.py'})
+                CREATE (controllers)-[:CONTAINS]->(ucf)
+
+                WITH *
+                MATCH (tests:FOLDER {name: 'tests'})
+                MATCH (tf:FILE {name: 'test_user_service.py'})
+                CREATE (tests)-[:CONTAINS]->(tf)
+
+                WITH *
+                MATCH (docs:FOLDER {name: 'docs'})
+                MATCH (readme:DOCUMENTATION_FILE {name: 'README.md'})
+                MATCH (api:DOCUMENTATION_FILE {name: 'api.md'})
+                CREATE (docs)-[:CONTAINS]->(readme)
+                CREATE (docs)-[:CONTAINS]->(api)
+            """)
+            logger.info("Created folder relationships")
+
+    def verify_graph(self):
+        """Verify the graph was created correctly."""
+        with self.driver.session() as session:
+            # Count nodes by type
+            result = session.run("""
+                MATCH (n)
+                RETURN labels(n) as labels, count(n) as count
+                ORDER BY count DESC
+            """)
+
+            print("\nNode counts by label:")
+            for record in result:
+                print(f"  {record['labels']}: {record['count']}")
+
+            # Count relationships
+            result = session.run("""
+                MATCH ()-[r]->()
+                RETURN type(r) as type, count(r) as count
+                ORDER BY count DESC
+            """)
+
+            print("\nRelationship counts by type:")
+            for record in result:
+                print(f"  {record['type']}: {record['count']}")
+
+            # Sample queries
+            print("\nSample query
- UserService context:") + result = session.run(""" + MATCH (us:CLASS {name: 'UserService'}) + OPTIONAL MATCH (us)-[:HAS_METHOD]->(m) + OPTIONAL MATCH (us)-[:INHERITS_FROM]->(parent) + OPTIONAL MATCH (us)<-[:USES]-(caller) + RETURN us.name as name, + collect(DISTINCT m.name) as methods, + parent.name as inherits_from, + collect(DISTINCT caller.name) as used_by + """) + + for record in result: + print(f" Name: {record['name']}") + print(f" Methods: {record['methods']}") + print(f" Inherits from: {record['inherits_from']}") + print(f" Used by: {record['used_by']}") + + +def main(): + """Set up the test graph.""" + # Wait for Neo4j to be ready + print("Waiting for Neo4j to start...") + time.sleep(5) + + setup = TestGraphSetup() + try: + setup.clear_database() + setup.create_test_graph() + setup.verify_graph() + print("\nTest graph created successfully!") + finally: + setup.close() + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/mcp-blarify-server/tests/test_context_builder.py b/mcp-blarify-server/tests/test_context_builder.py new file mode 100644 index 00000000..c437bcf1 --- /dev/null +++ b/mcp-blarify-server/tests/test_context_builder.py @@ -0,0 +1,117 @@ +"""Tests for context builder.""" + +import pytest +from src.processors.context_builder import ContextBuilder + + +class TestContextBuilder: + """Test context builder functionality.""" + + @pytest.fixture + def context_builder(self): + """Create a context builder instance.""" + return ContextBuilder() + + def test_build_files_context_empty(self, context_builder): + """Test building context with no files.""" + result = context_builder.build_files_context([]) + assert "No Files Found" in result + + def test_build_files_context_single_file(self, context_builder): + """Test building context for a single file.""" + file_contexts = [{ + "file": { + "path": "/src/main.py", + "name": "main.py", + "_labels": ["FILE"] + }, + "contents": [ + {"name": "main", "_labels": ["FUNCTION"]}, + 
{"name": "App", "_labels": ["CLASS"]} + ], + "imports": [ + {"name": "os", "path": "os"}, + {"name": "sys", "path": "sys"} + ] + }] + + result = context_builder.build_files_context(file_contexts) + assert "File: main.py" in result + assert "Path**: `/src/main.py`" in result + assert "FUNCTIONs: main" in result + assert "CLASSs: App" in result + assert "os" in result + assert "sys" in result + + def test_build_symbol_context(self, context_builder): + """Test building context for a symbol.""" + symbol_context = { + "symbol": { + "name": "UserService", + "_labels": ["CLASS"], + "_id": 123 + }, + "file": { + "path": "/src/services/user_service.py" + }, + "methods": [ + {"name": "create_user", "_labels": ["METHOD"]}, + {"name": "get_user", "_labels": ["METHOD"]} + ], + "callers": [ + {"name": "UserController", "path": "/src/controllers/user.py"} + ] + } + + result = context_builder.build_symbol_context(symbol_context) + assert "Symbol: UserService" in result + assert "Type**: CLASS" in result + assert "Location**: `/src/services/user_service.py`" in result + assert "create_user" in result + assert "get_user" in result + assert "Called by**: 1 locations" in result + + def test_build_change_plan_context(self, context_builder): + """Test building context for change planning.""" + impact_analysis = { + "UserService": { + "target": {"name": "UserService"}, + "dependents": [{"name": "UserController"}], + "containing_files": [{"path": "/src/services/user_service.py"}], + "test_files": [{"path": "/tests/test_user_service.py"}] + } + } + + context = context_builder.build_change_plan_context( + "Add email verification", + impact_analysis + ) + + assert context["change_request"] == "Add email verification" + assert context["affected_entities"] == ["UserService"] + assert "/src/services/user_service.py" in context["affected_files"] + assert "/tests/test_user_service.py" in context["test_files"] + assert context["impact_summary"]["entities_affected"] == 1 + + def 
test_truncate_context(self, context_builder): + """Test context truncation.""" + # Create a long context + long_text = "x" * (context_builder.max_context_length + 1000) + + result = context_builder._truncate_context(long_text) + assert len(result) <= context_builder.max_context_length + assert "context truncated" in result + + def test_group_by_type(self, context_builder): + """Test grouping nodes by type.""" + nodes = [ + {"name": "func1", "_labels": ["FUNCTION"]}, + {"name": "func2", "_labels": ["FUNCTION"]}, + {"name": "Class1", "_labels": ["CLASS"]}, + {"name": "var1", "_labels": ["VARIABLE"]} + ] + + grouped = context_builder._group_by_type(nodes) + assert len(grouped["FUNCTION"]) == 2 + assert len(grouped["CLASS"]) == 1 + assert len(grouped["VARIABLE"]) == 1 \ No newline at end of file diff --git a/mcp-blarify-server/tests/test_integration.py b/mcp-blarify-server/tests/test_integration.py new file mode 100644 index 00000000..5c900fdf --- /dev/null +++ b/mcp-blarify-server/tests/test_integration.py @@ -0,0 +1,228 @@ +"""Integration tests for MCP server with real Neo4j.""" + +import pytest +import asyncio +import os +from neo4j import GraphDatabase +import subprocess +import time + +# Add parent directory to path +import sys +sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) + +from src.server import BlarifyMCPServer +from src.tools.context_tools import ContextTools +from src.tools.planning_tools import PlanningTools + + +class TestIntegration: + """Integration tests with real Neo4j database.""" + + @classmethod + def setup_class(cls): + """Set up test environment.""" + # Check if Neo4j is running + try: + driver = GraphDatabase.driver( + "bolt://localhost:7687", + auth=("neo4j", "testpassword") + ) + driver.verify_connectivity() + driver.close() + print("Neo4j is already running") + except Exception: + print("Starting Neo4j with docker-compose...") + subprocess.run(["docker-compose", "up", "-d"], + 
cwd=os.path.dirname(os.path.dirname(__file__))) + time.sleep(10) # Wait for Neo4j to start + + # Set up test graph + print("Setting up test graph...") + subprocess.run([sys.executable, "tests/setup_test_graph.py"], + cwd=os.path.dirname(os.path.dirname(__file__))) + + @pytest.fixture + def neo4j_driver(self): + """Create Neo4j driver.""" + driver = GraphDatabase.driver( + "bolt://localhost:7687", + auth=("neo4j", "testpassword") + ) + yield driver + driver.close() + + @pytest.fixture + async def mcp_server(self): + """Create MCP server instance.""" + os.environ.update({ + "NEO4J_URI": "bolt://localhost:7687", + "NEO4J_USERNAME": "neo4j", + "NEO4J_PASSWORD": "testpassword", + "NEO4J_DATABASE": "neo4j" + }) + + server = BlarifyMCPServer() + await server._connect_to_neo4j() + yield server + if server.driver: + server.driver.close() + + @pytest.mark.asyncio + async def test_get_context_for_files_real(self, mcp_server): + """Test getting context for real files in the graph.""" + result = await mcp_server.context_tools.get_context_for_files([ + "user_service.py", + "auth_service.py" + ]) + + assert "Context for Files" in result + assert "user_service.py" in result + assert "UserService" in result + assert "create_user" in result + assert "auth_service.py" in result + assert "AuthService" in result + + # Check for relationships + assert "Imports" in result or "Dependencies" in result + assert "user.py" in result # UserService imports User model + + @pytest.mark.asyncio + async def test_get_context_for_symbol_real(self, mcp_server): + """Test getting context for a real symbol in the graph.""" + result = await mcp_server.context_tools.get_context_for_symbol( + "UserService", + "class" + ) + + assert "UserService" in result + assert "CLASS" in result + assert "/src/services/user_service.py" in result + + # Check for methods + assert "create_user" in result + assert "get_user" in result + assert "update_user" in result + + # Check for inheritance + assert "BaseService" in 
result
+
+        # Check for usage
+        assert "UserController" in result or "AuthService" in result
+
+    @pytest.mark.asyncio
+    async def test_build_plan_for_change_real(self, mcp_server):
+        """Test building a change plan with real graph data."""
+        result = await mcp_server.planning_tools.build_plan_for_change(
+            "Add email verification to the UserService"
+        )
+
+        assert "Implementation Plan" in result
+        assert "UserService" in result
+
+        # Should identify affected files
+        assert "user_service.py" in result
+
+        # Should identify test files
+        assert "test_user_service.py" in result
+
+        # Should have implementation steps
+        assert "Modify" in result or "Update" in result
+
+    @pytest.mark.asyncio
+    async def test_complex_traversal(self, neo4j_driver):
+        """Test complex graph traversal queries."""
+        context_tools = ContextTools(neo4j_driver)
+
+        # Get context for UserController - should show full dependency chain
+        result = await context_tools.get_context_for_symbol("UserController")
+
+        # Should show that UserController uses UserService
+        assert "UserService" in result
+
+        # Should show methods
+        assert "handle_create" in result or "handle_get" in result
+
+    @pytest.mark.asyncio
+    async def test_documentation_integration(self, mcp_server):
+        """Test documentation node integration."""
+        # Get context for a file that has documentation
+        result = await mcp_server.context_tools.get_context_for_symbol("AuthService")
+
+        # Should include concept if documentation nodes are linked
+        if "JWT Authentication" in result:
+            assert "Authentication" in result
+
+    @pytest.mark.asyncio
+    async def test_mcp_tool_calls(self, mcp_server):
+        """Test MCP tool calls through the server interface."""
+        # Test listing tools
+        tools = await mcp_server.server.list_tools()
+        assert len(tools) == 3
+        assert any(t.name == "getContextForFiles" for t in tools)
+
+        # Test calling getContextForFiles
+        result = await mcp_server.server.call_tool(
+            "getContextForFiles",
+            {"file_paths": ["user_service.py"]}
+        )
+        assert len(result) == 1
+        assert "UserService" in result[0].text
+
+        # Test calling getContextForSymbol
+        result = await mcp_server.server.call_tool(
+            "getContextForSymbol",
+            {"symbol_name": "UserService", "symbol_type": "class"}
+        )
+        assert len(result) == 1
+        assert "UserService" in result[0].text
+
+        # Test calling buildPlanForChange
+        result = await mcp_server.server.call_tool(
+            "buildPlanForChange",
+            {"change_request": "Add password reset functionality"}
+        )
+        assert len(result) == 1
+        assert "Implementation Plan" in result[0].text
+
+    def test_verify_test_data(self, neo4j_driver):
+        """Verify test data was loaded correctly."""
+        with neo4j_driver.session() as session:
+            # Check UserService exists
+            result = session.run(
+                "MATCH (n:CLASS {name: 'UserService'}) RETURN n"
+            ).single()
+            assert result is not None
+
+            # Check relationships exist
+            result = session.run("""
+                MATCH (us:CLASS {name: 'UserService'})-[:HAS_METHOD]->(m)
+                RETURN count(m) as method_count
+            """).single()
+            assert result["method_count"] >= 3
+
+            # Check file nodes exist
+            result = session.run("""
+                MATCH (f:FILE)
+                RETURN count(f) as file_count
+            """).single()
+            assert result["file_count"] >= 4
+
+
+@pytest.mark.asyncio
+async def test_server_lifecycle():
+    """Test server startup and shutdown."""
+    server = BlarifyMCPServer()
+
+    # Test connection
+    await server._connect_to_neo4j()
+    assert server.driver is not None
+    assert server.context_tools is not None
+    assert server.planning_tools is not None
+
+    # Test tool availability
+    tools = await server.server.list_tools()
+    assert len(tools) == 3
+
+    # Cleanup
+    server.driver.close()
\ No newline at end of file
diff --git a/mcp-blarify-server/tests/test_llm_processor.py b/mcp-blarify-server/tests/test_llm_processor.py
new file mode 100644
index 00000000..ae42d70b
--- /dev/null
+++ b/mcp-blarify-server/tests/test_llm_processor.py
@@ -0,0 +1,109 @@
+"""Tests for LLM processor."""
+
+import pytest
+from unittest.mock import Mock, patch
+
+from src.processors.llm_processor import LLMProcessor
+
+
+class TestLLMProcessor:
+    """Test LLM processor functionality."""
+
+    @pytest.fixture
+    def llm_processor_disabled(self):
+        """Create an LLM processor with LLM disabled."""
+        with patch.dict('os.environ', {'AZURE_OPENAI_API_KEY': ''}):
+            return LLMProcessor()
+
+    @pytest.fixture
+    def llm_processor_enabled(self):
+        """Create an LLM processor with mocked LLM."""
+        with patch.dict('os.environ', {
+            'AZURE_OPENAI_API_KEY': 'test-key',
+            'AZURE_OPENAI_ENDPOINT': 'https://test.openai.azure.com/',
+            'AZURE_OPENAI_DEPLOYMENT_NAME': 'test-deployment'
+        }):
+            with patch('src.processors.llm_processor.AzureOpenAI'):
+                return LLMProcessor()
+
+    def test_llm_disabled(self, llm_processor_disabled):
+        """Test LLM processor when disabled."""
+        assert not llm_processor_disabled.enabled
+        assert llm_processor_disabled.client is None
+
+    def test_organize_file_context_without_llm(self, llm_processor_disabled):
+        """Test organizing file context without LLM."""
+        context = {
+            "file": {"path": "/src/main.py"},
+            "contents": [{"name": "main", "_labels": ["FUNCTION"]}]
+        }
+
+        result = llm_processor_disabled.organize_file_context(context)
+        assert "Context for File: /src/main.py" in result
+        assert "FUNCTION: main" in result
+
+    def test_organize_symbol_context_without_llm(self, llm_processor_disabled):
+        """Test organizing symbol context without LLM."""
+        context = {
+            "symbol": {"name": "UserService", "_labels": ["CLASS"]},
+            "file": {"path": "/src/services/user.py"}
+        }
+
+        result = llm_processor_disabled.organize_symbol_context(context)
+        assert "Context for Symbol: UserService" in result
+        assert "Type**: CLASS" in result
+
+    def test_extract_entities_basic(self, llm_processor_disabled):
+        """Test basic entity extraction without LLM."""
+        change_request = "Update the UserService class and AuthController to add email verification"
+
+        entities = llm_processor_disabled.extract_entities_from_request(change_request)
+        assert "UserService" in entities
+        assert "AuthController" in entities
+
+    def test_create_implementation_plan_without_llm(self, llm_processor_disabled):
+        """Test creating implementation plan without LLM."""
+        change_request = "Add email verification"
+        impact_analysis = {
+            "UserService": {
+                "target": {"name": "UserService"},
+                "dependents": [{"name": "UserController"}],
+                "containing_files": [{"path": "/src/services/user.py"}]
+            }
+        }
+
+        plan = llm_processor_disabled.create_implementation_plan(change_request, impact_analysis)
+        assert "Implementation Plan" in plan
+        assert "Add email verification" in plan
+        assert "Entities Affected: 1" in plan
+
+    @patch('src.processors.llm_processor.AzureOpenAI')
+    def test_organize_file_context_with_llm(self, mock_azure, llm_processor_enabled):
+        """Test organizing file context with LLM."""
+        # Mock LLM response
+        mock_response = Mock()
+        mock_response.choices = [Mock(message=Mock(content="# Organized Context\nFile details..."))]
+        llm_processor_enabled.client.chat.completions.create.return_value = mock_response
+
+        context = {
+            "file": {"path": "/src/main.py"},
+            "contents": []
+        }
+
+        result = llm_processor_enabled.organize_file_context(context)
+        assert "Organized Context" in result
+
+    def test_extract_entities_from_camelcase(self, llm_processor_disabled):
+        """Test entity extraction from CamelCase."""
+        change_request = "The UserService and PaymentProcessor need updates"
+        entities = llm_processor_disabled._extract_entities_basic(change_request)
+
+        assert "UserService" in entities
+        assert "PaymentProcessor" in entities
+
+    def test_extract_entities_from_quotes(self, llm_processor_disabled):
+        """Test entity extraction from quoted strings."""
+        change_request = 'Update "user_service.py" and `auth_controller.js`'
+        entities = llm_processor_disabled._extract_entities_basic(change_request)
+
+        assert "user_service.py" in entities
+        assert "auth_controller.js" in entities
\ No newline at end of file
diff --git a/mcp-blarify-server/tests/test_query_builder.py b/mcp-blarify-server/tests/test_query_builder.py
new file mode 100644
index 00000000..7587a170
--- /dev/null
+++ b/mcp-blarify-server/tests/test_query_builder.py
@@ -0,0 +1,86 @@
+"""Tests for query builder."""
+
+import pytest
+from src.tools.query_builder import QueryBuilder
+
+
+class TestQueryBuilder:
+    """Test query builder functionality."""
+
+    def test_find_files_query(self):
+        """Test file finding query construction."""
+        qb = QueryBuilder()
+
+        # Single file
+        query = qb.find_files_query(["src/main.py"])
+        assert "MATCH (n:FILE)" in query
+        assert "n.path ENDS WITH 'src/main.py'" in query
+
+        # Multiple files
+        query = qb.find_files_query(["src/main.py", "tests/test_main.py"])
+        assert "n.path ENDS WITH 'src/main.py'" in query
+        assert "n.path ENDS WITH 'tests/test_main.py'" in query
+        assert " OR " in query
+
+    def test_get_file_context_query(self):
+        """Test file context query construction."""
+        qb = QueryBuilder()
+        query = qb.get_file_context_query("src/main.py", max_depth=2)
+
+        # Check main components
+        assert "MATCH (file:FILE)" in query
+        assert "file.path ENDS WITH $file_path" in query
+        assert "(file)-[:CONTAINS]->(content)" in query
+        assert "(file)-[:IMPORTS]->(imported)" in query
+        assert "(importer:FILE)-[:IMPORTS]->(file)" in query
+        assert "(doc:DOCUMENTATION_FILE)-[:DOCUMENTS]->(file)" in query
+        assert "RETURN file" in query
+
+    def test_find_symbol_query(self):
+        """Test symbol finding query construction."""
+        qb = QueryBuilder()
+
+        # Without type filter
+        query = qb.find_symbol_query("UserService")
+        assert "n.name = $symbol_name" in query
+        assert "toLower(n.name) = toLower($symbol_name)" in query
+        assert "toLower(n.name) CONTAINS toLower($symbol_name)" in query
+        assert "UNION" in query
+
+        # With type filter
+        query = qb.find_symbol_query("UserService", "class")
+        assert "AND (n:CLASS)" in query
+
+    def test_get_symbol_context_query(self):
+        """Test symbol context query construction."""
+        qb = QueryBuilder()
+        query = qb.get_symbol_context_query("123")
+
+        # Check relationships
+        assert "id(symbol) = $symbol_id" in query
+        assert "(file:FILE)-[:CONTAINS]->(symbol)" in query
+        assert "(symbol)-[:INHERITS_FROM]->(parent)" in query
+        assert "(symbol)-[:HAS_METHOD]->(method:FUNCTION)" in query
+        assert "(caller)-[:CALLS]->(symbol)" in query
+        assert "RETURN symbol" in query
+
+    def test_analyze_change_impact_query(self):
+        """Test change impact analysis query."""
+        qb = QueryBuilder()
+        query = qb.analyze_change_impact_query(["UserService", "AuthController"])
+
+        assert "target.name IN ['UserService', 'AuthController']" in query
+        assert "(target)<-[:CALLS|REFERENCES|IMPORTS|INHERITS_FROM|IMPLEMENTS]-(dependent)" in query
+        assert "(file:FILE)-[:CONTAINS]->(target)" in query
+        assert "(test:FILE)-[:TESTS]->(target)" in query
+        assert "test.path CONTAINS 'test'" in query
+
+    def test_find_related_patterns_query(self):
+        """Test pattern finding query."""
+        qb = QueryBuilder()
+        query = qb.find_related_patterns_query("Repository Pattern")
+
+        assert "MATCH (concept:CONCEPT)" in query
+        assert "concept.name CONTAINS $concept_name" in query
+        assert "(implementer)-[:IMPLEMENTS_CONCEPT]->(concept)" in query
+        assert "(doc:DOCUMENTATION_FILE)-[:CONTAINS_CONCEPT]->(concept)" in query
\ No newline at end of file
diff --git a/mcp-blarify-server/tests/test_server.py b/mcp-blarify-server/tests/test_server.py
new file mode 100644
index 00000000..1eb90476
--- /dev/null
+++ b/mcp-blarify-server/tests/test_server.py
@@ -0,0 +1,132 @@
+"""Tests for MCP server."""
+
+import pytest
+import pytest_asyncio
+from unittest.mock import Mock, patch, AsyncMock
+from src.server import BlarifyMCPServer, GetContextForFilesArgs, GetContextForSymbolArgs, BuildPlanForChangeArgs
+
+
+class TestBlarifyMCPServer:
+    """Test MCP server functionality."""
+
+    @pytest.fixture
+    def mock_neo4j_driver(self):
+        """Create a mock Neo4j driver."""
+        driver = Mock()
+        driver.verify_connectivity = Mock()
+        driver.close = Mock()
+        driver.session = Mock()
+        return driver
+
+    @pytest_asyncio.fixture
+    async def server(self, mock_neo4j_driver):
+        """Create a server instance with mocked dependencies."""
+        with patch('src.server.GraphDatabase') as mock_gdb:
+            mock_gdb.driver.return_value = mock_neo4j_driver
+            with patch.dict('os.environ', {
+                'NEO4J_URI': 'bolt://test:7687',
+                'NEO4J_USERNAME': 'test',
+                'NEO4J_PASSWORD': 'test'
+            }):
+                server = BlarifyMCPServer()
+                await server._connect_to_neo4j()
+                return server
+
+    @pytest.mark.asyncio
+    async def test_list_tools(self, server):
+        """Test listing available tools."""
+        tools = await server.server.list_tools()
+
+        assert len(tools) == 3
+        tool_names = [tool.name for tool in tools]
+        assert "getContextForFiles" in tool_names
+        assert "getContextForSymbol" in tool_names
+        assert "buildPlanForChange" in tool_names
+
+    @pytest.mark.asyncio
+    async def test_get_context_for_files_args(self):
+        """Test getContextForFiles argument validation."""
+        args = GetContextForFilesArgs(file_paths=["src/main.py", "tests/test_main.py"])
+        assert args.file_paths == ["src/main.py", "tests/test_main.py"]
+
+        # Test validation
+        with pytest.raises(ValueError):
+            GetContextForFilesArgs()
+
+    @pytest.mark.asyncio
+    async def test_get_context_for_symbol_args(self):
+        """Test getContextForSymbol argument validation."""
+        args = GetContextForSymbolArgs(symbol_name="UserService", symbol_type="class")
+        assert args.symbol_name == "UserService"
+        assert args.symbol_type == "class"
+
+        # Test without type
+        args2 = GetContextForSymbolArgs(symbol_name="process_data")
+        assert args2.symbol_name == "process_data"
+        assert args2.symbol_type is None
+
+    @pytest.mark.asyncio
+    async def test_build_plan_for_change_args(self):
+        """Test buildPlanForChange argument validation."""
+        args = BuildPlanForChangeArgs(change_request="Add email verification to registration")
+        assert args.change_request == "Add email verification to registration"
+
+    @pytest.mark.asyncio
+    async def test_call_tool_get_context_for_files(self, server):
+        """Test calling getContextForFiles tool."""
+        # Mock the context tools
+        server.context_tools.get_context_for_files = AsyncMock(
+            return_value="# Context for Files\n\nFile information..."
+        )
+
+        result = await server.server.call_tool(
+            "getContextForFiles",
+            {"file_paths": ["src/main.py"]}
+        )
+
+        assert len(result) == 1
+        assert result[0].type == "text"
+        assert "Context for Files" in result[0].text
+
+    @pytest.mark.asyncio
+    async def test_call_tool_error_handling(self, server):
+        """Test error handling in tool calls."""
+        # Mock to raise an error
+        server.context_tools.get_context_for_files = AsyncMock(
+            side_effect=Exception("Test error")
+        )
+
+        result = await server.server.call_tool(
+            "getContextForFiles",
+            {"file_paths": ["src/main.py"]}
+        )
+
+        assert len(result) == 1
+        assert "Error executing getContextForFiles" in result[0].text
+        assert "Test error" in result[0].text
+
+    @pytest.mark.asyncio
+    async def test_call_unknown_tool(self, server):
+        """Test calling an unknown tool."""
+        result = await server.server.call_tool(
+            "unknownTool",
+            {"arg": "value"}
+        )
+
+        assert len(result) == 1
+        assert "Unknown tool: unknownTool" in result[0].text
+
+    @pytest.mark.asyncio
+    async def test_neo4j_connection_error(self):
+        """Test handling Neo4j connection errors."""
+        with patch('src.server.GraphDatabase') as mock_gdb:
+            mock_driver = Mock()
+            mock_driver.verify_connectivity.side_effect = Exception("Connection failed")
+            mock_gdb.driver.return_value = mock_driver
+
+            server = BlarifyMCPServer()
+
+            with pytest.raises(Exception) as exc_info:
+                await server._connect_to_neo4j()
+
+            assert "Connection failed" in str(exc_info.value)
\ No newline at end of file
diff --git a/prompts/MCP.md b/prompts/MCP.md
new file mode 100644
index 00000000..b16b3700
--- /dev/null
+++ b/prompts/MCP.md
@@ -0,0 +1,250 @@
+We have forked a program called Blarify that uses
tree-sitter and Language Server Protocol servers to create a graph of a codebase's AST and its bindings to symbols. This is a powerful tool for understanding code structure and relationships. Analyze this codebase and remember its structure so that you can make plans about the new features we will add.
+
+## Problem Statement
+
+AI coding agents need sophisticated tools to understand and navigate complex codebases. While Blarify creates rich graph representations of code structure in Neo4j, there's currently no standardized way for AI agents to query and analyze this data. The Model Context Protocol (MCP) provides a standard for exposing tools to AI agents, but Blarify lacks an MCP server to bridge this gap.
+
+AI agents struggle with:
+1. Understanding the full context around specific code elements
+2. Tracking dependencies and relationships across large codebases
+3. Planning complex changes that affect multiple interconnected components
+4. Accessing structured knowledge about code architecture and design patterns
+
+## Feature Overview
+
+We will build an MCP server that sits in front of the Neo4j graph database, providing a set of MCP tools that AI coding agents can use to query the graph and retrieve information about the codebase. The server will expose sophisticated query capabilities and use LLM-powered analysis to provide coherent, contextual information.
+
+The MCP server will:
+1. Leverage existing Neo4j MCP servers as a foundation
+2. Provide custom tools for context retrieval and change planning
+3. Use an LLM to organize and structure query results into consumable Markdown
+4. Handle complex multi-hop graph traversals intelligently
+5. Support both Cypher queries and high-level semantic operations
+
+## Technical Foundation
+
+### Base MCP Servers
+We'll build upon:
+- **neo4j-contrib/mcp-neo4j-cypher**: Provides base Cypher query execution
+- **neo4j-contrib/gds-agent**: Offers graph data science capabilities
+
+### Custom MCP Tools
+
+#### 1. `getContextForFiles`
+**Purpose**: Retrieve comprehensive context for a list of files
+**Input**: Array of file paths
+**Process**:
+- Execute Cypher queries to find FILE nodes matching the paths
+- Traverse the graph to a configurable depth (default: 2-3 hops)
+- Collect related nodes: classes, functions, dependencies, imports, documentation
+- Include LLM summaries, filesystem relationships, and documentation links
+- Use an LLM to organize results into a coherent Markdown structure
+
+**Output Structure**:
+```markdown
+# Context for Files
+
+## File: src/services/auth.py
+### Overview
+[LLM summary if available]
+
+### Contains
+- **Classes**: AuthService, TokenValidator
+- **Functions**: authenticate(), validate_token()
+
+### Dependencies
+- Imports: jwt, datetime, User model
+- Imported by: api/routes/auth.py, tests/test_auth.py
+
+### Related Documentation
+- docs/authentication.md describes the authentication flow
+- README.md mentions AuthService configuration
+
+### Related Concepts
+- Implements: JWT Authentication pattern
+- Part of: Security Module
+```
+
+#### 2. `getContextForSymbol`
+**Purpose**: Retrieve context for a specific symbol (class, function, variable)
+**Input**: Symbol name and optional type hint
+**Process**:
+- Search for nodes matching the symbol name
+- Use fuzzy matching for variations
+- Traverse to find definitions, usages, and relationships
+- Include inheritance chains, implementations, and callers
+- Gather documentation references and LLM descriptions
+
+**Output Structure**:
+```markdown
+# Context for Symbol: UserService
+
+## Definition
+- **Type**: Class
+- **Location**: src/services/user_service.py:45
+- **Description**: [LLM summary]
+
+## Implementation Details
+### Methods
+- `create_user(data: dict) -> User`
+- `get_user(id: int) -> Optional[User]`
+- `update_user(id: int, data: dict) -> User`
+
+### Dependencies
+- Inherits from: BaseService
+- Uses: UserRepository, ValidationHelper
+- Database: users table
+
+### Usage
+- Used by: UserController (api/controllers/user.py)
+- Tests: tests/services/test_user_service.py
+- Examples: docs/examples/user_management.md
+
+### Related Patterns
+- Implements: Repository Pattern
+- Part of: Domain Model
+```
+
+#### 3. `buildPlanForChange`
+**Purpose**: Analyze the codebase and create an implementation plan for a change request
+**Input**: Change description/requirements
+**Process**:
+- Use an LLM to extract entities and concepts from the change request
+- Query the graph to find relevant existing code
+- Analyze dependencies and the impact radius
+- Identify files to modify, create, or delete
+- Consider test files and documentation updates
+- Order changes by dependency
+
+**Output Structure**:
+```markdown
+# Implementation Plan: Add Email Verification
+
+## Summary
+Add email verification to the user registration process.
+
+## Impact Analysis
+- **Affected Components**: UserService, AuthController, User model
+- **New Components Needed**: EmailVerificationService, email templates
+- **Database Changes**: Add `email_verified` column to users table
+
+## Implementation Steps
+
+### 1. Database Migration
+- **File**: migrations/add_email_verification.sql
+- **Action**: Create
+- **Description**: Add email_verified boolean and verification_token fields
+
+### 2. Update User Model
+- **File**: src/models/user.py
+- **Action**: Modify
+- **Changes**:
+  - Add email_verified property
+  - Add verification_token property
+  - Update validation logic
+
+### 3. Create Email Service
+- **File**: src/services/email_service.py
+- **Action**: Create
+- **Description**: Service for sending verification emails
+
+### 4. Update Registration Flow
+- **File**: src/services/user_service.py
+- **Action**: Modify
+- **Changes**:
+  - Generate verification token on registration
+  - Send verification email
+  - Add verify_email method
+
+### 5. Add Verification Endpoint
+- **File**: api/routes/auth.py
+- **Action**: Modify
+- **Changes**: Add /verify-email endpoint
+
+### 6. Update Tests
+- **Files**:
+  - tests/services/test_user_service.py
+  - tests/api/test_auth.py
+- **Action**: Modify
+- **Changes**: Add tests for the email verification flow
+
+### 7. Update Documentation
+- **File**: docs/authentication.md
+- **Action**: Modify
+- **Changes**: Document the email verification process
+
+## Dependencies to Consider
+- Email service configuration (SMTP settings)
+- Email template system
+- Token expiration handling
+- Rate limiting for email sends
+```
+
+## Implementation Requirements
+
+### MCP Server Structure
+```
+mcp-blarify-server/
+├── src/
+│   ├── server.py              # Main MCP server
+│   ├── tools/
+│   │   ├── context_tools.py   # getContextForFiles, getContextForSymbol
+│   │   ├── planning_tools.py  # buildPlanForChange
+│   │   └── query_builder.py   # Cypher query construction
+│   ├── processors/
+│   │   ├── graph_traversal.py # Graph traversal logic
+│   │   ├── context_builder.py # Context organization
+│   │   └── llm_processor.py   # LLM integration
+│   └── config.py
+├── tests/
+├── requirements.txt
+└── README.md
+```
+
+### Configuration
+- Neo4j connection settings (host, port, credentials)
+- Azure OpenAI configuration for LLM processing
+- Traversal depth limits
+- Context size limits
+- Cache settings for frequent queries
+
+### Neo4j Query Patterns
+Efficient Cypher queries for:
+1. Multi-hop traversals with relationship filtering
+2. Full-text search across node properties
+3. Pattern matching for code structures
+4. Aggregation of related nodes by type
+
+### LLM Integration
+- Use Azure OpenAI to:
+  1. Parse change requests into structured queries
+  2. Organize graph results into coherent narratives
+  3. Generate implementation step descriptions
+  4. Identify implicit dependencies
+
+### Error Handling
+- Graceful handling of missing nodes
+- Query timeout management
+- LLM fallback for failed processing
+- Clear error messages for AI agents
+
+## Testing Strategy
+
+1. **Unit Tests**: Test individual query builders and processors
+2. **Integration Tests**: Test with sample Neo4j graphs
+3. **End-to-End Tests**: Test MCP tool invocations
+4. **LLM Mock Tests**: Test with mocked LLM responses
+
+## Implementation Plan
+
+1. Set up the MCP server scaffold based on the neo4j-cypher server
+2. Implement basic Neo4j connection and query execution
+3. Build graph traversal and context extraction logic
+4. Integrate the LLM for result processing
+5. Implement the three custom tools
+6. Add comprehensive error handling
+7. Write tests for all components
+8. Create documentation and usage examples
+9. Package for deployment
+
+Once you have analyzed the codebase and this plan, create an issue in the remote repo (https://github.com/rysweet/cue) to describe the feature. Then create a new branch, implement the MCP server following test-driven development practices, and create a pull request when complete.
\ No newline at end of file
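As a concrete illustration of the non-LLM fallback for entity extraction described under `buildPlanForChange` (the behavior the `_extract_entities_basic` unit tests exercise), a minimal regex-based sketch might look like the following. The function name, the exact patterns, and the return shape are assumptions for illustration, not the repository's actual implementation:

```python
import re


def extract_entities_basic(change_request: str) -> list[str]:
    """Heuristically pull likely code entities from a change request
    when no LLM is available. Hypothetical sketch, not the real code."""
    entities = []
    # CamelCase identifiers with at least two "humps", e.g. UserService
    entities += re.findall(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b", change_request)
    # File names wrapped in double quotes or backticks, e.g. "user_service.py"
    entities += re.findall(r'["`]([\w./-]+\.\w+)["`]', change_request)
    # Deduplicate while preserving first-seen order
    return list(dict.fromkeys(entities))
```

A real implementation would likely also filter common English words and merge these heuristic hits with LLM-extracted entities whenever the LLM client is configured.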