Merged
41 changes: 41 additions & 0 deletions ROADMAP.md
@@ -0,0 +1,41 @@
# Xyzen Roadmap

This roadmap outlines the development stages of the Xyzen AI Laboratory Server. It serves as a high-level guide for tracking major feature implementation and system architecture evolution.

## Phase 1: Core Consolidation (Current)
Focus: Cleaning up the legacy structure, unifying models, and establishing best practices.

- [ ] **Unified Agent System**: Complete the migration to a single `Agent` model for both regular and graph-based agents.
- [ ] **Idiomatic FastAPI Refactor**: Implement Dependency Injection (DI) for resource fetching and authorization across all API handlers.
- [ ] **Frontend State Management**: Finalize the migration of all server-side state to TanStack Query and clean up Zustand slices.
- [ ] **Error Handling**: Implement a global exception handler and unified error code system across backend and frontend.

## Phase 2: Agent Intelligence & Workflows
Focus: Expanding the capabilities of the agent engine.

- [ ] **LangGraph Orchestration**: Full integration of LangGraph for complex, stateful multi-agent workflows.
- [ ] **Advanced MCP Integration**: Dynamic discovery and management of Model Context Protocol (MCP) servers.
- [ ] **Tool Confirmation UI**: A robust interface for users to inspect and approve agent tool calls before execution.
- [ ] **Streaming Optimization**: Enhanced WebSocket performance for real-time visualization of the agent's thought process.
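
The streaming item can be sketched as a typed event stream; the event names (`processing`, `token`, `done`) are assumptions standing in for the real `ChatEventType` values, and splitting the prompt stands in for LLM token chunks:

```python
# Hypothetical sketch of the event shape a streaming WebSocket layer
# could emit while an agent runs.
import asyncio
from typing import Any, AsyncGenerator


async def stream_agent_events(prompt: str) -> AsyncGenerator[dict[str, Any], None]:
    yield {"type": "processing", "data": {"status": "preparing_request"}}
    for token in prompt.split():  # stand-in for streamed LLM tokens
        yield {"type": "token", "data": {"text": token}}
    yield {"type": "done", "data": {}}


async def collect(prompt: str) -> list[dict[str, Any]]:
    # Drain the stream; a real client would forward each event over the socket.
    return [event async for event in stream_agent_events(prompt)]
```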

## Phase 3: Knowledge Base & RAG
Focus: Providing agents with memory and specialized knowledge.

- [ ] **Vector Database Support**: Integration with PostgreSQL (pgvector) or a dedicated vector DB for RAG capabilities.
- [ ] **File Processing Pipeline**: Automated ingestion and chunking of documents (PDF, Markdown, Code).
- [ ] **Knowledge Graphs**: Exploring graph-based retrieval to complement vector search.
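
A minimal sketch of the chunking step in the planned file pipeline: fixed-size character windows with overlap. Real ingestion would be format-aware (PDF, Markdown, code) and likely token-based rather than character-based:

```python
# Naive overlapping-window chunker; parameters are illustrative defaults.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # slide the window, keeping the overlap
    return chunks
```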

## Phase 4: Infrastructure & Scale
Focus: Making Xyzen production-ready.

- [ ] **Multi-Provider Support**: Seamless switching between OpenAI, Anthropic, Gemini, and local models (Ollama).
- [ ] **User Usage Tracking**: Monitoring token consumption and execution costs.
- [ ] **Deployment Templates**: Easy-to-use Docker Compose and Kubernetes configurations for various environments.
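
Multi-provider switching is often done with a small registry; in this sketch strings stand in where a real factory would return configured LLM clients (the provider names come from the roadmap, everything else is assumed):

```python
# Registry-based provider switching sketch.
from typing import Callable

_PROVIDERS: dict[str, Callable[[str], str]] = {}


def register_provider(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    def wrap(factory: Callable[[str], str]) -> Callable[[str], str]:
        _PROVIDERS[name] = factory
        return factory
    return wrap


@register_provider("openai")
def _openai(model: str) -> str:
    return f"openai:{model}"


@register_provider("ollama")
def _ollama(model: str) -> str:
    return f"ollama:{model}"


def get_client(provider: str, model: str) -> str:
    # Unknown providers fail loudly instead of silently falling back.
    try:
        return _PROVIDERS[provider](model)
    except KeyError:
        raise ValueError(f"Unknown provider: {provider}") from None
```

Adding Anthropic or Gemini support then means registering one more factory, with no changes to call sites.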

---

## Done ✅
- [x] **Project Foundation**: Initial FastAPI + SQLModel backend setup.
- [x] **Frontend Shell**: React + Tailwind + shadcn/ui dashboard layout.
- [x] **Basic Agent Chat**: Functional WebSocket-based chat with regular agents.
- [x] **Dockerized Environment**: Fully containerized development setup with PostgreSQL and MinIO.
30 changes: 30 additions & 0 deletions TODO.md
@@ -0,0 +1,30 @@
# Xyzen Task Tracker (TODO)

This file tracks tactical, short-term tasks and immediate technical debt. For high-level milestones, see `ROADMAP.md`.

## 🛠️ Immediate Priorities
- [ ] **Dependency Injection Refactor**: Move `auth_service` and `agent` fetching into FastAPI dependencies in `agents.py`.
- [ ] **Agent Repository Cleanup**: Remove legacy methods in `AgentRepository` that supported the old unified agent service (e.g., `get_agent_with_mcp_servers`).
- [ ] **Frontend Type Alignment**: Update `web/src/types/agents.ts` to match the simplified `AgentReadWithDetails` model from the backend.

## 🚀 Backend Tasks
- [ ] **Pydantic V2 Migration**: Verify that all SQLModel models and schemas use Pydantic V2 features idiomatically.
- [ ] **Logging Middleware**: Add request/response logging for better debugging in the Docker environment.
- [ ] **Auth Error Mapping**: Finish mapping `ErrCodeError` to appropriate FastAPI `HTTPException` responses in `middleware/auth`.
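
Since FastAPI apps are ASGI apps, the logging-middleware task can be prototyped without importing FastAPI at all. A rough sketch (the logger name and message format are placeholders):

```python
# Pure-ASGI request/response logging middleware sketch.
import asyncio
import logging

logger = logging.getLogger("xyzen.access")


class LoggingMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        status: dict[str, int] = {}

        async def send_wrapper(message):
            # Capture the status code as the response starts.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        await self.app(scope, receive, send_wrapper)
        logger.info("%s %s -> %s", scope["method"], scope["path"], status.get("code"))
```

Wrapping the app (`app = LoggingMiddleware(app)`) would then log every HTTP request with its resulting status code.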

## 🎨 Frontend Tasks
- [ ] **TanStack Query Refactor**: Move agent fetching from `agentSlice.ts` (Zustand) to a dedicated hook in `hooks/queries/useAgents.ts`.
- [ ] **AddAgentModal UI**: Allow users to select a specific LLM Provider during the creation of a regular agent.
- [ ] **Loading States**: Add skeleton loaders to the `AgentExplorer` sidebar.

## 🧪 Testing & Quality
- [ ] **Backend Unit Tests**: Add test cases for the newly unified `get_agent` endpoint.
- [ ] **Frontend Linting**: Fix existing `yarn lint` warnings in `web/src/components/layouts/ChatToolbar.tsx`.
- [ ] **API Documentation**: Update docstrings in `handler/api/v1/` to ensure Swagger UI is accurate.

## ✅ Completed Tasks
- [x] **Agent Unification**: Unified `get_agent` endpoint to return `AgentReadWithDetails` and removed `UnifiedAgentRead` dependencies.
- [x] **Default Agent Cloning**: Implemented logic in `SystemAgentManager` to clone system agents as user-owned default agents.
- [x] **Tag-based Identification**: Updated frontend (Chat, Agent List, Avatars) to identify default agents via tags (e.g., `default_chat`) rather than hardcoded UUIDs.
- [x] **Workshop Removal**: Completely removed the legacy "Workshop" feature from both backend and frontend to simplify the core agent experience.
- [x] **Policy Update**: Updated `AgentPolicy` to allow reading of system-scoped reference agents.
2 changes: 1 addition & 1 deletion service/app/main.py
@@ -26,7 +26,7 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:

await initialize_providers_on_startup()

# Initialize system agents (Chat and Workshop agents)
# Initialize system agents (Chat agent)
from core.system_agent import SystemAgentManager
from infra.database import AsyncSessionLocal

6 changes: 4 additions & 2 deletions service/core/auth/policies/agent_policy.py
@@ -3,7 +3,7 @@
from sqlmodel.ext.asyncio.session import AsyncSession

from common.code import ErrCode
from models.agent import Agent
from models.agent import Agent, AgentScope
from repos.agent import AgentRepository

from .resource_policy import ResourcePolicyBase
@@ -17,7 +17,9 @@ async def authorize_read(self, resource_id: UUID, user_id: str) -> Agent:
agent = await self.agent_repo.get_agent_by_id(resource_id)
if not agent:
raise ErrCode.AGENT_NOT_FOUND.with_messages(f"Agent {resource_id} not found")
if agent.user_id == user_id:

# System agents are readable by everyone, user agents only by owner
if agent.scope == AgentScope.SYSTEM or agent.user_id == user_id:
return agent

raise ErrCode.AGENT_ACCESS_DENIED.with_messages(f"User {user_id} cannot access agent {resource_id}")
17 changes: 5 additions & 12 deletions service/core/chat/langchain.py
@@ -140,8 +140,6 @@ async def _load_db_history(db: AsyncSession, topic: TopicModel) -> list[Any]:
for message in messages:
role = (message.role or "").lower()
content = message.content or ""
if not content:
continue
if role == "user":
# Check if message has file attachments
try:
@@ -153,10 +151,6 @@
# Multimodal message: combine text and file content
multimodal_content: list[dict[str, Any]] = [{"type": "text", "text": content}]
multimodal_content.extend(file_contents)
# Debug: Log the exact content being sent
logger.debug(
f"Multimodal content types for message {message.id}: {[item.get('type') for item in multimodal_content]}"
)
for idx, item in enumerate(file_contents):
item_type = item.get("type")
if item_type == "image_url":
@@ -183,6 +177,7 @@ async def _load_db_history(db: AsyncSession, topic: TopicModel) -> list[Any]:
try:
file_contents = await process_message_files(db, message.id)
if file_contents:
logger.debug("Successfully processed files for message")
# Multimodal assistant message
# Combine text content with file content
multimodal_content: list[dict[str, Any]] = []
@@ -234,6 +229,7 @@ async def _load_db_history(db: AsyncSession, topic: TopicModel) -> list[Any]:
else:
# Skip unknown/tool roles for now
continue
logger.info(f"Length of history: {len(history)}")
return history
except Exception as e:
logger.warning(f"Failed to load DB chat history for topic {getattr(topic, 'id', None)}: {e}")
@@ -311,7 +307,7 @@ async def get_ai_response_stream_langchain_legacy(
model_name = agent.model

# Get system prompt with MCP awareness
system_prompt = await build_system_prompt(db, agent)
system_prompt = await build_system_prompt(db, agent, model_name)

yield {"type": ChatEventType.PROCESSING, "data": {"status": ProcessingStatus.PREPARING_REQUEST}}

@@ -366,10 +362,7 @@ async def get_ai_response_stream_langchain_legacy(
else:
history_messages.append(system_msg)

async for chunk in langchain_agent.astream(
{"messages": history_messages},
stream_mode=["updates", "messages"],
):
async for chunk in langchain_agent.astream({"messages": history_messages}, stream_mode=["updates", "messages"]):
# chunk is a tuple: (stream_mode, data)
try:
mode, data = chunk
@@ -398,7 +391,7 @@ async def get_ai_response_stream_langchain_legacy(
continue

last_message = messages[-1]
logger.debug("Last message in step '%s': %r", step_name, last_message)
# logger.debug("Last message in step '%s': %r", step_name, last_message)

# Check if this is a tool call request (from LLM node)
if hasattr(last_message, "tool_calls") and last_message.tool_calls:
17 changes: 10 additions & 7 deletions service/core/chat/messages.py
@@ -35,7 +35,7 @@ async def agent_has_dynamic_mcp(db: AsyncSession, agent: Optional[Agent]) -> boo
return any(s.name == "DynamicMCPServer" or "dynamic_mcp_server" in (s.url or "").lower() for s in mcp_servers)


async def build_system_prompt(db: AsyncSession, agent: Optional[Agent]) -> str:
async def build_system_prompt(db: AsyncSession, agent: Optional[Agent], model_name: str | None) -> str:
"""
Build system prompt for the agent.

@@ -54,11 +54,14 @@ async def build_system_prompt(db: AsyncSession, agent: Optional[Agent]) -> str:
if agent and agent.prompt:
base_prompt = agent.prompt

formatting_instructions = """
Please format your output using Markdown.
When writing code, use triple backticks with the language identifier (e.g. ```python).
If you generate HTML that should be previewed, use ```html.
If you generate ECharts JSON options, use ```echart.
"""
if model_name and "image" in model_name:
formatting_instructions = ""
else:
formatting_instructions = """
Please format your output using Markdown.
When writing code, use triple backticks with the language identifier (e.g. ```python).
If you generate HTML that should be previewed, use ```html.
If you generate ECharts JSON options, use ```echart.
"""

return f"{base_prompt}\n{formatting_instructions}"