
Architecture

This document provides a comprehensive overview of the LangGraph Translation API architecture, including the system design, LangGraph workflow, and data flow.


🏗️ High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                              Client Layer                                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │   Web UI    │  │    cURL     │  │   Python    │  │   JavaScript/TS     │ │
│  │  (Browser)  │  │   Client    │  │   Client    │  │      Client         │ │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘ │
└─────────┼────────────────┼────────────────┼────────────────────┼────────────┘
          │                │                │                    │
          └────────────────┴────────────────┴────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           FastAPI Application                                │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                         API Endpoints                                  │ │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │ │
│  │  │Translate │ │ Glossary │ │   UCCA   │ │  Gloss   │ │   Workflow   │ │ │
│  │  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │ │
│  └───────┼────────────┼────────────┼────────────┼──────────────┼─────────┘ │
│          │            │            │            │              │           │
│          └────────────┴────────────┴────────────┴──────────────┘           │
│                                    │                                        │
│                          ┌─────────▼─────────┐                             │
│                          │   Model Router    │                             │
│                          │(Provider Agnostic)│                             │
│                          └─────────┬─────────┘                             │
│                                    │                                        │
│          ┌─────────────────────────┼─────────────────────────┐             │
│          ▼                         ▼                         ▼             │
│   ┌────────────┐           ┌────────────┐           ┌────────────┐        │
│   │ Anthropic  │           │   Google   │           │  OpenAI    │        │
│   │   Claude   │           │   Gemini   │           │   GPT-4    │        │
│   └────────────┘           └────────────┘           └────────────┘        │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐ │
│  │                          Support Layer                                 │ │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────────────────┐ │ │
│  │  │  Cache   │  │ Prompts  │  │  Config  │  │   Streaming (SSE)      │ │ │
│  │  └──────────┘  └──────────┘  └──────────┘  └────────────────────────┘ │ │
│  └────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘

🔄 Three-Stage Decoupled Pipeline

The API is designed as a set of independent services, giving users full control over each step of the translation process.

┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│   1. TRANSLATION  │ ──▶ │   2. GLOSSARY     │ ──▶ │ 3. STANDARDIZATION│
│      SERVICE      │     │      SERVICE      │     │      SERVICE      │
│                   │     │                   │     │                   │
│  • Batch/Single   │     │  • Extract terms  │     │  • Analyze        │
│  • Multi-model    │     │  • Post-translate │     │  • Apply rules    │
│  • Streaming      │     │  • Streaming      │     │  • Re-translate   │
│                   │     │                   │     │                   │
└───────────────────┘     └───────────────────┘     └───────────────────┘

        Services can be called independently or chained together; see the chaining sketch after Stage 3.

Stage 1: Translation Service

Purpose: Translate Tibetan Buddhist texts with specialized prompts.

| Endpoint | Description |
| --- | --- |
| `POST /translate` | Batch translation (sync) |
| `POST /translate/single` | Single text translation |
| `POST /translate/stream` | Batch translation with SSE |
| `POST /translate/single/stream` | Single text with SSE |
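
For example, a minimal client call to the batch endpoint might look like the sketch below. The payload field names (`texts`, `target_language`, `model_name`) are illustrative assumptions, not the verified schema:

```python
import requests

# Illustrative call to the batch translation endpoint.
# Field names below are assumptions; consult the API reference for the exact schema.
response = requests.post(
    "http://localhost:8000/translate",
    json={
        "texts": ["བཀྲ་ཤིས་བདེ་ལེགས།"],
        "target_language": "English",
        "model_name": "claude-sonnet-4-20250514",
    },
    timeout=300,
)
response.raise_for_status()
print(response.json())
```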

Stage 2: Glossary Service

Purpose: Extract terminology glossaries from translation pairs.

| Endpoint | Description |
| --- | --- |
| `POST /glossary/extract` | Extract glossary (sync) |
| `POST /glossary/extract/stream` | Extract with SSE progress |

Stage 3: Standardization Service

Purpose: Analyze and enforce terminological consistency.

| Endpoint | Description |
| --- | --- |
| `POST /standardize/analyze` | Find inconsistent terms |
| `POST /standardize/apply` | Apply standardization rules |
| `POST /standardize/apply/stream` | Apply with SSE progress |
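
Because the stages are decoupled, a client can chain them explicitly. The sketch below wires all three services together; every payload and response key here is an assumption for illustration, not the verified schema:

```python
import requests

BASE = "http://localhost:8000"
texts = ["བཀྲ་ཤིས་བདེ་ལེགས།"]

# 1. Translate (hypothetical payload/response keys).
translation = requests.post(
    f"{BASE}/translate",
    json={"texts": texts, "target_language": "English"},
).json()

# 2. Extract a glossary from the source/translation pairs.
glossary = requests.post(
    f"{BASE}/glossary/extract",
    json={"source_texts": texts, "translations": translation.get("results", [])},
).json()

# 3. Find inconsistent terms, then apply standardization rules.
analysis = requests.post(
    f"{BASE}/standardize/analyze",
    json={"glossary": glossary},
).json()
standardized = requests.post(
    f"{BASE}/standardize/apply",
    json={"translations": translation, "rules": analysis.get("rules", [])},
).json()
print(standardized)
```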

🔀 LangGraph Workflow

The translation pipeline is orchestrated using LangGraph, a state machine framework for building complex AI workflows.

Workflow State Machine

                    ┌─────────────────────────────────────────┐
                    │           WORKFLOW START                │
                    └───────────────────┬─────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────┐
                    │            INITIALIZE                    │
                    │  • Create batches from input texts       │
                    │  • Set up workflow state                 │
                    │  • Initialize metadata                   │
                    └───────────────────┬─────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────┐
                    │          PROCESS_BATCH                   │
                    │  • Get model from router                 │
                    │  • Check cache for translations          │
                    │  • Call LLM if not cached                │
                    │  • Clean and store results               │
                    └───────────────────┬─────────────────────┘
                                        │
                                        ▼
                    ┌─────────────────────────────────────────┐
                    │         CHECK_COMPLETION                 │
                    │  • More batches remaining?               │
                    └───────────┬───────────────┬─────────────┘
                                │               │
                      ┌─────────┘               └─────────┐
                      │ YES                           NO  │
                      ▼                                   ▼
        ┌─────────────────────┐         ┌─────────────────────────────────┐
        │  Loop back to       │         │  EXTRACT_GLOSSARIES_IN_PARALLEL │
        │  PROCESS_BATCH      │         │  • Batch glossary extraction    │
        └─────────────────────┘         │  • Parallel LLM calls           │
                                        └───────────────┬─────────────────┘
                                                        │
                                                        ▼
                                        ┌─────────────────────────────────┐
                                        │            FINALIZE              │
                                        │  • Calculate statistics          │
                                        │  • Set completion status         │
                                        │  • Prepare final output          │
                                        └───────────────┬─────────────────┘
                                                        │
                                                        ▼
                                        ┌─────────────────────────────────┐
                                        │           WORKFLOW END           │
                                        └─────────────────────────────────┘

Workflow Nodes

| Node | Function | Description |
| --- | --- | --- |
| `initialize` | `initialize_workflow()` | Sets up state, creates batches |
| `process_batch` | `process_batch()` | Translates one batch of texts |
| `check_completion` | `check_completion()` | Routing decision |
| `extract_glossaries_in_parallel` | `extract_glossaries_in_parallel()` | Parallel glossary extraction |
| `finalize` | `finalize_workflow()` | Calculates stats, completes workflow |

Workflow State

from typing import Any, Dict, List, TypedDict

class TranslationWorkflowState(TypedDict):
    original_request: TranslationRequest    # Input request
    batches: List[TranslationBatch]         # Created batches
    current_batch_index: int                # Processing position
    batch_results: List[BatchResult]        # Results per batch
    final_results: List[TranslationResult]  # All translations
    total_texts: int                        # Total input count
    processed_texts: int                    # Completed count
    workflow_start_time: float              # Timing
    workflow_status: str                    # Current status
    errors: List[Dict[str, Any]]            # Error log
    retry_count: int                        # Retry tracking
    model_name: str                         # Selected model
    model_params: Dict[str, Any]            # Model config
    custom_steps: Dict[str, Any]            # Extension point
    metadata: Dict[str, Any]                # Processing metadata

🤖 Model Router Architecture

The Model Router provides a unified interface to multiple LLM providers.

┌─────────────────────────────────────────────────────────────────┐
│                        Model Router                              │
│                                                                  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                    get_model(name)                        │  │
│  │                                                           │  │
│  │  1. Check cache for existing instance                     │  │
│  │  2. Identify provider from model name                     │  │
│  │  3. Validate API key availability                         │  │
│  │  4. Create and configure model instance                   │  │
│  │  5. Cache and return                                      │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              │                                   │
│       ┌──────────────────────┼──────────────────────┐           │
│       ▼                      ▼                      ▼           │
│  ┌──────────┐          ┌──────────┐          ┌──────────┐      │
│  │Anthropic │          │  Google  │          │  OpenAI  │      │
│  │          │          │          │          │          │      │
│  │ claude-  │          │ gemini-  │          │ gpt-4    │      │
│  │ sonnet-4 │          │ 2.5-pro  │          │ gpt-4-   │      │
│  │ claude-  │          │ gemini-  │          │ turbo    │      │
│  │ opus     │          │ 2.5-flash│          │          │      │
│  └──────────┘          └──────────┘          └──────────┘      │
│                                                                  │
│  ┌──────────┐                                                   │
│  │Dharmamitra│ (Translation-only wrapper)                       │
│  └──────────┘                                                   │
└─────────────────────────────────────────────────────────────────┘

Supported Models

| Provider | Model ID | Description |
| --- | --- | --- |
| Anthropic | `claude-sonnet-4-20250514` | Claude Sonnet 4.0 |
| Anthropic | `claude-sonnet-4-5-20250929` | Claude Sonnet 4.5 |
| Anthropic | `claude-haiku-4-5-20251001` | Claude Haiku 4.5 |
| Anthropic | `claude-3-5-haiku-20241022` | Claude 3.5 Haiku |
| Anthropic | `claude-3-opus-20240229` | Claude 3 Opus |
| Google | `gemini-2.5-pro` | Gemini 2.5 Pro (thinking enabled) |
| Google | `gemini-2.5-flash` | Gemini 2.5 Flash (fast, no thinking) |
| Google | `gemini-2.5-flash-thinking` | Gemini 2.5 Flash with thinking |
| OpenAI | `gpt-4` | GPT-4 |
| OpenAI | `gpt-4-turbo` | GPT-4 Turbo |
| Dharmamitra | `dharamitra` | Specialized Buddhist translation |
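
The dispatch inside `get_model()` can be pictured roughly as below, using the standard LangChain chat-model classes. This is a sketch of the five steps listed above, not the actual `model_router.py`:

```python
import os
from functools import lru_cache

from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI

def _require_key(var: str) -> None:
    # Step 3: validate API key availability before creating a client.
    if not os.environ.get(var):
        raise RuntimeError(f"{var} is not set")

@lru_cache(maxsize=None)  # Steps 1 and 5: cache one instance per model name
def get_model(name: str):
    # Step 2: identify the provider from the model name prefix.
    if name.startswith("claude"):
        _require_key("ANTHROPIC_API_KEY")
        return ChatAnthropic(model=name)  # Step 4: create and configure
    if name.startswith("gemini"):
        _require_key("GEMINI_API_KEY")
        return ChatGoogleGenerativeAI(model=name)
    if name.startswith("gpt"):
        _require_key("OPENAI_API_KEY")
        return ChatOpenAI(model=name)
    raise ValueError(f"Unsupported model: {name}")
```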

💾 Caching System

The API caches both translations and extracted glossaries to optimize performance:

┌─────────────────────────────────────────────────────────────────┐
│                        Cache Layer                               │
│                                                                  │
│  ┌───────────────────────┐    ┌───────────────────────┐        │
│  │  Translation Cache    │    │   Glossary Cache      │        │
│  │                       │    │                       │        │
│  │  Key: hash of         │    │  Key: hash of         │        │
│  │  • source text        │    │  • source text        │        │
│  │  • target language    │    │  • translated text    │        │
│  │  • text type          │    │  • model name         │        │
│  │  • model name         │    │                       │        │
│  │  • user rules         │    │                       │        │
│  │                       │    │                       │        │
│  │  Value: translated    │    │  Value: glossary      │        │
│  │         text          │    │         terms         │        │
│  └───────────────────────┘    └───────────────────────┘        │
│                                                                  │
│  Benefits:                                                       │
│  • Avoid redundant LLM calls                                    │
│  • Faster repeated requests                                     │
│  • Reduced API costs                                            │
└─────────────────────────────────────────────────────────────────┘
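
A translation cache key, for instance, can be derived by hashing the listed fields together. This is an illustrative sketch; the actual `cache.py` may serialize and hash differently:

```python
import hashlib
import json

def translation_cache_key(source_text: str, target_language: str,
                          text_type: str, model_name: str,
                          user_rules: list) -> str:
    # Serialize all key fields deterministically, then hash.
    payload = json.dumps(
        [source_text, target_language, text_type, model_name, user_rules],
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```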

📡 Streaming Architecture

All long-running operations support Server-Sent Events (SSE):

┌──────────┐                    ┌──────────────┐                ┌─────────────┐
│  Client  │  ── HTTP POST ──▶  │   FastAPI    │  ── LLM ──▶   │   Model     │
│          │                    │   Endpoint   │                │   Provider  │
│          │                    │              │                │             │
│          │  ◀── SSE Stream ── │  EventSource │  ◀── Result ── │             │
│          │                    │  Response    │                │             │
└──────────┘                    └──────────────┘                └─────────────┘

Event Types:
  • batch_completed      - Translation batch finished
  • glossary_batch_completed - Glossary batch extracted
  • retranslation_completed - Standardization applied
  • ucca_item_completed  - UCCA graph generated
  • gloss_item_completed - Gloss analysis done
  • comment_delta        - Streaming token
  • completion           - Final result
  • error                - Error occurred
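
On the client side, any SSE-capable HTTP client can consume the stream. Below is a sketch using httpx, with an illustrative payload and minimal event parsing:

```python
import json
import httpx

# Stream a batch translation and react to the event types listed above.
with httpx.stream(
    "POST",
    "http://localhost:8000/translate/stream",
    json={"texts": ["..."], "target_language": "English"},  # illustrative payload
    timeout=None,
) as response:
    event = None
    for line in response.iter_lines():
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:"):
            data = json.loads(line.split(":", 1)[1])
            if event == "batch_completed":
                print("batch finished:", data)
            elif event == "completion":
                print("final result:", data)
            elif event == "error":
                print("error:", data)
```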

📁 Project Structure

langraph-api/
├── src/
│   └── translation_api/
│       ├── api.py              # FastAPI application & endpoints
│       ├── config.py           # Environment configuration
│       ├── cache.py            # Caching system
│       ├── models/
│       │   ├── model_router.py # Multi-provider LLM routing
│       │   ├── glossary.py     # Glossary data models
│       │   ├── standardization.py
│       │   ├── ucca.py         # UCCA models
│       │   ├── gloss.py        # Gloss models
│       │   ├── workflow.py     # Workflow models
│       │   └── comment.py      # Editor comment models
│       ├── workflows/
│       │   ├── translation_state.py  # LangGraph state
│       │   ├── streaming.py          # SSE helpers
│       │   ├── ucca.py               # UCCA generation
│       │   └── gloss.py              # Gloss generation
│       ├── prompts/
│       │   └── tibetan_buddhist.py   # Prompt templates
│       └── utils/
│           └── helpers.py            # Utility functions
├── graph.py                    # LangGraph workflow definition
├── main.py                     # Entry point
├── static/                     # Web UI files
│   ├── index.html
│   └── editor.html
└── tests/                      # Test suite

🔐 Configuration

Environment variables are managed via a `.env` file:

# LLM Provider API Keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...

# Dharmamitra Integration
DHARMAMITRA_PASSWORD=...
DHARMAMITRA_TOKEN=...

# LangSmith Tracing (optional)
LANGSMITH_API_KEY=...
LANGSMITH_PROJECT=Translation
LANGSMITH_TRACING=true

# Server Configuration
API_HOST=0.0.0.0
API_PORT=8000
DEFAULT_MODEL=claude
MAX_BATCH_SIZE=50
DEFAULT_BATCH_SIZE=5
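
At startup these values can be loaded with python-dotenv, for example. This is a sketch; the real `config.py` may use a different mechanism:

```python
import os
from dotenv import load_dotenv

load_dotenv()  # read .env into the process environment

API_HOST = os.getenv("API_HOST", "0.0.0.0")
API_PORT = int(os.getenv("API_PORT", "8000"))
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "claude")
MAX_BATCH_SIZE = int(os.getenv("MAX_BATCH_SIZE", "50"))
DEFAULT_BATCH_SIZE = int(os.getenv("DEFAULT_BATCH_SIZE", "5"))
```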
