A sophisticated, multi-stage API for translating, analyzing, and standardizing Tibetan Buddhist texts. This project uses a modular, streaming-first architecture built with FastAPI and LangGraph to provide a flexible and powerful pipeline for high-quality, consistent translations.
- Modular Workflow: A decoupled three-stage pipeline for Translation, Glossary Extraction, and Standardization.
- Streaming First: Real-time events for all long-running operations, providing a transparent and interactive user experience.
- Intelligent Standardization: A powerful suite of tools to analyze and enforce terminological consistency across large datasets.
- Multi-Model Support: Dynamically route requests to models from Anthropic, OpenAI, and Google.
  - Anthropic (exact IDs only): `claude-3-7-sonnet-20250219`, `claude-3-5-sonnet-20241022`, `claude-sonnet-4-20250514`, `claude-3-5-haiku-20241022`, `claude-3-opus-20240229`
  - Google: `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.5-flash-thinking` (virtual alias; same underlying model with thinking enabled)
- Performance Optimized: In-memory caching for repeated requests and parallel, batched processing for glossary and standardization tasks.
- Comprehensive Documentation: Includes a full API reference, an architectural overview, and a guide to the project's evolution.
The API is designed as a set of independent services, giving the user full control over each step of the process.
```mermaid
graph TD;
    subgraph "User Interaction"
        A[Client]
    end
    subgraph "API Services"
        B[1. Translation Service]
        C[2. Glossary Service]
        D[3. Standardization Service]
    end
    subgraph "AI Backend"
        E[Language Models]
    end
    A -- "Translate Texts" --> B;
    A -- "Extract Glossary" --> C;
    A -- "Analyze & Apply Standards" --> D;
    B -- Calls --> E;
    C -- Calls --> E;
    D -- Calls --> E;
```
For a deep dive into the architecture, the key technical decisions, and the project's evolution, please see our Comprehensive Documentation.
- Python 3.10+
- An API key for at least one supported model provider (e.g., Anthropic).
- Clone the repository:

  ```bash
  git clone https://github.com/OpenPecha/langraph-api.git
  cd langraph-api
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure API Keys & Env: Create a `.env` file in the project root and add your API keys.

  ```env
  # .env
  ANTHROPIC_API_KEY="your-anthropic-key"
  OPENAI_API_KEY="your-openai-key"
  GEMINI_API_KEY="your-google-key"
  # Optional: Dharmamitra upstream password used by proxy endpoints if request omits it
  DHARMAMITRA_PASSWORD="your-dharmamitra-password"
  ```
Start the application using Uvicorn.
```bash
uvicorn src.translation_api.api:app --reload --port 8001
```

- The Web UI will be available at http://localhost:8001/.
- The interactive Swagger Docs will be at http://localhost:8001/docs.
Here is how to use the full pipeline from the command line.
Step 1: Get a Translation
```bash
curl -X POST http://localhost:8001/translate \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["om mani padme hum"],
    "target_language": "english"
  }' > translation_output.json
```

Step 2: Extract the Glossary

```bash
curl -X POST http://localhost:8001/glossary/extract \
  -H "Content-Type: application/json" \
  -d '{
    "items": '"$(jq '.results' translation_output.json)"'
  }' > glossary_output.json
```

(Requires `jq` to be installed.)
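The same two-step pipeline can also be scripted in Python. This is a minimal sketch using the `requests` library; it assumes the server is running locally on port 8001 and that the `/translate` response exposes the translated items under a `results` key, as in the `jq` example above.

```python
import requests

BASE_URL = "http://localhost:8001"

# Step 1: translate the source texts.
translation_resp = requests.post(
    f"{BASE_URL}/translate",
    json={"texts": ["om mani padme hum"], "target_language": "english"},
    timeout=120,
)
translation_resp.raise_for_status()
results = translation_resp.json()["results"]

# Step 2: feed the translation results into glossary extraction.
glossary_resp = requests.post(
    f"{BASE_URL}/glossary/extract",
    json={"items": results},
    timeout=120,
)
glossary_resp.raise_for_status()
print(glossary_resp.json())
```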
Production endpoint to run the final translation with optional inputs and an optional custom prompt.
- POST `/workflow/run`
Request body:
```json
{
  "combo_key": "source+ucca+gloss",
  "input": {
    "source": "...",
    "ucca": "...",
    "gloss": "...",
    "commentaries": ["...", "..."],
    "sanskrit": "...",
    "target_language": "english",
    "model": "claude-3-7-sonnet-20250219"
  },
  "model_name": "claude-3-7-sonnet-20250219",
  "model_params": {},
  "custom_prompt": "Translate {source} into {sanskrit} style..."
}
```

Notes:
- `combo_key` is order-independent and must always include `source`. Other tokens (`ucca`, `gloss`, `sanskrit`, `commentaries`) are inferred for validation.
- `custom_prompt` is optional but must include `{source}`. Optional placeholders: `{ucca}`, `{gloss}`, `{commentary1}`, `{commentary2}`, `{commentary3}`, `{sanskrit}`.
- Returns minimal JSON:
{ "combo_key": "gloss+source+ucca", "translation": "..." }- POST
/workflow/run/batchaccepts the samecombo_keyand an array ofitems(eachWorkflowInput), returning an array of{ index, translation | error }.
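As a sketch of how a client might call the workflow endpoint (field names are taken from the request body above; anything beyond the documented `combo_key` and `translation` keys in the response is an assumption):

```python
import requests

# Hypothetical input values; replace with real Tibetan source, UCCA, and gloss data.
payload = {
    "combo_key": "source+ucca+gloss",
    "input": {
        "source": "...",
        "ucca": "...",
        "gloss": "...",
        "target_language": "english",
    },
    "model_name": "claude-3-7-sonnet-20250219",
    "model_params": {},
}

resp = requests.post("http://localhost:8001/workflow/run", json=payload, timeout=300)
resp.raise_for_status()
# Minimal response: {"combo_key": "...", "translation": "..."}
print(resp.json()["translation"])
```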
These proxy endpoints allow quick testing against Dharmamitra’s upstream APIs from the same UI/server.
- Streaming KNN Mitra (SSE)
  - POST `/dharmamitra/knn-translate-mitra`
  - Body: `{ "query": "...", "language": "english", "password": "..." }`
  - Notes: `language` is lowercased; `do_grammar` is forced to `false`. If `password` is omitted, the server uses `DHARMAMITRA_PASSWORD` from `.env`.
- Gemini (non-stream)
  - POST `/dharmamitra/knn-translate-gemini-no-stream`
  - Body: `{ "query": "...", "language": "english", "password": "..." }`
  - Notes: `do_grammar` and `use_pro_model` are forced to `false`; `language` is lowercased. Falls back to `DHARMAMITRA_PASSWORD` if not provided.
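A minimal Python sketch of calling both proxy endpoints. The request fields are those documented above; the exact shape of the streamed SSE frames and of the Gemini response body are assumptions, so the sketch simply prints the raw lines and text.

```python
import requests

BASE_URL = "http://localhost:8001"
# "password" is omitted here, so the server falls back to DHARMAMITRA_PASSWORD from .env.
body = {"query": "om mani padme hum", "language": "english"}

# Streaming KNN Mitra (SSE): read the event stream line by line.
with requests.post(
    f"{BASE_URL}/dharmamitra/knn-translate-mitra", json=body, stream=True, timeout=300
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:  # SSE frames are separated by blank lines
            print(line)

# Gemini (non-stream): a single response returned in one piece.
resp = requests.post(
    f"{BASE_URL}/dharmamitra/knn-translate-gemini-no-stream", json=body, timeout=300
)
resp.raise_for_status()
print(resp.text)
```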
We have created an extensive set of documents covering every aspect of this project.
- Architectural Overview
- Project Evolution & Key Decisions
- Full API Reference
- Frontend UI Guide
- Setup & Deployment Guide
Please refer to these documents for a deep understanding of the system's design and capabilities.