Translate EPUB books using Large Language Models while preserving the original text. The translated content is displayed side-by-side with the original, creating bilingual books perfect for language learning and cross-reference reading.
- Bilingual Output: Preserves original text alongside translations for easy comparison
- LLM-Powered: Leverages large language models for high-quality, context-aware translations
- Format Preservation: Maintains EPUB structure, styles, images, and formatting
- Complete Translation: Translates chapter content, table of contents, and metadata
- Progress Tracking: Monitor translation progress with built-in callbacks
- Flexible LLM Support: Works with any OpenAI-compatible API endpoint
- Caching: Built-in caching for progress recovery when translation fails
```bash
pip install epub-translator
```

Requirements: Python 3.11, 3.12, or 3.13

The easiest way to use EPUB Translator is through OOMOL Studio, which provides a visual interface. Alternatively, use it directly as a Python library:
```python
from pathlib import Path
from epub_translator import LLM, translate, language

# Initialize LLM with your API credentials
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

# Translate EPUB file using language constants
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)
```

Track translation progress with the `on_progress` callback, for example by driving a tqdm progress bar:

```python
from tqdm import tqdm

with tqdm(total=100, desc="Translating", unit="%") as pbar:
    def on_progress(progress: float):
        # progress is a fraction in [0.0, 1.0];
        # pbar.n holds the percentage displayed so far.
        pbar.update(progress * 100 - pbar.n)

    translate(
        source_path=Path("source.epub"),
        target_path=Path("translated.epub"),
        target_language="English",
        llm=llm,
        on_progress=on_progress,
    )
```

Initialize the LLM client for translation:
```python
LLM(
    key: str,                              # API key
    url: str,                              # API endpoint URL
    model: str,                            # Model name (e.g., "gpt-4")
    token_encoding: str,                   # Token encoding (e.g., "o200k_base")
    cache_path: PathLike | None = None,    # Cache directory path
    timeout: float | None = None,          # Request timeout in seconds
    top_p: float | tuple[float, float] | None = None,
    temperature: float | tuple[float, float] | None = None,
    retry_times: int = 5,                  # Number of retries on failure
    retry_interval_seconds: float = 6.0,   # Interval between retries
    log_dir_path: PathLike | None = None,  # Log directory path
)
```

Translate an EPUB file:
```python
translate(
    source_path: PathLike | str,         # Source EPUB file path
    target_path: PathLike | str,         # Output EPUB file path
    target_language: str,                # Target language (e.g., "English", "Chinese")
    user_prompt: str | None = None,      # Custom translation instructions
    max_retries: int = 5,                # Maximum retries for failed translations
    max_group_tokens: int = 1200,        # Maximum tokens per translation group
    llm: LLM | None = None,              # Single LLM instance for both translation and filling
    translation_llm: LLM | None = None,  # LLM instance for translation (overrides llm)
    fill_llm: LLM | None = None,         # LLM instance for XML filling (overrides llm)
    on_progress: Callable[[float], None] | None = None,  # Progress callback (0.0-1.0)
    on_fill_failed: Callable[[FillFailedEvent], None] | None = None,  # Error callback
)
```

Note: Either `llm` or both `translation_llm` and `fill_llm` must be provided. Using separate LLMs allows for task-specific optimization.
EPUB Translator provides predefined language constants for convenience. You can use these constants instead of writing language names as strings:
```python
from epub_translator import language

# Usage example:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)

# You can also use custom language strings:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="Icelandic",  # For languages not in the constants
    llm=llm,
)
```

Monitor and handle translation errors using the `on_fill_failed` callback:
```python
from epub_translator import FillFailedEvent

def handle_fill_error(event: FillFailedEvent):
    print(f"Translation error (attempt {event.retried_count}):")
    print(f"  {event.error_message}")
    if event.over_maximum_retries:
        print("  Maximum retries exceeded!")

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
    on_fill_failed=handle_fill_error,
)
```

The `FillFailedEvent` contains:

- `error_message: str` - Description of the error
- `retried_count: int` - Current retry attempt number
- `over_maximum_retries: bool` - Whether the maximum number of retries has been exceeded
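Rather than printing failures as they happen, one pattern is to collect events for a post-run report. The sketch below defines a local stand-in class with the same fields as `FillFailedEvent` so it runs without the library installed; in real use, import `FillFailedEvent` from `epub_translator` and pass the collector as `on_fill_failed`:

```python
from dataclasses import dataclass, field

# Stand-in mirroring the documented FillFailedEvent fields (demonstration only;
# in real use, import FillFailedEvent from epub_translator).
@dataclass
class FillFailedEvent:
    error_message: str
    retried_count: int
    over_maximum_retries: bool

@dataclass
class FillFailureCollector:
    events: list[FillFailedEvent] = field(default_factory=list)

    def __call__(self, event: FillFailedEvent) -> None:
        # Record each failure instead of printing it immediately.
        self.events.append(event)

    def summary(self) -> str:
        fatal = sum(1 for e in self.events if e.over_maximum_retries)
        return f"{len(self.events)} fill failures ({fatal} exceeded max retries)"

collector = FillFailureCollector()
# In real use: translate(..., on_fill_failed=collector)
collector(FillFailedEvent("malformed XML chunk", retried_count=1, over_maximum_retries=False))
print(collector.summary())  # → 1 fill failures (0 exceeded max retries)
```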
Use separate LLM instances for translation and XML structure filling with different optimization parameters:
```python
# Create two LLM instances with different temperatures
translation_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.8,  # Higher temperature for creative translation
)

fill_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.3,  # Lower temperature for structure preservation
)

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)
```

With the OpenAI API:

```python
llm = LLM(
    key="sk-...",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)
```

With Azure OpenAI:

```python
llm = LLM(
    key="your-azure-key",
    url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    model="gpt-4",
    token_encoding="o200k_base",
)
```

Any service with an OpenAI-compatible API can be used:

```python
llm = LLM(
    key="your-api-key",
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",  # Match your model's encoding
)
```

- Language Learning: Read books in their original language with side-by-side translations
- Academic Research: Access foreign literature with bilingual references
- Content Localization: Prepare books for international audiences
- Cross-Cultural Reading: Enjoy literature while understanding cultural nuances
Provide specific translation instructions:
```python
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="English",
    llm=llm,
    user_prompt="Use formal language and preserve technical terminology",
)
```

Enable caching to resume translation progress after failures:

```python
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # Translations are cached here
)
```

PDF Craft converts PDF files into EPUB and other formats, with a focus on scanned books. Combine PDF Craft with EPUB Translator to convert and translate scanned PDF books into bilingual EPUB format.
Workflow: Scanned PDF → [PDF Craft] → EPUB → [EPUB Translator] → Bilingual EPUB
For a complete tutorial, watch: Convert scanned PDF books to EPUB format and translate them into bilingual books
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- OOMOL Studio: Open in OOMOL Studio

