EPUB Translator

Open in OOMOL Studio

English | 中文

Translate EPUB books using Large Language Models while preserving the original text. The translated content is displayed side-by-side with the original, creating bilingual books perfect for language learning and cross-reference reading.

Translation Effect

Features

  • Bilingual Output: Preserves original text alongside translations for easy comparison
  • LLM-Powered: Leverages large language models for high-quality, context-aware translations
  • Format Preservation: Maintains EPUB structure, styles, images, and formatting
  • Complete Translation: Translates chapter content, table of contents, and metadata
  • Progress Tracking: Monitor translation progress with built-in callbacks
  • Flexible LLM Support: Works with any OpenAI-compatible API endpoint
  • Caching: Built-in result caching so progress can be recovered when a translation run fails

Installation

pip install epub-translator

Requirements: Python 3.11, 3.12, or 3.13
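
After installation, a quick import check confirms that the package exposes the public API used throughout this README:

# Smoke test: these are the names the examples below rely on
from epub_translator import LLM, translate, language, FillFailedEvent

print("epub-translator is ready")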

Quick Start

Using OOMOL Studio (Recommended)

The easiest way to use EPUB Translator is through OOMOL Studio, which provides a visual interface:

Watch the Tutorial

Using Python API

from pathlib import Path
from epub_translator import LLM, translate, language

# Initialize LLM with your API credentials
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

# Translate EPUB file using language constants
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)

With Progress Tracking

from tqdm import tqdm

with tqdm(total=100, desc="Translating", unit="%") as pbar:

    def on_progress(progress: float):
        # progress is reported as a fraction from 0.0 to 1.0;
        # pbar.n already holds the percentage displayed so far
        pbar.update(progress * 100 - pbar.n)

    translate(
        source_path=Path("source.epub"),
        target_path=Path("translated.epub"),
        target_language="English",
        llm=llm,
        on_progress=on_progress,
    )

API Reference

LLM Class

Initialize the LLM client for translation:

LLM(
    key: str,                          # API key
    url: str,                          # API endpoint URL
    model: str,                        # Model name (e.g., "gpt-4")
    token_encoding: str,               # Token encoding (e.g., "o200k_base")
    cache_path: PathLike | None = None,           # Cache directory path
    timeout: float | None = None,                  # Request timeout in seconds
    top_p: float | tuple[float, float] | None = None,        # Nucleus sampling parameter
    temperature: float | tuple[float, float] | None = None,  # Sampling temperature
    retry_times: int = 5,                         # Number of retries on failure
    retry_interval_seconds: float = 6.0,          # Interval between retries
    log_dir_path: PathLike | None = None,         # Log directory path
)
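
For example, a client with a local cache, an explicit timeout, and a gentler retry policy could be configured like this (the values are illustrative, not recommendations):

llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # reuse cached results across runs
    timeout=120.0,                     # give up on a request after two minutes
    retry_times=3,
    retry_interval_seconds=10.0,
)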

translate Function

Translate an EPUB file:

translate(
    source_path: PathLike | str,       # Source EPUB file path
    target_path: PathLike | str,       # Output EPUB file path
    target_language: str,              # Target language (e.g., "English", "Chinese")
    user_prompt: str | None = None,    # Custom translation instructions
    max_retries: int = 5,              # Maximum retries for failed translations
    max_group_tokens: int = 1200,      # Maximum tokens per translation group
    llm: LLM | None = None,            # Single LLM instance for both translation and filling
    translation_llm: LLM | None = None,  # LLM instance for translation (overrides llm)
    fill_llm: LLM | None = None,       # LLM instance for XML filling (overrides llm)
    on_progress: Callable[[float], None] | None = None,  # Progress callback (0.0-1.0)
    on_fill_failed: Callable[[FillFailedEvent], None] | None = None,  # Error callback
)

Note: Either llm or both translation_llm and fill_llm must be provided. Using separate LLMs allows for task-specific optimization.

Language Constants

EPUB Translator provides predefined language constants for convenience. You can use these constants instead of writing language names as strings:

from epub_translator import language

# Usage example:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)

# You can also use custom language strings:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="Icelandic",  # For languages not in the constants
    llm=llm,
)
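
To see which constants your installed version provides, you can inspect the language module directly (a quick sketch that assumes the constants are plain upper-case module attributes):

from epub_translator import language

# List every predefined language constant
for name in dir(language):
    if name.isupper():
        print(name, "=", getattr(language, name))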

Error Handling with on_fill_failed

Monitor and handle translation errors using the on_fill_failed callback:

from epub_translator import FillFailedEvent

def handle_fill_error(event: FillFailedEvent):
    print(f"Translation error (attempt {event.retried_count}):")
    print(f"  {event.error_message}")
    if event.over_maximum_retries:
        print("  Maximum retries exceeded!")

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
    on_fill_failed=handle_fill_error,
)

The FillFailedEvent contains:

  • error_message: str - Description of the error
  • retried_count: int - Current retry attempt number
  • over_maximum_retries: bool - Whether the maximum number of retries has been exceeded
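
A common pattern is to collect failed fills during the run and report them afterwards; the sketch below builds on the callback above and uses only the fields listed here:

failures: list[FillFailedEvent] = []

def collect_fill_errors(event: FillFailedEvent):
    failures.append(event)
    if event.over_maximum_retries:
        print(f"Giving up after {event.retried_count} attempts: {event.error_message}")

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
    on_fill_failed=collect_fill_errors,
)

print(f"{len(failures)} fill error(s) reported during translation")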

Dual-LLM Architecture

Use separate LLM instances for translation and for XML structure filling, each tuned with its own parameters:

# Create two LLM instances with different temperatures
translation_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.8,  # Higher temperature for creative translation
)

fill_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.3,  # Lower temperature for structure preservation
)

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)

Configuration Examples

OpenAI

llm = LLM(
    key="sk-...",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

Azure OpenAI

llm = LLM(
    key="your-azure-key",
    url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    model="gpt-4",
    token_encoding="o200k_base",
)

Other OpenAI-Compatible Services

Any service with an OpenAI-compatible API can be used:

llm = LLM(
    key="your-api-key",
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",  # Match your model's encoding
)
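
To keep credentials out of source files, the API key can be read from an environment variable (the variable name below is just an example):

import os

llm = LLM(
    key=os.environ["EPUB_TRANSLATOR_API_KEY"],  # hypothetical variable name
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",
)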

Use Cases

  • Language Learning: Read books in their original language with side-by-side translations
  • Academic Research: Access foreign literature with bilingual references
  • Content Localization: Prepare books for international audiences
  • Cross-Cultural Reading: Enjoy literature while understanding cultural nuances

Advanced Features

Custom Translation Prompts

Provide specific translation instructions:

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="English",
    llm=llm,
    user_prompt="Use formal language and preserve technical terminology",
)

Caching for Progress Recovery

Enable caching to resume translation progress after failures:

llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # Translations are cached here
)
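
Because translated chunks are cached under cache_path, a failed run can be retried with the same cache directory and previously completed work is reused (a minimal sketch, assuming the cache persists between calls; replace the broad except with the library's specific exceptions if you know them):

for attempt in range(3):
    try:
        translate(
            source_path=Path("source.epub"),
            target_path=Path("translated.epub"),
            target_language=language.ENGLISH,
            llm=llm,  # the cache-enabled LLM configured above
        )
        break
    except Exception as error:
        print(f"Attempt {attempt + 1} failed: {error}")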

Related Projects

PDF Craft

PDF Craft converts PDF files into EPUB and other formats, with a focus on scanned books. Combine PDF Craft with EPUB Translator to convert and translate scanned PDF books into bilingual EPUB format.

Workflow: Scanned PDF → [PDF Craft] → EPUB → [EPUB Translator] → Bilingual EPUB

For a complete tutorial, watch: Convert scanned PDF books to EPUB format and translate them into bilingual books

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support
