Week 17 assignment: Langchain glm translator #101

Open. Wants to merge 4 commits into base: main
1,348 changes: 674 additions & 674 deletions openai-translator/LICENSE

Large diffs are not rendered by default.

200 changes: 93 additions & 107 deletions openai-translator/README-CN.md
@@ -1,107 +1,93 @@
# OpenAI-Translator

<p align="center">
<br> <a href="README.md"> English </a> | 中文
</p>
<p align="center">
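<em>All the code and documentation are entirely generated by OpenAI's GPT-4 model</em>
</p>

## Introduction

OpenAI Translator is a tool that uses AI to translate English PDF books into Chinese. It uses large language models (LLMs) such as ChatGLM and OpenAI's GPT-3 and GPT-3.5 Turbo for translation. It is built in Python and has a flexible, modular, object-oriented design.

## Why this project

There is currently a lack of non-commercial yet effective PDF translation tools. Many users have PDF files containing sensitive data and prefer not to upload them to public commercial service websites, in order to protect their privacy. This project addresses that gap by providing a solution for users who need to translate their PDF files while keeping their data private.

## Sample Results

OpenAI Translator is still in the early stages of development, and I am actively adding more features and improving its performance. Any feedback or contributions are very welcome!

![The_Old_Man_of_the_Sea](images/sample_image_0.png)

<p align="center">
<em>"The Old Man and the Sea"</em>
</p>

## Features

- [X] Translate English PDF books into Chinese using large language models (LLMs).
- [X] Support for ChatGLM and OpenAI models.
- [X] Flexible configuration through a YAML file or command-line arguments.
- [X] Timeouts and error handling for robust translation operations.
- [X] Modular, object-oriented design that is easy to customize and extend.
- [ ] Implement a graphical user interface (GUI) for easier use.
- [ ] Add support for batch processing of multiple PDF files.
- [ ] Create a web service or API for use in web applications.
- [ ] Add support for other languages and translation directions.
- [ ] Add support for preserving the original layout and formatting of the source PDF.
- [ ] Improve translation quality by using custom-trained translation models.


## Getting Started

### Environment Setup

1. Clone the repository: `git clone git@github.com:DjangoPeng/openai-translator.git`.

2. OpenAI-Translator requires Python 3.6 or later. Install the dependencies with `pip install -r requirements.txt`.

3. Set your OpenAI API key (`$OPENAI_API_KEY`) or ChatGLM model URL (`$GLM_MODEL_URL`). You can add it to your environment variables or specify it in the config.yaml file.

### Usage

You can use OpenAI-Translator either by specifying a configuration file or by providing command-line arguments.

#### Using a configuration file

Adjust the `config.yaml` file to match your settings: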

```yaml
OpenAIModel:
  model: "gpt-3.5-turbo"
  api_key: "your_openai_api_key"

GLMModel:
  model_url: "your_chatglm_model_url"
  timeout: 300

common:
  book: "test/test.pdf"
  file_format: "markdown"
```

Then run the tool directly from the command line:

```bash
python ai_translator/main.py
```

![sample_out](images/sample_image_1.png)

#### 使用命令行参数

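You can also specify the settings directly on the command line. Here is an example that uses the OpenAI model:

```bash
# Set your api_key as an environment variable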
export OPENAI_API_KEY="sk-xxx"
python ai_translator/main.py --model_type OpenAIModel --openai_api_key $OPENAI_API_KEY --file_format markdown --book tests/test.pdf --openai_model gpt-3.5-turbo
```

Here is an example that uses the GLM model:

```bash
# Set your GLM model URL as an environment variable
export GLM_MODEL_URL="http://xxx:xx"
python ai_translator/main.py --model_type GLMModel --glm_model_url $GLM_MODEL_URL --book tests/test.pdf
```

## License

This project is licensed under the GPL-3.0 License. See the [LICENSE](LICENSE) file for details.




# OpenAI-Translator

<p align="center">
<br> <a href="README.md"> English </a> | 中文
</p>
<p align="center">
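<em>All the code and documentation are entirely generated by OpenAI's GPT-4 model</em>
</p>

## Introduction

OpenAI Translator is a tool that uses AI to translate English PDF books into Chinese. It uses large language models (LLMs) such as ChatGLM and OpenAI's GPT-3 and GPT-3.5 Turbo for translation. It is built in Python and has a flexible, modular, object-oriented design.

## Why this project

There is currently a lack of non-commercial yet effective PDF translation tools. Many users have PDF files containing sensitive data and prefer not to upload them to public commercial service websites, in order to protect their privacy. This project addresses that gap by providing a solution for users who need to translate their PDF files while keeping their data private.

## Sample Results

OpenAI Translator is still in the early stages of development, and I am actively adding more features and improving its performance. Any feedback or contributions are very welcome!

![The_Old_Man_of_the_Sea](images/sample_image_0.png)

<p align="center">
<em>"The Old Man and the Sea"</em>
</p>

## Features

- [X] Translate English PDF books into Chinese using large language models (LLMs).
- [X] Support for ChatGLM and OpenAI models.
- [X] Flexible configuration through a YAML file or command-line arguments.
- [X] Timeouts and error handling for robust translation operations.
- [X] Modular, object-oriented design that is easy to customize and extend.
- [x] Add support for other languages and translation directions.
- [ ] Implement a graphical user interface (GUI) for easier use.
- [ ] Create a web service or API for use in web applications.
- [ ] Add support for batch processing of multiple PDF files.
- [ ] Add support for preserving the original layout and formatting of the source PDF.
- [ ] Improve translation quality by using custom-trained translation models.


## Getting Started

### Environment Setup

1. Clone the repository: `git clone git@github.com:DjangoPeng/openai-translator.git`.

2. OpenAI-Translator requires Python 3.10 or later. Install the dependencies with `pip install -r requirements.txt`.

3. Set your OpenAI API key (`$OPENAI_API_KEY`). You can add it to your environment variables or specify it in the config.yaml file.

### Usage

You can use the OpenAI-Translator tool either by specifying a configuration file or by providing command-line arguments.

#### Using a configuration file

Adjust the `config.yaml` file to match your settings: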

```yaml
model_name: "gpt-3.5-turbo"
input_file: "tests/test.pdf"
output_file_format: "markdown"
source_language: "English"
target_language: "Chinese"
```

Then run the tool directly from the command line:

```bash
python ai_translator/main.py
```

![sample_out](images/sample_image_1.png)

#### 使用命令行参数

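You can also specify the settings directly on the command line. Here is an example that uses the OpenAI model:

```bash
# Set your api_key as an environment variable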
export OPENAI_API_KEY="sk-xxx"
python ai_translator/main.py --model_name "gpt-3.5-turbo" --input_file "your_input.pdf" --output_file_format "markdown" --source_language "English" --target_language "Chinese"
```

## License

This project is licensed under the GPL-3.0 License. See the [LICENSE](LICENSE) file for details.




192 changes: 89 additions & 103 deletions openai-translator/README.md
@@ -1,104 +1,90 @@
# OpenAI-Translator

<p align="center">
<br> English | <a href="README-CN.md">中文</a>
</p>
<p align="center">
<em>All the code and documentation are entirely generated by OpenAI's GPT-4 Model</em>
</p>


## Introduction

OpenAI Translator is an AI-powered translation tool designed to translate English PDF books to Chinese. The tool leverages large language models (LLMs) like ChatGLM and OpenAI's GPT-3 and GPT-3.5 Turbo for translation. It's built in Python and has a flexible, modular, and object-oriented design.

## Why this project

In the current landscape, there's a lack of non-commercial yet efficient PDF translation tools. Many users have PDF documents with sensitive data that they prefer not to upload to public commercial service websites due to privacy concerns. This project was developed to address this gap, providing a solution for users who need to translate their PDFs while maintaining data privacy.

## Sample Results

OpenAI Translator is still in the early stages of development, and I'm actively working on adding more features and improving its performance. Any feedback or contributions are welcome!

![The_Old_Man_of_the_Sea](images/sample_image_0.png)

<p align="center">
<em>"The Old Man and the Sea"</em>
</p>

## Features

- [X] Translation of English PDF books to Chinese using LLMs.
- [X] Support for both [ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [OpenAI](https://platform.openai.com/docs/models) models.
- [X] Flexible configuration through a YAML file or command-line arguments.
- [X] Timeouts and error handling for robust translation operations.
- [X] Modular and object-oriented design for easy customization and extension.
- [ ] Implement a graphical user interface (GUI) for easier use.
- [ ] Add support for batch processing of multiple PDF files.
- [ ] Create a web service or API to enable usage in web applications.
- [ ] Add support for other languages and translation directions.
- [ ] Add support for preserving the original layout and formatting of the source PDF.
- [ ] Improve translation quality by using custom-trained translation models.


## Getting Started

### Environment Setup

1. Clone the repository: `git clone git@github.com:DjangoPeng/openai-translator.git`.

2. `OpenAI-Translator` requires Python 3.6 or later. Install the dependencies with `pip install -r requirements.txt`.

3. Set up your OpenAI API key (`$OPENAI_API_KEY`) or ChatGLM model URL (`$GLM_MODEL_URL`). You can either add it to your environment variables or specify it in the config.yaml file.

### Usage

You can use OpenAI-Translator either by specifying a configuration file or by providing command-line arguments.

#### Using a configuration file:

Adapt the `config.yaml` file to your settings:

```yaml
OpenAIModel:
  model: "gpt-3.5-turbo"
  api_key: "your_openai_api_key"

GLMModel:
  model_url: "your_chatglm_model_url"
  timeout: 300

common:
  book: "test/test.pdf"
  file_format: "markdown"
```

Then run the tool:

```bash
python ai_translator/main.py
```

![sample_out](images/sample_image_1.png)

#### Using command-line arguments:

You can also specify the settings directly on the command line. Here's an example of how to use the OpenAI model:

```bash
# Set your api_key as an env variable
export OPENAI_API_KEY="sk-xxx"
python ai_translator/main.py --model_type OpenAIModel --openai_api_key $OPENAI_API_KEY --file_format markdown --book tests/test.pdf --openai_model gpt-3.5-turbo
```

And an example of how to use the GLM model:

```bash
# Set your GLM Model URL as an env variable
export GLM_MODEL_URL="http://xxx:xx"
python ai_translator/main.py --model_type GLMModel --glm_model_url $GLM_MODEL_URL --book tests/test.pdf
```

## License

# OpenAI-Translator

<p align="center">
<br> English | <a href="README-CN.md">中文</a>
</p>
<p align="center">
<em>All the code and documentation are entirely generated by OpenAI's GPT-4 Model</em>
</p>


## Introduction

OpenAI Translator is an AI-powered translation tool designed to translate English PDF books to Chinese. The tool leverages large language models (LLMs) like ChatGLM and OpenAI's GPT-3 and GPT-3.5 Turbo for translation. It's built in Python and has a flexible, modular, and object-oriented design.

## Why this project

In the current landscape, there's a lack of non-commercial yet efficient PDF translation tools. Many users have PDF documents with sensitive data that they prefer not to upload to public commercial service websites due to privacy concerns. This project was developed to address this gap, providing a solution for users who need to translate their PDFs while maintaining data privacy.

## Sample Results

OpenAI Translator is still in the early stages of development, and I'm actively working on adding more features and improving its performance. Any feedback or contributions are welcome!

![The_Old_Man_of_the_Sea](images/sample_image_0.png)

<p align="center">
<em>"The Old Man and the Sea"</em>
</p>

## Features

- [X] Translation of English PDF books to Chinese using LLMs.
- [X] Support for both [ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [OpenAI](https://platform.openai.com/docs/models) models.
- [X] Flexible configuration through a YAML file or command-line arguments.
- [X] Timeouts and error handling for robust translation operations.
- [X] Modular and object-oriented design for easy customization and extension.
- [x] Add support for other languages and translation directions.
- [ ] Implement a graphical user interface (GUI) for easier use.
- [ ] Create a web service or API to enable usage in web applications.
- [ ] Add support for batch processing of multiple PDF files.
- [ ] Add support for preserving the original layout and formatting of the source PDF.
- [ ] Improve translation quality by using custom-trained translation models.


## Getting Started

### Environment Setup

1. Clone the repository: `git clone git@github.com:DjangoPeng/openai-translator.git`.

2. `OpenAI-Translator` requires Python 3.10 or later. Install the dependencies with `pip install -r requirements.txt`.

3. Set up your OpenAI API key (`$OPENAI_API_KEY`). You can either add it to your environment variables or specify it in the config.yaml file.
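
Step 3 allows the key to come from either the environment or `config.yaml`. The README does not say which source takes precedence or what the config key is called, so the following is only a minimal sketch: it assumes the environment variable wins, and both the `resolve_api_key` helper and the `openai_api_key` config key are hypothetical names used for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(config: Mapping[str, str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, preferring the environment over the config file."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    # "openai_api_key" is a hypothetical config key; the project's real
    # key name may differ.
    key = env.get("OPENAI_API_KEY") or config.get("openai_api_key")
    if not key:
        raise RuntimeError(
            "No API key found: set $OPENAI_API_KEY or add it to config.yaml"
        )
    return key
```

Failing early with a clear message, rather than letting the first API call fail, tends to make a missing key easier to diagnose.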

### Usage

You can use OpenAI-Translator either by specifying a configuration file or by providing command-line arguments.

#### Using a configuration file:

Adapt the `config.yaml` file to your settings:

```yaml
model_name: "gpt-3.5-turbo"
input_file: "tests/test.pdf"
output_file_format: "markdown"
source_language: "English"
target_language: "Chinese"
```

Then run the tool:

```bash
python ai_translator/main.py
```

![sample_out](images/sample_image_1.png)
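
To illustrate how the five keys in the sample `config.yaml` might be consumed, here is a hedged sketch that validates an already-parsed mapping (for instance the result of PyYAML's `yaml.safe_load`). The `TranslationConfig` class and its `from_mapping` method are hypothetical and are not taken from the project's source; only the key names come from the sample above.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class TranslationConfig:
    """Mirrors the five keys shown in the sample config.yaml."""
    model_name: str
    input_file: str
    output_file_format: str
    source_language: str
    target_language: str

    @classmethod
    def from_mapping(cls, data: Mapping[str, Any]) -> "TranslationConfig":
        expected = [f.name for f in fields(cls)]
        missing = [name for name in expected if name not in data]
        if missing:
            raise ValueError(
                f"config.yaml is missing keys: {', '.join(missing)}"
            )
        # Unknown keys are ignored rather than treated as errors.
        return cls(**{name: str(data[name]) for name in expected})


# Usage with the values from the sample config:
config = TranslationConfig.from_mapping({
    "model_name": "gpt-3.5-turbo",
    "input_file": "tests/test.pdf",
    "output_file_format": "markdown",
    "source_language": "English",
    "target_language": "Chinese",
})
```

Validating the whole file up front means a typo in a key name surfaces before any translation request is made.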

#### Using command-line arguments:

You can also specify the settings directly on the command line. Here's an example of how to use the OpenAI model:

```bash
# Set your api_key as an env variable
export OPENAI_API_KEY="sk-xxx"
python ai_translator/main.py --model_name "gpt-3.5-turbo" --input_file "your_input.pdf" --output_file_format "markdown" --source_language "English" --target_language "Chinese"
```
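
The flags in the command above match the keys in `config.yaml`, which suggests the two sources are merged. As a rough sketch only, the snippet below uses `argparse` to parse those five flags and lets any flag that was actually given override the config value. The precedence rule and the `build_parser` and `merge_settings` helpers are assumptions, not the project's confirmed behavior.

```python
import argparse
from typing import Any, Dict, List, Optional

# Flag names taken from the command-line example above.
FLAGS = ("model_name", "input_file", "output_file_format",
         "source_language", "target_language")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Translate a PDF with an LLM.")
    for flag in FLAGS:
        # default=None distinguishes "flag not given" from an explicit value.
        parser.add_argument(f"--{flag}", default=None)
    return parser


def merge_settings(config: Dict[str, Any],
                   argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Overlay command-line values on top of the config-file values."""
    args = vars(build_parser().parse_args(argv))
    overrides = {name: value for name, value in args.items() if value is not None}
    return {**config, **overrides}


# Usage: the config supplies defaults, the command line overrides one of them.
merged = merge_settings(
    {"model_name": "gpt-3.5-turbo", "target_language": "Chinese"},
    ["--target_language", "Japanese"],
)
```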

## License

This project is licensed under the GPL-3.0 License. See the [LICENSE](LICENSE) file for details.