


Note

🌟 If this project helps you, please remember to give it a Star 🌟 for support!

📝 The Large model is recommended for a better experience.

📖 Installation Guide | ❓ FAQ | 💬 Telegram Group


Project Introduction

Chenyme-AAVT (Automatic Video Translation) aims to provide a simple, efficient, and free automated workflow for media recognition and translation, helping you quickly recognize, translate, and process subtitles for audio and video. Beyond speech recognition and translation, the project can also generate marketing blog content from videos and translate standalone subtitle files. Future plans include more tools built on the existing features, such as real-time recognition, lip-sync correction, voice cloning, and timbre differentiation. Stay tuned!

Overview of the basic supported features (not exhaustive):

[Image: feature overview diagram]


Project Highlights

📃 TODO | Tasks

Recognition

  • Switched to the Faster-Whisper project
  • Supports local model loading
  • Supports personal fine-tuning of Whisper models
  • VAD-assisted optimization
  • Word-level sentence segmentation optimization
  • More language recognition

Translation

  • Translation optimization
  • More language translations
  • More translation models
  • More translation engines
  • Supports local large language model translation

Subtitles

  • Personalized subtitles
  • More subtitle formats
  • Subtitle preview, real-time editing
  • Automated subtitle text proofreading
  • Dual subtitles

Other

  • AI Assistant
  • Video preview
  • Blog generation from videos
  • Real-time voice translation
  • Chinese voiceover for videos
  • Timbre differentiation
  • Voice cloning
  • Lip-sync correction

  • Supports recognition and translation of multiple languages
  • Supports fully local, free deployment of the entire workflow
  • Supports one-click generation of blog and marketing content from videos
  • Supports automated translation, post-editing of subtitles, and video preview
  • Supports GPU acceleration, VAD assistance, and FFmpeg acceleration
  • Supports large models such as ChatGPT, Claude, Gemini, and DeepSeek as translation engines
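
To make the recognition-to-subtitle step listed above more concrete, here is a minimal sketch. It is not the project's own code. It assumes the `faster-whisper` package is installed, and the function names (`format_timestamp`, `segments_to_srt`, `transcribe_to_srt`) and the default model size are hypothetical choices made for illustration.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT timestamp format HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


def transcribe_to_srt(media_path, model_size="large-v3", device="cuda"):
    """Transcribe a media file with faster-whisper and return SRT text.

    Assumption: faster-whisper is installed (pip install faster-whisper).
    The import is done lazily so the helpers above work without it.
    """
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device)
    # vad_filter=True skips long silences before decoding.
    segments, _info = model.transcribe(media_path, vad_filter=True)
    return segments_to_srt((s.start, s.end, s.text) for s in segments)
```

For CPU-only machines, passing `device="cpu"` and a smaller model size would be the natural adjustment, at the cost of speed and accuracy.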

Windows Deployment

👉 Prerequisites: Python, FFmpeg, CUDA Instructions

Python | 📖 Guide

  • 💡 Choose Python version > 3.8
  • Go to the official Python website to download the installer
  • Run the installation and make sure to check the ADD TO PATH option

FFmpeg | 📖 Guide

  • 💡 If you are unsure how to install or compile FFmpeg, download the Win version from the project’s Release page, which bundles a pre-compiled FFmpeg
  • Otherwise, go to the official FFmpeg website and download the pre-compiled Windows build
  • Add the FFmpeg directory to your PATH environment variable

CUDA (Skip for CPU) | 📖 Guide

  • 💡 Recommended versions are CUDA 11.8, 12.1, 12.4
  • Go to the CUDA website to download the installer
  • Install CUDA
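
Before running the deployment script, the prerequisites above can be checked with a short script. This is a minimal sketch, not part of the project: the function name `check_prerequisites` is hypothetical, it treats Python 3.8 itself as acceptable (the README says "> 3.8", so tighten the minimum if needed), and it assumes the CUDA Toolkit exposes `nvcc` on PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        # Assumption: 3.8 is accepted; pass min_python=(3, 9) to be stricter.
        "python": sys.version_info[:2] >= min_python,
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # nvcc ships with the CUDA Toolkit; it is optional for CPU-only use.
        "cuda (nvcc)": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `cuda (nvcc)` entry is not an error if you plan to run on CPU only.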

 


‼️ Make sure the prerequisites are ready before proceeding to the following steps‼️

1. Run Deployment Script

  • Go to the Release page to download the latest Win version (Win/Small)
  • Run 1_Install.bat and wait for the script to finish checking your environment
  • Once the checks pass, follow the prompts to choose which version to install

2. Run the Project Web

  • Run 2_WebUI.bat
  • When prompted, enter chenymeaavt to access the project (this is a protection feature added in the new version and can be turned off)

 

ℹ️ The WebUI will launch automatically. If it doesn’t, open localhost:8501 in your browser.
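
If the browser does not open, it can help to confirm that something is actually listening on port 8501 before troubleshooting further. The following is a small sketch using only the standard library. The function name `webui_is_up` is hypothetical, and it only tests that a TCP connection succeeds, not that the page renders correctly.

```python
import socket


def webui_is_up(host="localhost", port=8501, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    if webui_is_up():
        print("WebUI is reachable at localhost:8501")
    else:
        print("Nothing is listening on localhost:8501 yet")
```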


Mac OS Deployment

👉 Prerequisites: Python, Brew Instructions

Python

  • 💡 Choose Python version > 3.8
  • Go to the Python website to download the PKG installer
  • Run the installer and choose the standard installation when prompted

Brew

  • 💡 Use the following command for one-click installation of brew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

 


‼️ Make sure the prerequisites are ready before proceeding to the following steps‼️

1. Install FFmpeg

brew install ffmpeg

2. Install Project Dependencies

  • Go to the Release page to download the latest Mac version (Mac/Small)
  • cd to the project root directory
pip3 install -r requirements.txt

3. Run the Project Web

streamlit run Chenyme-AAVT.py
  • When prompted, enter chenymeaavt to access the project (this is a protection feature added in the new version and can be turned off)
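
If the `streamlit` command is not found after installing the requirements, one workaround is to start it through the same Python interpreter that holds the dependencies. The sketch below is an illustration rather than project code: `build_webui_command` and `launch_webui` are hypothetical helper names, and the port is passed explicitly through Streamlit's `--server.port` option.

```python
import subprocess
import sys


def build_webui_command(script="Chenyme-AAVT.py", port=8501):
    """Build the command that starts the WebUI via `python -m streamlit`."""
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.port", str(port),
    ]


def launch_webui(script="Chenyme-AAVT.py", port=8501):
    """Run the WebUI in the foreground; call this from the project root."""
    subprocess.run(build_webui_command(script, port), check=True)
```

Calling `launch_webui()` from the project root should behave like the `streamlit run Chenyme-AAVT.py` command shown above.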

 

ℹ️ The WebUI will launch automatically. If it doesn’t, open localhost:8501 in your browser.


Docker Deployment

💡 Currently, the latest project version is V0.9.0. The Docker method is for version V0.8.x.

Thanks to @Eisaichen for providing this version

For detailed usage, please refer to: 📖 eisai/chenyme-aavt

docker pull eisai/chenyme-aavt

Other Deployment Methods

Google Colab Deployment

Thanks to @Kirie233 for providing this version

For detailed usage instructions, please refer to: Open In Colab



Linux Deployment

I have not tested Linux deployment yet, because my computer is currently at school. However, it should work once FFmpeg and CUDA are set up.






Interface screenshots (images not reproduced here):

  • Homepage BOT
  • Some Settings
  • Audio Recognition
  • Video Recognition
  • Blog Generation
  • Subtitle Translation
  • Voice Simulation


Acknowledgements

I have greatly benefited from the AI era, and this project has largely been realized by standing on the shoulders of giants. Thanks to the open-source spirit, and thanks to the developers of OpenAI, Streamlit, FFmpeg, Faster-whisper, and more!