Releases: Huanshere/VideoLingo

v2.0

17 Nov 09:02
495e407

V 2.0

🚀 New Features:

  • The default LLM is now SiliconFlow's Qwen-72B, which works exceptionally well!
  • Integrated SiliconFlow's Fish TTS voice-cloning dubbing, so a single API key is now enough to run VideoLingo!
  • Fixed the mismatch between dubbing and subtitles: dubbing is now generated from the on-screen subtitles!
  • Introduced OneKeyInstall to simplify installation, so no coding knowledge is needed anymore!

🐛 Bug Fixes:

  • Fixed a critical error causing subtitles to become misaligned after multiple splits.
  • Resolved the issue with Rich box import.
  • Fixed FFmpeg error.
  • Replaced MoviePy due to an error it caused.

🔧 Improvements:

  • Enhanced code readability.
  • Added checks for Hugging Face's mirror sites.
  • Modified the audio extraction method to retain the original audio, compressing only during Whisper processing to improve audio quality.
  • TTS requests are now executed in batches.
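
The audio-extraction change above can be pictured as two separate FFmpeg invocations: one that copies the original audio stream untouched, and one that produces a small copy used only for transcription. This is a minimal sketch, not VideoLingo's actual code: the function names, the 16 kHz mono setting, and the 32k bitrate are assumptions chosen for illustration. Only standard FFmpeg options (`-vn`, `-c:a copy`, `-ac`, `-ar`, `-b:a`) are used.

```python
from pathlib import Path


def original_audio_cmd(video: Path, out: Path) -> list[str]:
    """Build a command that copies the audio stream without re-encoding.

    The output container must accept the source codec; a Matroska audio
    file (.mka) is a permissive choice (assumption for this sketch).
    """
    return ["ffmpeg", "-y", "-i", str(video), "-vn", "-c:a", "copy", str(out)]


def whisper_audio_cmd(audio: Path, out: Path, sample_rate: int = 16000) -> list[str]:
    """Build a command that makes a compressed mono copy for transcription only."""
    return [
        "ffmpeg", "-y", "-i", str(audio),
        "-vn",                    # drop any video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample (16 kHz assumed here)
        "-b:a", "32k",            # low bitrate; hypothetical value
        str(out),
    ]
```

Either list can be passed to `subprocess.run(cmd, check=True)`. Keeping the lossless copy separate means later dubbing and mixing steps never see the compressed version.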

v1.8.0

13 Nov 16:25

Release Notes

🔧 Improvements:

  • Refined and optimized the overall code structure for better performance and maintainability.
  • Enhanced the prompt to be more concise and applicable to a wider range of models.
  • Improved fuzzy and precise matching in translation processes.

🐛 Bug Fixes:

  • Fixed several errors occurring during the FFmpeg compression process.
  • Resolved phrase errors caused by missing initialization and by null results returned from model translations.

📝 Updates:

  • Demucs vocal separation is no longer performed by default before transcription, addressing the issue of missing sentences and improving processing speed.
  • Removed support for the whisperX Replicate API to keep the open-source project simple.
  • Adjusted the translation process to handle smaller segments, reducing the likelihood of errors.
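
As an illustration of translating in smaller segments, the sketch below groups subtitle lines into chunks that stay under a character budget, so each model call handles less text. The function name `chunk_lines` and the 600-character default are hypothetical. The release notes do not say how VideoLingo sizes its segments.

```python
def chunk_lines(lines, max_chars=600):
    """Group consecutive lines into chunks whose total length stays within max_chars.

    A single line longer than max_chars still gets its own chunk rather
    than being dropped or split.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks limit how much output is lost when one translation call fails or returns a malformed result.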

v1.7.1

11 Nov 07:09

Release Notes

🚀 New Features:

  • Added MPS support for Demucs to improve performance.
  • Implemented an error retry mechanism for batch processing.
  • Set the default to use the Gemini model and updated documentation accordingly.
  • Added an auto-update feature for yt-dlp.

🐛 Bug Fixes:

  • Corrected a long video segmentation error.
  • Fixed loading issues with the local Chinese Whisper model.
  • Improved audio splitting robustness and encoding handling.
  • Resolved issues with handling reference audio prerequisites for GPT-SoVITS batch processing.
  • Translation failures are now retried correctly.
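
The retry behavior mentioned for batch processing and translation failures can be sketched with a small helper. This is a generic pattern under stated assumptions, not the project's implementation: `with_retry`, its parameters, and the linear back-off are all hypothetical.

```python
import time


def with_retry(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times and return its first successful result.

    Waits delay * attempt_number seconds between tries, and raises
    RuntimeError (chained to the last error) if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # a real pipeline would catch narrower errors
            last_error = exc
            if attempt < attempts:
                time.sleep(delay * attempt)
    raise RuntimeError(f"failed after {attempts} attempts") from last_error
```

A translation step could then be wrapped as `with_retry(lambda: translate(chunk))`, where `translate` stands in for whatever function performs the model call.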

🔧 Improvements:

  • Updated the CPU-specific torch version in the installation process.
  • Simplified the prompt reasoning chain, since the longer chain brought only minimal improvement.

📝 Updates:

  • Removed the install option in the OneKey batch script.
  • Added a section for the SaaS website in the documentation.

v1.7.0

30 Oct 10:14
eb7c8fd

🚀 New Features:

  • Enabled GPU acceleration for FFmpeg encoding (5x speed boost)
  • Replaced UVR with Demucs for vocal isolation (5x speed boost)
  • Automated torch version selection based on GPU in install.py
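
One way an installer might choose a torch build based on the GPU is sketched below. It is an assumption-laden illustration and not the contents of `install.py`: the detection via `nvidia-smi`, the function names, and the choice of the CUDA 11.8 wheel index are all hypothetical. The two index URLs are PyTorch's public wheel indexes, and `--index-url` is a standard pip option.

```python
import shutil

CUDA_INDEX = "https://download.pytorch.org/whl/cu118"  # assumed CUDA build
CPU_INDEX = "https://download.pytorch.org/whl/cpu"


def has_nvidia_gpu() -> bool:
    """Rough heuristic: treat a visible nvidia-smi binary as 'NVIDIA GPU present'."""
    return shutil.which("nvidia-smi") is not None


def torch_install_args(gpu: bool) -> list[str]:
    """Return pip arguments that install torch from the matching wheel index."""
    index = CUDA_INDEX if gpu else CPU_INDEX
    return ["pip", "install", "torch", "torchaudio", "--index-url", index]
```

Calling `torch_install_args(has_nvidia_gpu())` yields a command list suitable for `subprocess.run`. A real installer would likely also inspect the driver or GPU model before picking a CUDA version.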

🐛 Bug Fixes:

  • Resolved local whisperX video segmentation issue
  • Fixed support for uppercase file extensions
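
Uppercase extensions such as `.MP4` are typically handled by normalizing the suffix before comparing it with an allow-list. The sketch below shows that idea. The `ALLOWED` set and the function name are hypothetical and do not reflect the project's real list of formats.

```python
from pathlib import Path

# Hypothetical allow-list, for illustration only.
ALLOWED = {".mp4", ".mov", ".mkv", ".m4a", ".mp3", ".wav"}


def is_supported(filename: str) -> bool:
    """Compare the lower-cased suffix so 'CLIP.MP4' and 'clip.mp4' behave the same."""
    return Path(filename).suffix.lower() in ALLOWED
```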

🔧 Improvements:

  • Simplified code structure
  • Improved spacing between Chinese and English in subtitles with autocorrect
  • Streamlined PyPI sources to official and Tsinghua mirrors
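
The spacing improvement refers to inserting a space wherever Chinese characters sit directly next to Latin letters or digits. The release uses autocorrect for this. The regex below is only a simplified stand-in that shows the effect, not that tool's behavior or API, and it covers just the basic CJK Unified Ideographs range.

```python
import re

_CJK = r"[\u4e00-\u9fff]"   # basic CJK Unified Ideographs only
_LATIN = r"[A-Za-z0-9]"


def add_cjk_spacing(text: str) -> str:
    """Insert a single space at every CJK/Latin boundary that lacks one."""
    text = re.sub(f"({_CJK})({_LATIN})", r"\1 \2", text)
    text = re.sub(f"({_LATIN})({_CJK})", r"\1 \2", text)
    return text
```

For example, `add_cjk_spacing("使用FFmpeg压制")` returns `"使用 FFmpeg 压制"`, and text that is already spaced is left unchanged.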

📝 Updates:

  • One-click package and free test key are no longer provided
  • Commercial SaaS version will be released tomorrow

v1.6.4

17 Oct 07:32

🚀 New Features:

  • Added m4a file support
  • Automated Chinese transcription model download
  • Optimized multilingual PyPI source selection

🐛 Bug Fixes:

  • Fixed uppercase file extension issue
  • Adjusted summary length to 4k characters

📝 Updates:

  • New logo and documentation improvements
  • Model uploaded to Docker Hub

v1.6.3

12 Oct 11:03
978ca2a
  • 🌐 Simplified the implementation of multilingual support. Chinese users, please apply the localization patch as described in the installation documentation 🇨🇳. Support for other languages is still to be done 📝

  • 📚 Updated the technical documentation on the official website

v1.6.2

10 Oct 02:54

📢 Announcement: Starting from this version, the open-source edition will only receive stability updates. Our commercial SaaS version is coming soon, stay tuned!

🎉 New Features

  • Added support for audio file uploads
  • Added Docker support

🛠️ Stability Fixes

  • Fixed upload-related bugs
  • Fixed audio format processing issues
  • Optimized configuration file import
  • Fixed OpenAI TTS configuration issues
  • Fixed and optimized ffmpeg-related issues
  • Added int8 support for older GPUs
  • Optimized pip source selection and Hugging Face mirror choice during installation
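
The int8 item can be read as a fallback for GPUs that handle half precision poorly. Whisper backends built on CTranslate2 accept compute types such as `float16` and `int8`, so a selection rule might look like the sketch below. The function name, the compute-capability threshold of 7.0, and the choice of int8 on CPU are assumptions for illustration, not the project's documented logic.

```python
def pick_compute_type(capability):
    """Choose a compute type from a CUDA compute capability.

    capability: (major, minor) tuple, or None when no CUDA GPU is available.
    """
    if capability is None:
        return "int8"        # assumed CPU default
    if capability >= (7, 0):
        return "float16"     # assumed threshold for efficient half precision
    return "int8"            # older GPUs fall back to 8-bit
```

With PyTorch installed, the capability tuple could come from `torch.cuda.get_device_capability()`.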

v1.6.1

08 Oct 08:36

Enhance System Stability

  • Revert FFmpeg installation due to encountered issues
  • Resolve SoVITS configuration import problems
  • Address YAML installation errors
  • Replace Librosa with FFmpeg to mitigate compatibility issues
  • Boost stability in batch processing mode

v1.6

07 Oct 06:51
  • Refactored config.py into config.yaml, with corresponding restructuring of the codebase. The UI no longer requires clicking a save button.

  • Automatic source selection during local installation

  • Fixed an issue where language settings in batch mode had no effect

  • Improved stability for accessing Replicate

v1.5.1

06 Oct 15:27
40c0056
  • Emergency fix for two batch-mode bugs: the NaN bug triggered by blank fields and the zh check bug

  • Fixed an issue where the language set in the task settings of version 1.5 had no effect

  • Added mirror source selection for installation steps
