Releases: Huanshere/VideoLingo

v1.5

06 Oct 10:21

Huanshere

Major Updates:

Added batch processing functionality
Simplified JSON key structure in prompts

Minor Improvements:

Support for running in 6 GB GPU memory environments
Improved empty line detection in NLP step
Fixed font issues in Linux environments
Handled phrase alignment errors and added a prompt to retry UVR
Limited file upload size to 500MB in Streamlit
Made video compression a separate optional step
Restructured README documentation

v1.4.1

02 Oct 12:18

Huanshere

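Added Colab support!
Added proxy configuration
Pinned the environment dependencies and the numpy version for better stability
Fixed oaitts API usage issues and Mac path issues
Improved text processing: overly long words and empty lines are now removed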

v1.4

30 Sep 14:40

Huanshere

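Major Updates:

The official website videolingo.io is live; a free AI assistant for installation and usage questions is available in the bottom-right corner.
Running UVR before transcription is now a configurable option, and the audio is split into 15-minute segments for processing to avoid high memory usage.
Audio is now compressed before being sent to Replicate, so the upload no longer fails with an error.
The trim step now skips automatically when Claude refuses to answer because of sensitive content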
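
The 15-minute splitting mentioned above can be sketched as follows. This is a minimal illustration, not VideoLingo's actual code: the function name `split_into_segments` and its parameters are hypothetical, and it only computes the segment boundaries, leaving the audio cutting itself to whatever tool the pipeline uses.

```python
from typing import List, Tuple


def split_into_segments(
    total_seconds: float, segment_seconds: float = 15 * 60
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs that cover the audio in fixed-size segments.

    Hypothetical helper: the 15-minute default comes from the release note,
    everything else is an assumption made for illustration.
    """
    total_seconds = float(total_seconds)
    if total_seconds <= 0 or segment_seconds <= 0:
        return []
    segments = []
    start = 0.0
    while start < total_seconds:
        # The last segment is shorter when the duration is not a multiple
        # of the segment length.
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments
```

Processing one segment at a time keeps peak memory bounded by the segment length rather than by the length of the whole recording.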

Bug Fixes:

Fixed several bugs on Mac
Fixed a bug where lines containing only punctuation and spaces were not recognized as empty lines
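
One way to treat punctuation-only and whitespace-only lines as empty is sketched below. The helper name `is_effectively_empty` is hypothetical, and using Unicode general categories is an assumption about how such a check could be written, not a description of the project's implementation.

```python
import unicodedata


def is_effectively_empty(line: str) -> bool:
    """Return True if the line holds nothing but whitespace and punctuation."""
    for ch in line:
        if ch.isspace():
            continue
        # Unicode general categories starting with "P" are punctuation,
        # which covers both ASCII marks and full-width CJK marks.
        if unicodedata.category(ch).startswith("P"):
            continue
        return False
    return True
```

For example, `is_effectively_empty(" ... ")` and `is_effectively_empty("。，！")` are both true, while `is_effectively_empty("Hi.")` is false.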

v1.3

28 Sep 12:15

Huanshere

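Major Updates:

Added Chinese input support! The model must be downloaded manually and placed in model
Added a UVR vocal separation step for better results
Vocal separation now runs before WhisperX transcription to ensure transcription quality (local WhisperX only)
Improved the dubbing pipeline and fixed audio running longer than the video
Fixed a large number of phrase errors!
Added a check after trim

Detailed Updates:

New features:
- Added Chinese language support
- Added a UVR model to improve audio quality
Performance:
- Set a timeout for GPT requests
- torch is now installed with pip instead of conda
- Local WhisperX runs UVR before transcription to ensure quality
- The batch size for local WhisperX is adjusted automatically based on GPU memory
User experience:
- Shortened the prompt; the number of alternatives went from 3 to 2
- Maximum length updated to 70
- Larger font size, shorter lines
- Added an input language option to the sidebar
- The video name is checked on upload
Dubbing:
- Fixed some issues with Chinese-to-English dubbing
Dependencies:
- Removed most of the ffmpeg dependencies
- ffprobe is no longer required
- Mac users no longer need to install ffmpeg manually
Documentation:
- Updated README.md
- Updated config.example.py

Minor Details:

Removed most of the ffmpeg dependencies, which simplifies installation
Improved the configuration documentation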
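
The automatic batch size adjustment for local WhisperX listed in this release could look roughly like the sketch below. The function name `pick_batch_size` and every threshold in it are hypothetical values chosen for illustration; the release notes do not state the actual mapping.

```python
def pick_batch_size(vram_gb: float) -> int:
    """Map available GPU memory (in GB) to a transcription batch size.

    All thresholds here are assumptions for illustration only.
    """
    if vram_gb >= 16:
        return 16
    if vram_gb >= 8:
        return 8
    if vram_gb >= 6:
        return 4
    # Fall back to a small batch on low-memory GPUs.
    return 2
```

With PyTorch installed, the total memory of the first GPU can be read in bytes from `torch.cuda.get_device_properties(0).total_memory` and divided by `1024 ** 3` before being passed in.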

v1.2

24 Sep 12:40

Huanshere

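Added FishTTS, which produces excellent results
You can now choose the resolution when downloading from YouTube, and the thumbnail is downloaded along with the video!

Minor Details:

Local WhisperX now adjusts the batch size automatically based on GPU memory
Removed Edge TTS because it sounded too artificial.
Improved the GPT-SoVITS configuration documentation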

v1.1.0

23 Sep 04:28

Huanshere

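Added speech rate control (speed 1 to 1.35) to avoid overly fast speech and make the dubbing sound more natural
Added length trimming after translation, which reads better and makes the later dubbing step easier
Changed the translation chunk size for non-Sonnet models to avoid errors with Qwen
Added GPT-SoVITS mode 1: use only the provided reference audio

Minor Details:

Improved the documentation and added instructions for the all-in-one package and for deploying from source
Mac users are now prompted to start SoVITS manually
More thorough validation of the LLM translation format
Changed the config: advanced settings can now only be set in config.py; also optimized the models used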
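
The speed range of 1 to 1.35 introduced in this release can be illustrated with a small clamp. How the required speed is derived is an assumption here (the ratio of generated audio length to the available time slot), and `dubbing_speed` is a hypothetical name.

```python
MIN_SPEED = 1.0   # never slow the speech down
MAX_SPEED = 1.35  # upper bound taken from the release note


def dubbing_speed(audio_seconds: float, slot_seconds: float) -> float:
    """Speed-up factor needed to fit the audio into its slot, clamped to [1.0, 1.35]."""
    if slot_seconds <= 0:
        # No usable slot: fall back to the fastest allowed speed.
        return MAX_SPEED
    needed = audio_seconds / slot_seconds
    return max(MIN_SPEED, min(MAX_SPEED, needed))
```

Capping the factor keeps the voice natural; audio that would still overflow at 1.35 has to be handled some other way, for example by shortening the translated text.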

v1.0 !!!!!

21 Sep 14:27

Huanshere

Dubbing has finally been added, with support for Azure, OpenAI, Edge, and GPT-SoVITS-v2 (beta)
Fixed several bugs; the translation error rate is now lower
Removed whisper_timestamped

v0.8.3

20 Sep 09:17

Huanshere

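Added an option to skip outputting the compressed video. Compression is skipped when you select this option or when the video is longer than 40 minutes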
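
The rule above reduces to a single condition. The sketch below uses hypothetical names (`should_compress`, `skip_requested`); only the 40-minute limit comes from the release note.

```python
MAX_COMPRESS_SECONDS = 40 * 60  # videos longer than 40 minutes are not compressed


def should_compress(skip_requested: bool, video_seconds: float) -> bool:
    """Compress only if the user did not opt out and the video is short enough."""
    return not skip_requested and video_seconds <= MAX_COMPRESS_SECONDS
```

For a 10-minute video with the option left off, `should_compress(False, 600)` is true; selecting the option or passing a duration above 2400 seconds makes it false.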

v0.8.2

20 Sep 07:30

Huanshere

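Added stricter checks on translated content, which avoids empty-line translation errors
The default API is now Qwen2.5 from SiliconFlow, which reaches about 80% of Sonnet's performance! (Sonnet is still the best choice: its output sounds less machine-generated and is more coherent)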

v0.8.1

17 Sep 10:55

Huanshere

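Fixed a bug in the local WhisperX version
Changed the default API provider to the more stable deepbricks, and added recommendations for Qwen and DeepSeek to the installation guide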
