Commit 2ab6ace

create singer
1 parent 503b32e commit 2ab6ace

3 files changed, +46 -5 lines changed

README.md

+24 -4
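@@ -21,6 +21,8 @@

 - [No leakage] Supports multiple speakers

+- [Timbre sculpting] Create your own unique speaker
+
 - [With accompaniment] Conversion also works on audio with light accompaniment

 - [Using Excel] Raw tuning done entirely by hand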
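@@ -29,9 +31,9 @@

 This project will next complete a BIGVGAN-based model (32K); after that, the project will be updated when there are results

-## Models and logs: https://github.com/PlayVoice/so-vits-svc-5.0/releases/tag/v5.3
+## Models and logs: https://github.com/PlayVoice/so-vits-svc-5.0/releases/tag/base_release_hifigan

-- [5.0.epoch1200.full.pth](https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/v5.3/5.0.epoch1200.full.pth) The model contains generator + discriminator = 176M and can be used as a pretrained model
+- [5.0.epoch1200.full.pth](https://github.com/PlayVoice/so-vits-svc-5.0/releases/download/base_release_hifigan/5.0.epoch1200.full.pth) The model contains generator + discriminator = 176M and can be used as a pretrained model
 - The speaker files (56 of them) are in the configs/singers directory and can be used for inference tests, especially for testing timbre leakage
 - Speakers 22, 30, 47, and 51 are relatively distinctive; audio samples are in the configs/singers_sample directory
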
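@@ -42,7 +44,7 @@
 | natural speech | Microsoft || Reduces pronunciation errors | - |
 | neural source-filter | NII || Fixes broken-audio (dropout) problems | Parameter optimization |
 | speaker encoder | Google || Timbre encoding and clustering | - |
-| GRL for speaker | Ubisoft || Prevents the encoder from leaking timbre (泄露) | Works on a principle similar to a discriminator's adversarial training |
+| GRL for speaker | Ubisoft || Prevents the encoder from leaking timbre (泄漏) | Works on a principle similar to a discriminator's adversarial training |
 | one shot vits | Samsung || VITS one-sentence cloning | - |
 | SCLN | Microsoft || Improves cloning | - |
 | band extention | Adobe || 16K to 48K upsampling | Data processing |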
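@@ -60,7 +62,7 @@
 💗 Required preprocessing:
 - 1 Denoising & accompaniment removal
 - 2 Frequency enhancement
-- 3 Audio quality enhancement, based on https://github.com/openvpi/vocoders , to be integrated
+- 3 Audio quality enhancement
 - 4 Cut the audio into segments shorter than 30 seconds, as required by whisper

 Then place the dataset in the dataset_raw directory using the file structure below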
@@ -255,6 +257,24 @@ data_svc/
 | --- | --- | --- | --- | --- | --- | --- | --- |
 | name | config file | model file | timbre file | audio file | audio content | pitch content | pitch shift |

+## Sculpting a timbre
+The name is pure coincidence: average -> ave -> eva; Eve (eva) stands for conception and propagation
+
+> python svc_eva.py
+
+```python
+eva_conf = {
+    './configs/singers/singer0022.npy': 0,
+    './configs/singers/singer0030.npy': 0,
+    './configs/singers/singer0047.npy': 0.5,
+    './configs/singers/singer0051.npy': 0.5,
+}
+```
+
+The generated timbre file is: eva.spk.npy
+
+💗 Both the Flow and the Decoder take a timbre input, so you can even feed the two modules different timbre parameters to sculpt a more distinctive timbre.
+
 ## Datasets

 | Name | URL |

svc_eva.py

+20
@@ -0,0 +1,20 @@
+import os
+import numpy as np
+
+# average -> ave -> eva :haha
+
+eva_conf = {
+    './configs/singers/singer0022.npy': 0,
+    './configs/singers/singer0030.npy': 0,
+    './configs/singers/singer0047.npy': 0.5,
+    './configs/singers/singer0051.npy': 0.5,
+}
+
+if __name__ == "__main__":
+
+    eva = np.zeros(256)
+    for k, v in eva_conf.items():
+        assert os.path.isfile(k), k
+        spk = np.load(k)
+        eva = eva + spk * v
+    np.save("eva.spk.npy", eva, allow_pickle=False)
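
Review note: the loop in svc_eva.py is a plain weighted sum of 256-dimensional speaker vectors. The sketch below restates that arithmetic as a standalone function so it can be tried without the .npy files. The function name `mix_speakers` and the random stand-in vectors are assumptions for illustration and are not part of the commit.

```python
import numpy as np


def mix_speakers(embeddings, weights):
    """Weighted sum of speaker embeddings (hypothetical helper, not in the repo)."""
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per embedding")
    mixed = np.zeros_like(np.asarray(embeddings[0], dtype=np.float64))
    for emb, w in zip(embeddings, weights):
        emb = np.asarray(emb, dtype=np.float64)
        if emb.shape != mixed.shape:
            raise ValueError(f"shape mismatch: {emb.shape} vs {mixed.shape}")
        # Same accumulation step as `eva = eva + spk * v` in svc_eva.py
        mixed = mixed + emb * w
    return mixed


# Stand-in 256-dimensional vectors; real ones would come from np.load(...)
rng = np.random.default_rng(0)
spk_a = rng.standard_normal(256)
spk_b = rng.standard_normal(256)

# Two speakers at 0.5 each, mirroring the non-zero entries of eva_conf
eva = mix_speakers([spk_a, spk_b], [0.5, 0.5])
```

With the committed `eva_conf`, the two entries weighted 0 contribute nothing, so the result is the midpoint of singer0047 and singer0051. The script does not check that the weights sum to 1; keeping them at that sum is probably a sensible default so the mixed vector stays on a scale comparable to the inputs, though the commit does not say so.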

svc_inference.py

+2 -1
@@ -56,8 +56,9 @@ def main(args):
     ppg = torch.FloatTensor(ppg)

     pit = load_csv_pitch(args.pit)
+    print("pitch shift: ", args.shift)
     if (args.shift == 0):
-        print("don't use pitch shift")
+        pass
     else:
         pit = np.array(pit)
         source = pit[pit > 0]
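
Review note: the hunk ends before the rest of the `else` branch, so the actual shifting code is not visible here. For readers unfamiliar with the idea, the sketch below shows one common way to shift an F0 contour, assuming the shift is expressed in semitones and that a value of 0 marks an unvoiced frame (the `pit[pit > 0]` selection suggests this, but the diff does not confirm it). The function `shift_pitch` is hypothetical and is not the code in svc_inference.py.

```python
import numpy as np


def shift_pitch(pit, semitones):
    """Scale an F0 contour in Hz by a number of semitones (illustrative only)."""
    pit = np.asarray(pit, dtype=np.float64)
    if semitones == 0:
        # Nothing to do, matching the `pass` branch in the diff
        return pit
    # One semitone is a frequency ratio of 2 ** (1 / 12)
    ratio = 2.0 ** (semitones / 12.0)
    # Multiplication leaves unvoiced frames (0 Hz) at 0
    return pit * ratio


f0 = [0.0, 220.0, 440.0]
shifted = shift_pitch(f0, 12)  # up one octave under the semitone assumption
```

Under this assumption, a shift of 12 doubles every voiced frequency and leaves unvoiced frames untouched.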
