
Comparing changes

base repository: PlayVoice/VI-SVS
base: VISinger
head repository: jerryuhoo/VISinger
compare: main
Can’t automatically merge. Don’t worry, you can still create the pull request.

Commits on Oct 23, 2022

  1. add eval plot

    jerryuhoo committed Oct 23, 2022
    7442f98
  2. fix readme

    jerryuhoo committed Oct 23, 2022
    8226d56

Commits on Oct 24, 2022

  1. visinger

    jerryuhoo committed Oct 24, 2022
    9369054
  2. bug fix, align wav

    jerryuhoo committed Oct 24, 2022
    eccdbaf

Commits on Oct 25, 2022

  1. fix dp bug

    jerryuhoo committed Oct 25, 2022
    e5ae61b
  2. fix preprocess bug

    jerryuhoo committed Oct 25, 2022
    48878cf
  3. add pitch f0 input

    jerryuhoo committed Oct 25, 2022
    c87ec83

Commits on Oct 28, 2022

  1. fix data preprocess

    jerryuhoo committed Oct 28, 2022
    546ecdc
  2. add evaluate metrics

    jerryuhoo committed Oct 28, 2022
    e01394d

Commits on Oct 29, 2022

  1. 6ce4134
  2. 7c245bb
  3. fix remove pth bug

    jerryuhoo committed Oct 29, 2022
    9e986b8

Commits on Oct 30, 2022

  1. df01d3d
  2. fix device bug

    jerryuhoo committed Oct 30, 2022
    56d837a

Commits on Oct 31, 2022

  1. add resample preprocess

    jerryuhoo committed Oct 31, 2022
    18ea5d1
  2. change fs to 24k

    jerryuhoo committed Oct 31, 2022
    e64d029
  3. fix fs

    jerryuhoo committed Oct 31, 2022
    bd12482

Commits on Nov 1, 2022

  1. fix input pitch

    jerryuhoo committed Nov 1, 2022
    74c22aa
  2. remove pitch loss limit

    jerryuhoo committed Nov 1, 2022
    6ce1681

Commits on Nov 2, 2022

  1. fix

    jerryuhoo committed Nov 2, 2022
    f77a6de
  2. change beta

    jerryuhoo committed Nov 2, 2022
    f67ed17

Commits on Nov 3, 2022

  1. fix pitch

    jerryuhoo committed Nov 3, 2022
    f7d62e9

Commits on Nov 8, 2022

  1. fix eval

    jerryuhoo committed Nov 8, 2022
    78ba79f
  2. plot f0

    jerryuhoo committed Nov 8, 2022
    3974415

Commits on Nov 10, 2022

  1. fix pitch preprocess

    jerryuhoo committed Nov 10, 2022
    23b96a1

Commits on Nov 11, 2022

  1. support ofuton

    jerryuhoo committed Nov 11, 2022
    d8372a7

Commits on Nov 12, 2022

  1. fix jp preprocess

    jerryuhoo committed Nov 12, 2022
    a5c6940

Commits on Nov 29, 2022

  1. Update README.md

    jerryuhoo committed Nov 29, 2022
    1ab64f3
  2. jp support

    jerryuhoo committed Nov 29, 2022
    88ce519
  3. update

    jerryuhoo committed Nov 29, 2022
    dcfd5e8
  4. ce8652b

Commits on Dec 2, 2022

  1. fix bug

    jerryuhoo committed Dec 2, 2022
    c18ea1c

Commits on Dec 11, 2022

  1. update readme

    jerryuhoo committed Dec 11, 2022
    8e1e6d3

Commits on Feb 15, 2023

  1. fix deprecated stft

    jerryuhoo committed Feb 15, 2023
    dda977a

Commits on Feb 24, 2023

  1. Create _config.yml

    jerryuhoo committed Feb 24, 2023
    9753626
  2. update github page

    jerryuhoo authored Feb 24, 2023
    4b0ea01
  3. Update _config.yml

    jerryuhoo committed Feb 24, 2023
    ad8bc16
Showing with 7,911 additions and 3,946 deletions.
  1. +10 −0 .gitignore
  2. +44 −23 README.md
  3. +3 −0 _config.yml
  4. +46 −12 configs/singing_base.json
  5. +117 −45 data_utils.py
  6. +346 −0 evaluate/evaluate_f0.py
  7. +334 −0 evaluate/evaluate_mcd.py
  8. +358 −0 evaluate/evaluate_semitone.py
  9. +347 −0 evaluate/evaluate_vuv.py
  10. +53 −0 evaluate_score.sh
  11. +150 −0 filelists/singing_test.txt
  12. +3,456 −3,556 filelists/singing_train.txt
  13. +150 −200 filelists/singing_valid.txt
  14. +4 −2 mel_processing.py
  15. +509 −23 models.py
  16. +17 −0 normalize_wav.py
  17. +82 −0 plot_f0.py
  18. +98 −0 prepare/align_wav_spec.py
  19. +421 −0 prepare/data_vits_phn.py
  20. +421 −0 prepare/data_vits_phn_ofuton.py
  21. +3 −0 prepare/dur_to_frame.py
  22. +163 −0 prepare/gen_ofuton_transcript.py
  23. +52 −3 prepare/phone_map.py
  24. +8 −2 prepare/preprocess.py
  25. +39 −0 prepare/preprocess_jp.py
  26. +47 −0 prepare/resample_wav.py
  27. +7 −0 prepare/resample_wav.sh
  28. BIN resource/2005000151.wav
  29. BIN resource/2005000152.wav
  30. BIN resource/2006000186.wav
  31. BIN resource/2006000187.wav
  32. BIN resource/2008000268.wav
  33. BIN resource/sample_20220317.wav
  34. BIN resource/sample_20220318.wav
  35. BIN resource/sample_20220321.wav
  36. BIN resource/sample_20220406.wav
  37. BIN resource/sample_20220420.wav
  38. BIN resource/sample_20220424.wav
  39. BIN resource/vising_loss.png
  40. BIN resource/vising_mel.png
  41. BIN resource/vising_sample.wav
  42. +257 −41 train.py
  43. +1 −0 train.sh
  44. +45 −29 vsinging_infer.py
  45. +150 −10 vsinging_infer.txt
  46. +109 −0 vsinging_infer_jp.py
  47. +64 −0 vsinging_infer_jp.txt
10 changes: 10 additions & 0 deletions .gitignore
@@ -0,0 +1,10 @@
*.pth
*.pyc
filelists/singing_train.txt
filelists/singing_valid.txt
filelists/vits_file.txt
logs
singing_out
*/*_res
*.zip
nohup.out
67 changes: 44 additions & 23 deletions README.md
@@ -1,56 +1,77 @@
# Init
Use VITS and Opencpop to develop singing voice synthesis;
Unlike VISinger, it is just VITS without MAS and DurationPredictor.
Unofficial implementation of VISinger

# This project is based on
# Reference Repos
https://github.com/jaywalnut310/vits

https://github.com/MoonInTheRiver/DiffSinger

https://wenet.org.cn/opencpop/

# Data preprocessing
https://github.com/PlayVoice/VI-SVS

# Data Preprocess
```bash
export PYTHONPATH=.
```

python prepare/data_vits.py
Generate ../VISinger_data/label_vits_phn/XXX._label.npy|XXX._label_dur.npy|XXX_score.npy|XXX_score_dur.npy|XXX_pitch.npy|XXX_slurs.npy

Generates the files ../VISinger_data/label_vits/XXX._label.npy|XXX_score.npy|XXX_pitch.npy|XXX_slurs.npy
```bash
python prepare/data_vits_phn.py
```

Generates the file filelists/vits_file.txt; line format: wave path|label path|score path|pitch path|slurs path;
Generate filelists/vits_file.txt
Format: wave path|label path|label duration path|score path|score duration path|pitch path|slurs path;
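
As a rough illustration of the seven-field format above, one row of filelists/vits_file.txt could be split as follows. This is only a sketch: `FilelistEntry`, `parse_filelist_line`, and the sample paths are hypothetical names chosen for illustration, and the repository's actual loader (in data_utils.py) may read the file differently.

```python
from typing import NamedTuple


class FilelistEntry(NamedTuple):
    """One filelist row; field names are illustrative, not taken from the repo."""
    wave: str
    label: str
    label_dur: str
    score: str
    score_dur: str
    pitch: str
    slurs: str


def parse_filelist_line(line: str) -> FilelistEntry:
    """Split a pipe-separated row into its seven paths."""
    fields = [field.strip() for field in line.strip().split("|")]
    if len(fields) != len(FilelistEntry._fields):
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    return FilelistEntry(*fields)


# Hypothetical example row; real paths depend on your data layout.
example = "a.wav|a_label.npy|a_label_dur.npy|a_score.npy|a_score_dur.npy|a_pitch.npy|a_slurs.npy"
entry = parse_filelist_line(example)
print(entry.wave, entry.pitch)
```

Rejecting rows with the wrong field count is a deliberate choice here: a filelist written in the older five-field format would otherwise be read silently with the paths in the wrong positions.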

```bash
python prepare/preprocess.py
```

# VITS training
# VISinger training

```bash
python train.py -c configs/singing_base.json -m singing_base
```

# Testing and validation
or

1. Generation check on the training set: F0 is extracted from the audio
```bash
./train.sh
```

python vsinging_debug.py
# Inference

2. Inference check: F0 is generated by rules
```bash
./evaluate_score.sh
```

python vsinging_infer.py
![LOSS](/resource/vising_loss.png)
![MEL](/resource/vising_mel.png)

3. Full-song synthesis (**uses the release model**)
# Samples

python vsinging_song.py
<audio id="audio" controls="" preload="none">
<source id="wav" src="/resource/2005000151.wav">
</audio>

4. F0 problems can be addressed by training a separate F0 predictor, or by drawing the pitch (pit) curve in UTAU
<audio id="audio" controls="" preload="none">
<source id="wav" src="/resource/2005000152.wav">
</audio>

<audio id="audio" controls="" preload="none">
<source id="wav" src="/resource/2006000186.wav">
</audio>

![LOSS value](/resource/vising_loss.png)
![MEL spectrogram](/resource/vising_mel.png)
<audio id="audio" controls="" preload="none">
<source id="wav" src="/resource/2006000187.wav">
</audio>

<audio id="audio" controls="" preload="none">
<source id="wav" src="/resource/vising_sample.wav">
<source id="wav" src="/resource/2008000268.wav">
</audio>

# Sample audio

[vits_singing_sample.wav](/resource/vising_sample.wav)

# AI restoration
https://github.com/brentspell/hifi-gan-bwe


3 changes: 3 additions & 0 deletions _config.yml
@@ -0,0 +1,3 @@
remote_theme: pages-themes/cayman@v0.2.0
plugins:
- jekyll-remote-theme # add this line to the plugins list if you already have one
58 changes: 46 additions & 12 deletions configs/singing_base.json
@@ -1,26 +1,30 @@
{
"train": {
"log_interval": 200,
"eval_interval": 10000,
"eval_interval": 2000,
"seed": 1234,
"epochs": 20000,
"learning_rate": 1e-4,
"betas": [0.8, 0.99],
"betas": [
0.8,
0.99
],
"eps": 1e-9,
"batch_size": 16,
"batch_size": 6,
"fp16_run": false,
"lr_decay": 0.999875,
"segment_size": 8192,
"init_lr_ratio": 1,
"warmup_epochs": 0,
"c_mel": 45,
"c_kl": 1.0
"c_kl": 1.0,
"keep_n_models": 20
},
"data": {
"training_files":"filelists/singing_train.txt",
"validation_files":"filelists/singing_valid.txt",
"training_files": "filelists/singing_train.txt",
"validation_files": "filelists/singing_valid.txt",
"max_wav_value": 32768.0,
"sampling_rate": 16000,
"sampling_rate": 24000,
"filter_length": 1024,
"hop_length": 256,
"win_length": 1024,
@@ -38,11 +42,41 @@
"kernel_size": 3,
"p_dropout": 0.1,
"resblock": "1",
"resblock_kernel_sizes": [3,7,11],
"resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
"upsample_rates": [8,8,2,2],
"resblock_kernel_sizes": [
3,
7,
11
],
"resblock_dilation_sizes": [
[
1,
3,
5
],
[
1,
3,
5
],
[
1,
3,
5
]
],
"upsample_rates": [
8,
8,
2,
2
],
"upsample_initial_channel": 384,
"upsample_kernel_sizes": [16,16,4,4],
"upsample_kernel_sizes": [
16,
16,
4,
4
],
"use_spectral_norm": false
}
}
}
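
The values in this config imply a few relationships that are easy to sanity-check. The sketch below uses numbers copied from the new side of the diff (`sampling_rate` 24000, `hop_length` 256, `segment_size` 8192, `upsample_rates` [8, 8, 2, 2]). It assumes, as in upstream VITS-style decoders, that the decoder expands one spectrogram frame into `hop_length` waveform samples, so the product of `upsample_rates` should equal `hop_length`; that assumption is not stated anywhere in this diff.

```python
import math

# Values copied from the new side of configs/singing_base.json in this diff.
sampling_rate = 24000
hop_length = 256
segment_size = 8192
upsample_rates = [8, 8, 2, 2]

# Total decoder upsampling factor (assumed to map one frame to hop_length samples).
total_upsample = math.prod(upsample_rates)

# How many spectrogram frames one training segment covers, and the durations involved.
frames_per_segment = segment_size // hop_length
frame_ms = 1000 * hop_length / sampling_rate
segment_ms = 1000 * segment_size / sampling_rate

print(total_upsample == hop_length)               # True
print(frames_per_segment)                         # 32
print(round(frame_ms, 2), round(segment_ms, 1))   # 10.67 341.3
```

Because `hop_length` stays at 256 while the sampling rate moves from 16000 to 24000, each frame now spans about 10.67 ms instead of 16 ms, so any frame-level durations or pitch tracks computed at the old rate would need to be regenerated.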