PitchVC: Pitch Conditioned Any-to-Many Voice Conversion

🎧 Audio Samples. 🤗 Play Online.

Description

A simple voice conversion (VC) framework for pitch-conditioned, any-to-many conversion.

Figure panels: (a) Training; (b) Inference; (c) Training (w/ optional properties); (d) Inference (w/ optional properties).

For a detailed description, see Description.md.

Pre-requisites

1. Clone this repo: git clone https://github.com/OlaWod/PitchVC.git
2. Change into the repo directory: cd PitchVC
3. Install the Python requirements: pip install -r requirements.txt
4. Download files on demand (e.g. the pretrained checkpoint) (download link)

Inference Example

Files on demand:

Pretrained checkpoint (e.g. exp/default/g_00700000)
Source wavs (e.g. src1.wav) and target wavs and embeddings (e.g. p244_008.wav and p244_008.npy) listed in convert.txt
Utils/JDC/bst.t7
(Optional) speakerlab/pretrained/speech_eres2net_sv_en_voxceleb_16k/pretrained_eres2net.ckpt and speakerlab/pretrained/speech_eres2net_sv_zh-cn_16k-common/pretrained_eres2net_aug.ckpt
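Before running a conversion, it can help to confirm that the files listed above are in place. The following is a minimal sketch, not part of the repo: the helper name `missing_files` is hypothetical, and the paths are copied from the list above (the two speakerlab checkpoints are treated as optional).

```python
from pathlib import Path

# Paths copied from the "Files on demand" list; adjust if your layout differs.
REQUIRED = [
    "exp/default/g_00700000",
    "Utils/JDC/bst.t7",
]
OPTIONAL = [
    "speakerlab/pretrained/speech_eres2net_sv_en_voxceleb_16k/pretrained_eres2net.ckpt",
    "speakerlab/pretrained/speech_eres2net_sv_zh-cn_16k-common/pretrained_eres2net_aug.ckpt",
]


def missing_files(paths, root="."):
    """Return the entries of `paths` that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in paths if not (root / p).exists()]


if __name__ == "__main__":
    for label, paths in (("required", REQUIRED), ("optional", OPTIONAL)):
        for p in missing_files(paths):
            print(f"missing {label} file: {p}")
```

Run it from the repo root; it prints nothing when every file is present.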

# single process
CUDA_VISIBLE_DEVICES=0 python convert_sp.py --hpfile config_v1_16k.json --ptfile exp/default/g_00700000 --txtpath convert.txt --outdir outputs/test

# single process; finetune input f0 automatically
CUDA_VISIBLE_DEVICES=0 python convert_sp.py --hpfile config_v1_16k.json --ptfile exp/default/g_00700000 --txtpath convert.txt --outdir outputs/test --search

# multi process
CUDA_VISIBLE_DEVICES=0 python convert_mp.py --hpfile config_v1_16k.json --ptfile exp/default/g_00700000 --txtpath convert.txt --outdir outputs/test --n_processes 6

# multi process; finetune input f0 automatically
CUDA_VISIBLE_DEVICES=0 python convert_mp.py --hpfile config_v1_16k.json --ptfile exp/default/g_00700000 --txtpath convert.txt --outdir outputs/test --n_processes 6 --search
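The four invocations above differ only in the script name and in the `--n_processes` and `--search` flags. As a rough sketch, a small helper can assemble the argument list when driving conversion from Python; `build_convert_cmd` is a hypothetical name, and only the scripts and flags shown above are used.

```python
import shlex


def build_convert_cmd(txtpath, outdir, ptfile="exp/default/g_00700000",
                      hpfile="config_v1_16k.json", n_processes=None, search=False):
    """Assemble the argument list for convert_sp.py or convert_mp.py.

    n_processes=None selects the single-process script (convert_sp.py).
    """
    script = "convert_sp.py" if n_processes is None else "convert_mp.py"
    cmd = ["python", script, "--hpfile", hpfile, "--ptfile", ptfile,
           "--txtpath", txtpath, "--outdir", outdir]
    if n_processes is not None:
        cmd += ["--n_processes", str(n_processes)]
    if search:
        cmd.append("--search")
    return cmd


cmd = build_convert_cmd("convert.txt", "outputs/test", n_processes=6, search=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, with `CUDA_VISIBLE_DEVICES` set in the environment as in the shell examples.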

convert.txt:

{title}|{source_wav_path}|{target_spk_reference_wav_path}|{target_spk_id}|{target_spk_reference_embedding_path}
e.g.
title1|src1.wav|dataset/audio/p244/p244_008.wav|p244|dataset/spk/p244/p244_008.npy
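To illustrate the line format, here is a minimal parsing sketch. The `ConvertEntry` type and both function names are hypothetical; the field names are taken from the template line above, and the sketch assumes that no field contains a `|` character.

```python
from typing import NamedTuple


class ConvertEntry(NamedTuple):
    # Field names mirror the convert.txt template.
    title: str
    source_wav_path: str
    target_spk_reference_wav_path: str
    target_spk_id: str
    target_spk_reference_embedding_path: str


def parse_convert_line(line):
    """Split one pipe-delimited line into its five named fields."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != len(ConvertEntry._fields):
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    return ConvertEntry(*fields)


def read_convert_txt(path):
    """Parse every non-empty line of a convert.txt-style file."""
    with open(path, encoding="utf-8") as f:
        return [parse_convert_line(line) for line in f if line.strip()]


entry = parse_convert_line(
    "title1|src1.wav|dataset/audio/p244/p244_008.wav|p244|dataset/spk/p244/p244_008.npy"
)
print(entry.target_spk_id)  # p244
```

A check like this catches lines with a missing or extra field before a long conversion run starts.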

Training Example

Files on demand:

VCTK dataset
speaker_encoder/ckpt/pretrained_bak_5805000.pt
Utils/JDC/bst.t7

Preprocess:

export PYTHONPATH=.

python preprocess/1_downsample.py --in_dir </path/to/VCTK/wavs> # dataset/vctk-16k/{spk}/{xx}.wav
python preprocess/2_get_flist.py    # filelists/{situation}.txt
python preprocess/3_get_spk2id.py   # filelists/spk2id.json
python preprocess/4_get_spk_emb.py  # dataset/spk/{spk}/{xx}.npy
python preprocess/5_get_spk_emb_best.py # filelists/spk_stats.json
python preprocess/6_get_f0.py       # dataset/f0/{spk}/{xx}.pt
python preprocess/7_get_f0_stats.py # filelists/f0_stats.json

cd dataset
ln -s vctk-16k audio
cd ..
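The preprocessing steps above run in a fixed order, each producing the output named in its comment. The sketch below records that order and mirrors the `ln -s` step in Python. The names `STEPS`, `preprocess_commands`, and `link_audio_dir` are hypothetical, and the sketch assumes only the first script takes an argument, as in the commands above.

```python
import sys
from pathlib import Path

# (script, produced output) pairs copied from the commands above, in order.
STEPS = [
    ("preprocess/1_downsample.py", "dataset/vctk-16k/{spk}/{xx}.wav"),
    ("preprocess/2_get_flist.py", "filelists/{situation}.txt"),
    ("preprocess/3_get_spk2id.py", "filelists/spk2id.json"),
    ("preprocess/4_get_spk_emb.py", "dataset/spk/{spk}/{xx}.npy"),
    ("preprocess/5_get_spk_emb_best.py", "filelists/spk_stats.json"),
    ("preprocess/6_get_f0.py", "dataset/f0/{spk}/{xx}.pt"),
    ("preprocess/7_get_f0_stats.py", "filelists/f0_stats.json"),
]


def preprocess_commands(vctk_wav_dir):
    """Return one argument list per step; only step 1 receives --in_dir."""
    cmds = []
    for script, _output in STEPS:
        cmd = [sys.executable, script]
        if script.endswith("1_downsample.py"):
            cmd += ["--in_dir", str(vctk_wav_dir)]
        cmds.append(cmd)
    return cmds


def link_audio_dir(dataset_dir="dataset"):
    """Python equivalent of `cd dataset && ln -s vctk-16k audio`."""
    link = Path(dataset_dir) / "audio"
    if not link.is_symlink() and not link.exists():
        # The target is relative to the link's own directory, as with `ln -s`.
        link.symlink_to("vctk-16k", target_is_directory=True)
    return link
```

Each command can be run from the repo root with `subprocess.run(cmd, check=True, env={**os.environ, "PYTHONPATH": "."})`, which corresponds to the `export PYTHONPATH=.` line above.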

Training:

CUDA_VISIBLE_DEVICES=0 python train.py --config config_v1_16k.json --checkpoint_path exp/test
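After training, a checkpoint has to be passed to `--ptfile` for conversion. The sketch below picks the most recent one, assuming that generator checkpoints in the `--checkpoint_path` directory are named `g_<step>` like the pretrained `exp/default/g_00700000`; that naming is an assumption based on the example path, and `latest_generator_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed naming: "g_" followed by the zero-padded training step.
_CKPT_RE = re.compile(r"^g_(\d+)$")


def latest_generator_checkpoint(exp_dir):
    """Return the g_<step> path with the highest step in exp_dir, or None."""
    best_step, best_path = -1, None
    for path in Path(exp_dir).iterdir():
        match = _CKPT_RE.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

For example, `latest_generator_checkpoint("exp/test")` returns the newest checkpoint written by the training command above, or `None` if the directory contains none.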

Test Example

python test/1_select_tgt.py # test/TEST_TGT/{xx}.wav
python test/2_select_src.py # test/TEST_SRC_{CORPUS}/{xx}.wav
python test/3_get_txts.py   # test/txts/{scenario}.txt

CUDA_VISIBLE_DEVICES=0 python convert_mp.py --hpfile config_v1_16k.json --ptfile exp/default/g_00700000 --txtpath test/txts/<scenario>.txt --outdir outputs/<scenario> --n_processes 6 --search

cd metrics/<metrics>
bash run.sh
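When several scenario files exist under test/txts, the conversion command has to be repeated once per scenario. A possible way to enumerate them is sketched below; `scenario_jobs` is a hypothetical helper that only derives the `--txtpath` and `--outdir` values following the `test/txts/<scenario>.txt` and `outputs/<scenario>` pattern shown above.

```python
from pathlib import Path


def scenario_jobs(txt_dir="test/txts", out_root="outputs"):
    """Return (scenario, txtpath, outdir) for every <scenario>.txt in txt_dir."""
    jobs = []
    for txt in sorted(Path(txt_dir).glob("*.txt")):
        scenario = txt.stem  # file name without the .txt suffix
        jobs.append((scenario, txt.as_posix(), (Path(out_root) / scenario).as_posix()))
    return jobs


if __name__ == "__main__":
    for scenario, txtpath, outdir in scenario_jobs():
        print(f"{scenario}: --txtpath {txtpath} --outdir {outdir}")
```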
