I am trying to train this repo on the LibriTTS dataset, starting with ASR training.
Question 1: Is the data format still "path|transcription|speaker#"? I also see that your config now uses CSVs; do I have to convert my lists to CSV as well?
Question 2: Does this training log look correct? The text that it prints doesn't make any sense.
I changed the code to use a single string as train_data and val_data instead of a list.

config.yml:
//...

meldataset.py:
//...
class MelDataset(torch.utils.data.Dataset):
    def __init__(self, data_list, dict_path=DEFAULT_DICT_PATH, sr=22050):
        _data_list = [l[:-1].split("|") for l in data_list]
        self.min_seq_len = int(0.6 * 22050)
        self.max_seq_len = int(10.0 * 22050)
        self.text_cleaner = TextCleaner(dict_path)
        self.sr = sr
        self.data_list = self._filter(_data_list)

    def _filter(self, data):
        # Keep clips between 0.6 s and 10 s (st_size // 2 approximates the
        # 16-bit sample count) with transcriptions longer than 5 characters.
        data_list = [
            (path, text, speaker)
            for path, text, speaker in data
            if (
                self.max_seq_len
                > (Path(path).stat().st_size // 2)
                > self.min_seq_len
                and len(text) > 5
            )
        ]
        print("data_list length: ", len(data))
        print("filtered data_list length: ", len(data_list))
        return data_list

    def __len__(self):
        return len(self.data_list)
//....
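For context, the size-based filter above can be sketched on its own. This is a minimal illustration (function and constant names are mine, not from the repo), assuming 16-bit mono PCM at 22050 Hz, where `st_size // 2` approximates the sample count and the ~44-byte WAV header is negligible at these thresholds:

```python
# Sketch of the duration filter, assuming 16-bit mono PCM at 22050 Hz.
SR = 22050
MIN_SEQ_LEN = int(0.6 * SR)   # keep clips longer than 0.6 s
MAX_SEQ_LEN = int(10.0 * SR)  # and shorter than 10 s

def keep_clip(file_size_bytes: int, transcription: str) -> bool:
    n_samples = file_size_bytes // 2  # 2 bytes per 16-bit sample
    return MAX_SEQ_LEN > n_samples > MIN_SEQ_LEN and len(transcription) > 5

# e.g. a 3-second clip: 3 * 22050 samples * 2 bytes per sample
print(keep_clip(3 * SR * 2, "some transcription"))  # True
print(keep_clip(1000, "hi"))                        # False: too short on both counts
```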
utils.py:
//...
def get_data_path_list(train_path=None, val_path=None):
    train_list = []
    val_list = []
    if train_path:
        with open(train_path, "r") as f:
            train_list.extend(f.readlines())
    if val_path:
        with open(val_path, "r") as f:
            val_list.extend(f.readlines())
    return train_list, val_list
//...
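Note that `get_data_path_list` returns raw lines with their trailing newlines; the dataset strips the last character and splits on "|". A minimal sketch of parsing one such line (the helper name is mine, not from the repo):

```python
def parse_manifest_line(line: str):
    # Strip the trailing newline, then split into the three "|"-separated fields.
    path, text, speaker = line.rstrip("\n").split("|")
    return path, text, int(speaker)

line = ("LibriTTS/train-clean-100/1088/129236/1088_129236_000019_000008.wav"
        "|The lover sees no resemblance except to summer evenings and diamond "
        "mornings, to rainbows and the song of birds.|6\n")
path, text, speaker = parse_manifest_line(line)
print(speaker)  # 6
```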
Example dataset format ("train_data_test.txt"):
LibriTTS/train-clean-100/1088/129236/1088_129236_000019_000008.wav|The lover sees no resemblance except to summer evenings and diamond mornings, to rainbows and the song of birds.|6
LibriTTS/train-clean-100/1088/129236/1088_129236_000020_000003.wav|It is destroyed for the imagination by any attempt to refer it to organization.|6
LibriTTS/train-clean-100/1098/133695/1098_133695_000012_000001.wav|He thought a great deal about her; she was constantly present to his mind. At a time when his thoughts had been a good deal of a burden to him her sudden arrival, which promised nothing and was an open handed gift of fate, had refreshed and quickened them, given them wings and something to fly for.|9
//...
Training logs:
data_list length: 29493
filtered data_list length: 23397
speaker_samples_weight tensor([0.0027, 0.0027, 0.0027, ..., 0.0019, 0.0019, 0.0180])
/opt/conda/envs/py39/lib/python3.9/site-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
data_list length: 2276
filtered data_list length: 1680
/opt/conda/envs/py39/lib/python3.9/site-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
0%| | 0/365 [00:00<?, ?it/s]/opt/conda/envs/py39/lib/python3.9/site-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
A
"
W
R
"
W
O
[... several thousand similar lines, each printing a single character, omitted ...]
[Loss: 15.0932, LR: 0.00050]: 0%| | 1/365 [00:22<2:13:41, 22.04s/it]
[... more single-character lines omitted ...]
[Loss: 13.3964, LR: 0.00050]: 1%| | 2/365 [00:23<1:00:05, 9.93s/it]
[... more single-character lines omitted ...]
[Loss: 9.3134, LR: 0.00050]: 1%| | 3/365 [00:24<36:07, 5.99s/it]
[... more single-character lines omitted ...]
[Loss: 8.1618, LR: 0.00050]: 1%|▏ | 4/365 [00:26<26:49, 4.46s/it]
//.....
You would have to add the missing characters/phonemes used in your dataset to this file: word_index_dict_new.txt, or use your own file and reference it in the code.
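A quick way to see which symbols need to be added is to diff the characters used in the transcriptions against the symbols already in the dictionary. This is a hedged sketch (the helper is mine, and it assumes you have already loaded the dictionary's symbols into a set; adjust the parsing to the actual format of word_index_dict_new.txt):

```python
# List characters appearing in the training transcriptions that are absent
# from the symbol dictionary; these are the ones producing garbage output.
def missing_symbols(train_lines, dict_symbols):
    used = set()
    for line in train_lines:
        _path, text, _speaker = line.rstrip("\n").split("|")
        used.update(text)  # collect every character of every transcription
    return sorted(used - set(dict_symbols))

train_lines = ["a.wav|Héllo, world!|0\n"]
dict_symbols = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ")
print(missing_symbols(train_lines, dict_symbols))  # ['!', ',', 'é']
```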