
decoding error in preprocessing synthesizer #439

Closed
amintavakol opened this issue Jul 23, 2020 · 10 comments

amintavakol commented Jul 23, 2020

I get the following error while running synthesizer_preprocess_audio.py.

Arguments:
    datasets_root:   /home/amin/voice_cloning/libri_100
    out_dir:         /home/amin/voice_cloning/libri_100/SV2TTS/synthesizer
    n_processes:     None
    skip_existing:   True
    hparams:         

Using data from:
    /home/amin/voice_cloning/libri_100/LibriSpeech/train-clean-100
LibriSpeech:   0%|                                                                                                                                       | 0/502 [00:00<?, ?speakers/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/amin/voice_cloning/Real-Time-Voice-Cloning-master/synthesizer/preprocess.py", line 62, in preprocess_speaker
    alignments = [line.rstrip().split(" ") for line in alignments_file]
  File "/home/amin/voice_cloning/Real-Time-Voice-Cloning-master/synthesizer/preprocess.py", line 62, in <listcomp>
    alignments = [line.rstrip().split(" ") for line in alignments_file]
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 37: invalid start byte
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "synthesizer_preprocess_audio.py", line 52, in <module>
    preprocess_librispeech(**vars(args))    
  File "/home/amin/voice_cloning/Real-Time-Voice-Cloning-master/synthesizer/preprocess.py", line 36, in preprocess_librispeech
    for speaker_metadata in tqdm(job, "LibriSpeech", len(speaker_dirs), unit="speakers"):
  File "/home/amin/.local/lib/python3.6/site-packages/tqdm/std.py", line 1130, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 37: invalid start byte

Can anyone help? It would save me a lot of time.
Thanks.


ghost commented Jul 23, 2020

Can you try it with the 392_single_threaded_preprocess branch of my fork and post the traceback? It will help to know which alignment file it is breaking on.


ghost commented Jul 23, 2020

Try making this modification. Change:

with alignments_fpath.open("r") as alignments_file:

to:

with alignments_fpath.open("r", encoding="ascii") as alignments_file:
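To see why the byte 0xa2 trips the default decoder, here is a small sketch. The alignment line below is hypothetical (the actual file contents are unknown); it only reproduces the failing byte from the traceback and shows how different encoding/error policies behave:

```python
# Hypothetical alignment line containing a stray 0xa2 byte, mirroring the
# UnicodeDecodeError above. The real file contents are unknown.
raw = b'84-121123-0000 ",THE,FISH\xa2" "0.1,0.5,0.9"\n'

# Strict UTF-8 decoding fails, like the default open() on most Linux systems:
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print(e)

# latin-1 maps every byte to a codepoint, so it never raises, while
# errors="replace" keeps UTF-8 but substitutes U+FFFD for invalid bytes:
print(raw.decode("latin-1"))
print(raw.decode("utf-8", errors="replace"))
```

Note that 0xa2 is also invalid ASCII (it is above 127), so encoding="ascii" would reject it as well; latin-1 or errors="replace" are the more permissive options if the goal is to read past the bad byte.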


ghost commented Jul 24, 2020

@amintavakol Did you resolve the issue?


amintavakol commented Jul 24, 2020

Yes, that fixes the issue.
Also, changing the try/except block in synthesizer/preprocess.py to this:

try:
    alignments_fpath = next(book_dir.glob("*.alignment.txt"))
    with alignments_fpath.open("r") as alignments_file:
        alignments = [line.rstrip().split(" ") for line in alignments_file]
except:
    # A few alignment files will be missing
    continue

keeps the preprocessing running for the non-problematic files.
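A bare except also hides which file is broken. A possible variant (a sketch, not the repo's code; the helper name and return convention are made up) catches the two expected failures separately and logs the offending path:

```python
from pathlib import Path

def read_alignments(book_dir: Path):
    """Hypothetical helper: parsed alignments, or None for missing/undecodable files."""
    try:
        alignments_fpath = next(book_dir.glob("*.alignment.txt"))
    except StopIteration:
        # A few alignment files will be missing
        return None
    try:
        with alignments_fpath.open("r", encoding="utf-8") as alignments_file:
            return [line.rstrip().split(" ") for line in alignments_file]
    except UnicodeDecodeError as e:
        # Skip the problematic file, but say which one it was
        print(f"Skipping {alignments_fpath}: {e}")
        return None
```

This keeps preprocessing running like the bare except, while still reporting the files that need attention.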

@shoegazerstella

I am having the same issue, but appearing in synthesizer/synthesizer_dataset.py line 13, which I tried to solve like this:

metadata = []
with metadata_fpath.open("r", encoding="ascii") as metadata_file:
    #metadata = [line.split("|") for line in metadata_file]
    try:
        for line in metadata_file:
            metadata.append(line.split("|"))
    except Exception as e:
        ex = e

But now I have this many samples in the training dataset and I am not sure it is correct:
Found 24353 samples
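The low count is expected with that workaround: a decode error raised mid-iteration stops the loop, so every line after the bad byte is silently dropped. A simplified per-line model (real file objects decode in larger chunks, but the effect is the same; the lines are made up):

```python
# Three hypothetical metadata lines; the second contains a non-ASCII byte (0xc3).
raw_lines = [
    b"audio-1.npy|embed-1.npy|hello\n",
    b"audio-2.npy|embed-2.npy|caf\xc3\xa9\n",
    b"audio-3.npy|embed-3.npy|world\n",
]

metadata = []
try:
    for raw in raw_lines:
        metadata.append(raw.decode("ascii").rstrip().split("|"))
except UnicodeDecodeError as e:
    ex = e  # swallowed, as in the workaround above

print(len(metadata))  # 1 -- lines after the bad byte are lost too
```

So the try/except does not just skip bad lines; it truncates the whole dataset at the first failure.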


ghost commented Aug 13, 2020

@shoegazerstella My processed train-clean-100 and train-clean-360 for LibriTTS have 111,521 samples.

Can you print the exception?

ghost reopened this Aug 13, 2020
@shoegazerstella

Starting the training of Tacotron from scratch

Using inputs from:
	/opt/ml/input/data/train/train.txt
	/opt/ml/input/data/train/mels
	/opt/ml/input/data/train/embeds
Traceback (most recent call last):
  File "synthesizer_train.py", line 33, in <module>
    train(**vars(args))
  File "/root/voicecloning/synthesizer/train.py", line 112, in train
    dataset = SynthesizerDataset(metadata_fpath, mel_dir, embed_dir)
  File "/root/voicecloning/synthesizer/synthesizer_dataset.py", line 14, in __init__
    metadata = [line.split("|") for line in metadata_file]
  File "/root/voicecloning/synthesizer/synthesizer_dataset.py", line 14, in <listcomp>
    metadata = [line.split("|") for line in metadata_file]
  File "/opt/conda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1481: ordinal not in range(128)


ghost commented Aug 13, 2020

Does it help if you change line 13 synthesizer_dataset.py to:

with metadata_fpath.open("r", encoding="utf-8") as metadata_file:

I think your system locale causes files to be saved as UTF-8 by default, so certain characters are out of range when loading them as ASCII.


shoegazerstella commented Aug 13, 2020

You are right, now I see:
Found 76052 samples

I should also mention this log from the preprocessing:

The dataset consists of 76052 utterances, 21949820 mel frames, 6025708370 audio timesteps (75.91 hours).
Max input length (text chars): 158
Max mel frames length: 500
Max audio timesteps length: 137374

So this seems to be in line with it!


ghost commented Aug 13, 2020

@shoegazerstella When restricting the max mel frames length to 500, I have 76,153 samples (the other number, 111,521, uses the default of 900). So everything seems to be working well now!
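The differing counts are just the length cap at work. A toy sketch with made-up mel lengths (the exact cutoff semantics here are an assumption, not the repo's code):

```python
# Hypothetical per-utterance mel lengths, in frames.
mel_frame_lengths = [320, 480, 510, 760, 905]

def samples_kept(max_mel_frames):
    # Assumption: utterances longer than the cap are dropped during preprocessing.
    return sum(1 for n in mel_frame_lengths if n <= max_mel_frames)

print(samples_kept(500))  # 2
print(samples_kept(900))  # 4
```

A tighter cap keeps fewer utterances, which is why the 500-frame run reports a smaller sample count than the 900-frame default.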

@ghost ghost closed this as completed Aug 13, 2020