
Cache folders not created when transcribing audio #181

Closed
exengo opened this issue Dec 8, 2024 · 6 comments · Fixed by #190
Labels
bug Something isn't working

Comments

exengo commented Dec 8, 2024

I use UltraSinger with a model to transcribe Swedish songs, specifically like this:

python UltraSinger.py -i 'some-swedish-song' --whisper_align_model 'KBLab/wav2vec2-large-voxrex-swedish'

It fails with a FileNotFoundError.
Traceback (most recent call last):
  File "/home/user/Dev/UltraSinger/src/UltraSinger.py", line 693, in <module>
    main(sys.argv[1:])
  File "/home/user/Dev/UltraSinger/src/UltraSinger.py", line 573, in main
    run()
  File "/home/user/Dev/UltraSinger/src/UltraSinger.py", line 147, in run
    TranscribeAudio(process_data)
  File "/home/user/Dev/UltraSinger/src/UltraSinger.py", line 353, in TranscribeAudio
    transcription_result = transcribe_audio(process_data.process_data_paths.cache_folder_path,
  File "/home/user/Dev/UltraSinger/src/UltraSinger.py", line 483, in transcribe_audio
    with open(transcription_path, "w", encoding=FILE_ENCODING) as file:
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/Dev/UltraSinger/src/output/Kent - Ingen kunde röra oss/cache/whisper_large-v3_cuda_KBLab/wav2vec2-large-voxrex-swedish_KBLab/wav2vec2-large-voxrex-swedish_16_None_None.json'

I figured out that the folders were not created, so the open call failed. Adding this at line 483 fixes the issue:
os.makedirs(os.path.dirname(transcription_path), exist_ok=True)
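A minimal, self-contained sketch of the failure mode (using a throwaway temp directory, not UltraSinger's real paths): open() in write mode never creates missing parent directories, so writing the cache file fails until os.makedirs is called.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical nested cache path, mimicking cache_folder_path + "<config>.json"
    transcription_path = os.path.join(tmp, "cache", "whisper_large-v3", "result.json")

    # Without the parent folders, open() in write mode raises FileNotFoundError.
    try:
        with open(transcription_path, "w", encoding="utf-8") as file:
            file.write("{}")
        failed_without_makedirs = False
    except FileNotFoundError:
        failed_without_makedirs = True

    # The suggested one-line fix: create any missing parents first.
    os.makedirs(os.path.dirname(transcription_path), exist_ok=True)
    with open(transcription_path, "w", encoding="utf-8") as file:
        file.write("{}")

    print(failed_without_makedirs, os.path.exists(transcription_path))  # True True
```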

@Calamdor

Thanks for this, was trying to figure out why every model I tried was not working!


agwosdz commented Dec 14, 2024

Weird, folder creation occurs in CreateProcessAudio, which is called at line 142 (process_data.process_data_paths.processing_audio_path = CreateProcessAudio(process_data)) and creates the cache folder at line 423 (os_helper.create_folder(process_data.process_data_paths.cache_folder_path)).

I wonder what is happening there. transcription_path is just a .json file inside cache_folder_path (transcription_path = os.path.join(cache_folder_path, f"{transcription_config}.json")).

Was there any other error message prior?

@rakuri255 rakuri255 added the bug Something isn't working label Dec 16, 2024
@rakuri255
Owner

Can someone make a PR or give a song link?


agwosdz commented Dec 17, 2024

> Can someone make a PR or give a song link?

https://www.youtube.com/watch?v=17HIRea5C6Y


agwosdz commented Dec 17, 2024

I am pretty sure I know what the problem is. Will create a PR.

The issue is the "/" in the --whisper_align_model option.

The "/" gets interpreted as a path separator, so Python treats part of the cache file name as a nested directory that does not exist.
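A short illustration of the problem (hypothetical paths; posixpath is used so the example behaves the same on any platform): a Hugging Face model id containing "/" becomes an extra directory level in the cache file name, while replacing "/" with "_" keeps the file directly inside the cache folder.

```python
import posixpath  # POSIX path semantics, platform-independent for this demo

cache_folder_path = "output/song/cache"  # hypothetical cache folder
align_model = "KBLab/wav2vec2-large-voxrex-swedish"

# Embedding the raw model id in the file name smuggles a "/" into the path,
# so the .json "file name" suddenly contains a directory that never exists.
raw = posixpath.join(cache_folder_path, f"whisper_{align_model}.json")
print(posixpath.dirname(raw))   # output/song/cache/whisper_KBLab

# Sanitizing the id keeps the cache file directly inside cache_folder_path.
safe = posixpath.join(cache_folder_path, f"whisper_{align_model.replace('/', '_')}.json")
print(posixpath.dirname(safe))  # output/song/cache
```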


agwosdz commented Dec 17, 2024

A workaround, if you are in a hurry, is to edit UltraSinger.py at line 466:

def transcribe_audio(cache_folder_path: str, processing_audio_path: str) -> TranscriptionResult:
    """Transcribe audio with AI"""
    transcription_result = None
    ### whisper_align_model_string = None
    if settings.transcriber == "whisper":
        ### if settings.whisper_align_model is not None:
        ###     whisper_align_model_string = settings.whisper_align_model.replace("/", "_")
        ### transcription_config = f"{settings.transcriber}{settings.whisper_model.value}{settings.pytorch_device}{whisper_align_model_string}{settings.whisper_batch_size}{settings.whisper_compute_type}{settings.language}"
        transcription_path = os.path.join(cache_folder_path, f"{transcription_config}.json")
        cached_transcription_available = check_file_exists(transcription_path)
        if settings.skip_cache_transcription or not cached_transcription_available:
            transcription_result = transcribe_with_whisper(
                processing_audio_path,
                settings.whisper_model,
                settings.pytorch_device,
                settings.whisper_align_model,
                settings.whisper_batch_size,
                settings.whisper_compute_type,
                settings.language,
            )
            with open(transcription_path, "w", encoding=FILE_ENCODING) as file:
                file.write(transcription_result.to_json())
        else:
            print(f"{ULTRASINGER_HEAD} {green_highlighted('cache')} reusing cached transcribed data")
            with open(transcription_path) as file:
                json = file.read()
            transcription_result = TranscriptionResult.from_json(json)
    else:
        raise NotImplementedError
    return transcription_result

The changed lines are marked with ###:

Essentially, we check whether the --whisper_align_model argument was provided, and if it was, we replace any "/" in the model name with "_" when building the cache file name. The setting remains valid for the model, but the OS no longer interprets the "/" as a directory separator (which made it look for a nonexistent folder and raise the file-not-found error).
