I want to make LRC files #8
-
Hi, I wanted to use the same 2 packages you are using to make LRC files. But apparently your 2 packages are in Python and your player is in Java, so is Java the only place where you make LRC files? I was wondering if you could provide a snippet showing how to make an LRC file using your Python libraries. Currently it takes me 20-30 seconds to produce an LRC with other packages, using pre-compiled EXE versions of Whisper, but yours looks like it makes much better LRC files.
-
I'm not sure I understand your question. Can you be more precise? As for the main principle for producing good SRT files:
WhisperHallu is an experiment built around Whisper, and is therefore written in Python. The WhisperTimeSync aligner is written in Java and is not currently available in Python.
-
I see. Thanks for the info. I'm already creating LRC files with an all-EXE solution (demucs.exe + whisper-faster.exe) at about 30 seconds per song, but the LRC contents are AI-only and thus contain a lot of errors. Sometimes I also have a .TXT file (lyrics that were automatically downloaded earlier), so I wanted to align my generated LRC with any pre-existing TXT, basically fixing the AI-generated errors in the LRC by referring to the TXT lyrics. I was hoping to stay in all-EXE territory because that gives about a 15X speedup, and the job I plan to run will take about 4 weeks. Without an EXE, it would be about 15X slower, which means over a year to run.
-
Also, do you know how long it takes per typical song to make an LRC with your product? (With all-EXE it's about 30 seconds, or about 60 seconds if you split the vocal track apart first with demucs.)
-
OK, but on my side I don't know your exact conditions well enough to give you a pertinent answer. Since I haven't tried your processing myself, it's up to you to test and compare.
To be more precise: WhisperTimeSync aligns all the words between your SRT and your TXT, i.e. it tries to find the words of your SRT that match the words of your TXT. Knowing all the most likely word pairs, it then puts the timestamps of the SRT at the right places in the TXT, keeping the text of the TXT unchanged. See the SYNC part of the Colab output.
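To illustrate the principle only (this is not WhisperTimeSync's code, which is in Java, and not the Colab output), here is a minimal Python sketch. It assumes you already have word-level timestamps from your transcript as `(start_seconds, word)` pairs, and it swaps in the standard library's `difflib.SequenceMatcher` as a simple stand-in for the word-pairing step. The function names `normalize`, `lrc_timestamp` and `align_to_lrc` are hypothetical, made up for this example.

```python
import re
from difflib import SequenceMatcher


def normalize(word):
    """Lowercase and strip punctuation so 'Hello,' matches 'hello'."""
    return re.sub(r"[^\w']", "", word.lower())


def lrc_timestamp(seconds):
    """Format seconds as an LRC time tag of the form [mm:ss.xx]."""
    centis = round(seconds * 100)
    minutes, centis = divmod(centis, 6000)
    secs, centis = divmod(centis, 100)
    return f"[{minutes:02d}:{secs:02d}.{centis:02d}]"


def align_to_lrc(timed_words, lyrics_lines):
    """Transfer timestamps from transcribed words onto reference lyrics.

    timed_words: list of (start_seconds, word) from the AI transcript (assumed input).
    lyrics_lines: list of strings from the reference TXT.
    Returns LRC lines; the lyric text itself is left unchanged.
    """
    asr = [normalize(w) for _, w in timed_words]

    # Flatten the lyrics into words, remembering which line each word came from.
    ref, ref_line = [], []
    for i, line in enumerate(lyrics_lines):
        for w in line.split():
            ref.append(normalize(w))
            ref_line.append(i)

    # For each lyric line, keep the time of its first matched word.
    # If the first word of a line is unmatched, the tag will be slightly late.
    line_time = {}
    matcher = SequenceMatcher(None, asr, ref, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            line_time.setdefault(ref_line[block.b + k], timed_words[block.a + k][0])

    out = []
    last = 0.0
    for i, text in enumerate(lyrics_lines):
        # Lines with no matched word fall back to the previous line's time.
        last = line_time.get(i, last)
        out.append(f"{lrc_timestamp(last)}{text}")
    return out


if __name__ == "__main__":
    timed = [(0.5, "hello"), (0.9, "word"), (2.0, "how"), (2.3, "are"), (2.6, "you")]
    lyrics = ["Hello, world!", "How are you?"]
    print("\n".join(align_to_lrc(timed, lyrics)))
```

In the small demo at the bottom, the transcript's misheard "word" is simply left unpaired, and the TXT's "Hello, world!" is kept verbatim with the timestamp of "hello". A real aligner would need something more tolerant than exact word equality (for example fuzzy matching), so treat this as a starting point to test against your own files.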