GUI Tool to verify and modify TTS datasets #2900
rioharper started this conversation in Show and tell
Hi! I have been developing a new tool for my TTS dataset creator repo, VocalForge. It lets you import multiple audio files and text files into a web GUI that has a scrubbable waveform and a window for the text content of each "utterance" (each line in a CSV, for example). The waveform has highlighted regions that represent the current timings of each utterance, colored by the confidence of an alignment model (green = most confident, red = least confident). You can then edit a region box to change its timestamps, or edit the text content itself, and verify it for later. You can also delete regions that are fundamentally incorrect. Once you are done, you can export the corrected dataset metadata.
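To make the workflow above concrete, here is a minimal Python sketch of the data involved. It is not VocalForge's actual code: the `Utterance` class, the function names, the 0-to-1 confidence scale, the linear red-to-green blend, and the CSV column layout are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Utterance:
    """One highlighted region: a line of text aligned to a span of audio."""
    audio_file: str
    start: float       # region start, in seconds
    end: float         # region end, in seconds
    text: str
    confidence: float  # alignment confidence, assumed to lie in [0, 1]
    verified: bool = False


def confidence_to_color(confidence: float) -> str:
    """Blend linearly from red (0.0) to green (1.0) and return a hex color."""
    c = min(max(confidence, 0.0), 1.0)  # clamp out-of-range scores
    red = round(255 * (1.0 - c))
    green = round(255 * c)
    return f"#{red:02x}{green:02x}00"


def edit_region(utt: Utterance, start=None, end=None, text=None) -> Utterance:
    """Apply a manual correction in place and mark the utterance as verified."""
    new_start = utt.start if start is None else start
    new_end = utt.end if end is None else end
    if not 0 <= new_start < new_end:
        raise ValueError("region must satisfy 0 <= start < end")
    utt.start, utt.end = new_start, new_end
    if text is not None:
        utt.text = text
    utt.verified = True
    return utt


def export_metadata(utterances) -> str:
    """Serialize the (corrected) utterances as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["audio_file", "start", "end", "text", "verified"])
    for u in utterances:
        writer.writerow(
            [u.audio_file, f"{u.start:.3f}", f"{u.end:.3f}", u.text, u.verified]
        )
    return buf.getvalue()


# Usage: correct one region, delete another, then export what remains.
dataset = [
    Utterance("a.wav", 0.4, 2.1, "helo world", confidence=0.35),
    Utterance("a.wav", 2.1, 2.2, "???", confidence=0.05),
]
edit_region(dataset[0], start=0.5, end=2.0, text="hello world")
del dataset[1]  # a region that is fundamentally incorrect
print(export_metadata(dataset))
```

Keeping the `verified` flag in the export means a later pass can tell hand-checked rows apart from rows that still carry only the model's timings.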
A feature I'm thinking of implementing is using the corrected metadata to train the alignment model, so that each correction you make helps the model get a little better!
It's my first foray into JavaScript, so it's very much still in the development stage, but I think it could be a major help for TTS dataset creation and validation!
2023-08-26.20-51-45.mp4