Replies: 3 comments 9 replies
-
Some evidence points to the limit being 224 tokens, though I don't think this has been discussed here before. Here is some more, possibly conflicting, information from the Whisper prompting guide: https://cookbook.openai.com/examples/whisper_prompting_guide
That said, this space is for the open-source Whisper repo, and it appears you are using the online API, so I would consider asking this question over at the community developer forum for the online version of Whisper.
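For what it's worth, you can sanity-check the figure against the open-source code itself. A minimal sketch, assuming the decoder still truncates the prompt to roughly half of the text context (which is where the 224 number comes from); `whisper.load_model` and `whisper.tokenizer.get_tokenizer` are from this repo, and the example prompt string is just a placeholder:

```python
import whisper
from whisper.tokenizer import get_tokenizer

# Any model size works; the text-context length is the same across sizes.
model = whisper.load_model("base")

# The prompt occupies the first half of the text context, so the usable
# budget is roughly n_text_ctx // 2 = 224 positions (one of which goes to
# the <|startofprev|> token that introduces the prompt).
print("text context:", model.dims.n_text_ctx)
print("approx. prompt budget:", model.dims.n_text_ctx // 2)

# Count how many tokens a candidate prompt would actually use.
tokenizer = get_tokenizer(model.is_multilingual)
prompt = "Whatagraph, Kubernetes, PostgreSQL and other tricky proper nouns."
print("prompt tokens:", len(tokenizer.encode(" " + prompt.strip())))
```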
-
224 isn't enough. The limit should at least be large enough to hold the lyrics of a wordy song. Some of us are transcribing songs into TIMED lyrics, and Whisper's accuracy is poor enough that it needs to be primed with the official lyrics of the song. Songs can have well over 224 words in them, especially rap, which I don't really listen to but which should be considered. These files aren't going to be "chunked up".
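For anyone doing the same thing, this is roughly the workflow I mean; a sketch only, with placeholder file names (`lyrics.txt`, `song.mp3`), and note that anything past the ~224-token budget is silently dropped from the front of the prompt, so only the tail of a long lyric sheet actually conditions the first window:

```python
import whisper

# Placeholder file; in practice this would be the official lyrics.
with open("lyrics.txt", encoding="utf-8") as f:
    lyrics = f.read()

model = whisper.load_model("medium")

# initial_prompt conditions the first window; with the default
# condition_on_previous_text, later windows are conditioned on the
# preceding transcription instead.
result = model.transcribe(
    "song.mp3",
    initial_prompt=lyrics,   # truncated to roughly the last 223 tokens
    word_timestamps=True,    # per-word timing for timed lyrics
)

for segment in result["segments"]:
    print(f"[{segment['start']:7.2f} -> {segment['end']:7.2f}] {segment['text']}")
```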
-
That sure is a lot of steps compared to just accepting more tokens 😉 But thanks for pointing out WhisperTimeSync. I couldn't remember its name, couldn't find it yesterday when I was looking for that concept, and wasn't sure whether someone had actually implemented it or I had only dreamed that they had (version control is hard on your own mind! lol). Now I know it's real, know its name, and will check it out 🎉
-
Hi,
Whisper's prompt is an awesome feature that allows for accurate handling of tricky words, such as proper nouns denoting company names (e.g. Whatagraph). However, it is a bit unclear from the documentation what the maximum length of the prompt is. As you can see in the screenshot below, the documentation first states that "Whisper only considers the first 244 tokens of the prompt."; however, the next paragraph mentions that "this technique is limited to only 244 characters".
Has anyone explored whether the prompt is limited to the first 244 characters (letters) or tokens (units determined by the tokenizer)?
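For reference, this is the kind of check I had in mind, using the tokenizer from this repo (assuming the hosted API tokenizes prompts the same way; the prompt string is just an example):

```python
from whisper.tokenizer import get_tokenizer

tokenizer = get_tokenizer(multilingual=True)

prompt = "Whatagraph's quarterly report covers paid search, SEO and social."
print("characters:", len(prompt))                    # letters, spaces, punctuation
print("tokens:", len(tokenizer.encode(prompt)))      # units determined by the tokenizer
```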
Thank you!