Hey - I was thinking about doing something with n-grams to speed up speculative decoding and I ran into your repo.
I was wondering if you've explored sourcing n-grams from a large corpus of plain text (say wikitext2) or a large body of a specific LLM's output? Basically the idea is to build a ~100 MB lookup table of high-likelihood n-grams, e.g., a 3-token key that predicts a standard pattern of the next 3-5 tokens. Over time you could refine it so the table is actually worth the speculative-decoding cost (i.e., keep only entries whose drafted continuation is accepted ~20%+ of the time). We know that models tend to have phrases they prefer (https://www.reddit.com/r/ChatGPT/comments/16e9l7a/what_are_the_most_common_words_and_phrases_used/)
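For concreteness, here's a minimal sketch of the kind of table I have in mind (Python; the key/continuation lengths, the count cutoff, and the 20% threshold are illustrative assumptions, not tuned values, and a pre-tokenized corpus is assumed as input):

```python
# Hypothetical sketch: build an n-gram lookup table for speculative drafting.
# `tokens` is assumed to be a list of token ids from a tokenized corpus
# (e.g. wikitext2 or logged LLM output).
from collections import Counter, defaultdict

KEY_LEN = 3          # n-gram prefix used as the lookup key
DRAFT_LEN = 4        # tokens proposed per hit (within the 3-5 range above)
MIN_COUNT = 5        # ignore prefixes too rare to estimate a hit rate
MIN_HIT_RATE = 0.20  # keep only entries expected to pay for the drafting cost

def build_table(tokens):
    continuations = defaultdict(Counter)
    for i in range(len(tokens) - KEY_LEN - DRAFT_LEN + 1):
        key = tuple(tokens[i : i + KEY_LEN])
        cont = tuple(tokens[i + KEY_LEN : i + KEY_LEN + DRAFT_LEN])
        continuations[key][cont] += 1

    table = {}
    for key, counts in continuations.items():
        total = sum(counts.values())
        best, best_count = counts.most_common(1)[0]
        # Keep an entry only if its most common continuation appears often
        # enough that drafting it should beat the speculation overhead.
        if total >= MIN_COUNT and best_count / total >= MIN_HIT_RATE:
            table[key] = best
    return table

def draft(table, context):
    """Return a speculative continuation for the last KEY_LEN tokens, if any."""
    return table.get(tuple(context[-KEY_LEN:]))
```

Stored in a compact binary format rather than Python dicts, a budget on the order of 100 MB should hold a few million such entries; refining the table would then just be re-estimating hit rates against actual model acceptances and pruning.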
I'm thinking of exploring something like this (perhaps reusing your work in llama.cpp), but let me know if you've already looked into it or think it's unlikely to work.