Possible wrong implementation of beam search #3802
Comments
The beam search has to be completely reimplemented and moved out of the library into a separate example.
Thanks for your reply. It seems that examples/beam-search is currently still using the old internal beam_search function in llama.cpp. You mentioned that beam search has to be completely reimplemented; is there a repo or link where someone is working on it?
AFAIK nobody is working on it at the moment.
Hey, is anybody working on this yet?
Not that I'm aware of.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Expected Behavior
Beam search should use different contexts for each beam.
Current Behavior
In the beam-search.cpp example, beam search uses the same context for all beams. Given two beams, a-b-c-d-e and a-b-c-f-g, the shared prefix a-b-c is detected so that inference is not repeated on it, and inference is then run separately on d-e and f-g.
The problem is that inference on d-e stores those tokens in the key-value (KV) cache. When the program later runs inference on f-g, it reads from the same KV cache, which may still contain entries from the d-e sequence. Those entries do not belong to the current beam, so the results for f-g may be incorrect.
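To make the concern concrete, here is a minimal toy model of the situation. It is not llama.cpp code: `ToyKvCache`, `Cell`, `visible_shared`, `visible_for`, and `build_example` are hypothetical names invented for this sketch, and the "cache" only stores token labels. It assumes that each cached entry can be tagged with the ids of the beams allowed to read it, which is one possible way to give each beam its own view of the context.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy cache cell (hypothetical): the token it was computed for and the
// ids of the beams that are allowed to attend to it.
struct Cell {
    std::string   token;
    std::set<int> beam_ids;
};

struct ToyKvCache {
    std::vector<Cell> cells;

    // Record a token as belonging to the given beams.
    void store(const std::string & token, std::set<int> beam_ids) {
        cells.push_back(Cell{token, std::move(beam_ids)});
    }

    // Behavior described in the issue: one shared context, so every
    // cached token is visible no matter which beam is being evaluated.
    std::vector<std::string> visible_shared() const {
        std::vector<std::string> out;
        for (const Cell & c : cells) {
            out.push_back(c.token);
        }
        return out;
    }

    // Expected behavior: a beam only sees the cells tagged with its id.
    std::vector<std::string> visible_for(int beam_id) const {
        std::vector<std::string> out;
        for (const Cell & c : cells) {
            if (c.beam_ids.count(beam_id) > 0) {
                out.push_back(c.token);
            }
        }
        return out;
    }
};

// Builds the scenario from the issue: beam 0 is a-b-c-d-e, beam 1 is a-b-c-f-g.
ToyKvCache build_example() {
    ToyKvCache kv;
    for (const std::string & t : std::vector<std::string>{"a", "b", "c"}) {
        kv.store(t, {0, 1}); // shared prefix, computed once for both beams
    }
    for (const std::string & t : std::vector<std::string>{"d", "e"}) {
        kv.store(t, {0});    // belongs to beam 0 only
    }
    for (const std::string & t : std::vector<std::string>{"f", "g"}) {
        kv.store(t, {1});    // belongs to beam 1 only
    }
    return kv;
}
```

Calling `build_example().visible_shared()` returns all seven tokens, including d and e, which is the contamination described above when evaluating f-g. Calling `visible_for(1)` returns only a, b, c, f, g, which is what a per-beam context would provide.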
Steps to Reproduce
mkdir build; cd build; cmake ..; make;
./build/bin/beam-search /path-to-gguf/llama-2-7b.Q4_0.gguf 2 "this is a nice day,"