Improve HF tokenization hack to cover multiple special tokens #649
Like some others (#620, #609, #581, #556, #555, #494, #479), I have been running into an assertion failure when my prompt contains special tokens. In particular, the existing workaround doesn't account for multiple special tokens being present in the prompt, which is necessary, for example, if you want to use the chat template provided with models such as HuggingFaceH4/zephyr-7b-beta, whose template itself contains special tokens.

I don't fully understand the nature of the workaround or why it is necessary, but after some experimentation, the changes here made it work for my use case. I am posting them here in case they help anyone experiencing the same issue, and in the hope of starting a discussion that might lead to a more robust solution. Perhaps @slundberg, who I believe implemented the original workaround, can chime in and let me know whether I am on the right track with this change.
Thanks, Shawn