-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add example script for rendering jinja2 templates #7246
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
I'm thinking it would be better to just extract bos/eos from metadata instead of allowing user to set them from command line. I've been working on some improvements to If it's OK by you I'll wait until this is merged and then submit a PR with those improvements and remove |
@CISC No need to ask me for permission. If you think it's good, try it out. Would love to know about the results. |
Just checking, in case you had a particular use for swapping out BOS/EOS. A This script can be really useful for generating prompts for |
This is ready for a review/merge. |
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems useful, and it mostly looks good to me.
…pp into gguf-model-template
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:188779e1298a1c8b8253c7d3ad729cb0a9891e5cef5e5d07ce4592c54869e227"}, | ||
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:bed9cf85b296fa2b76fc2547b9cbb691a523864cebaee86304c43a7b4cb1b452"}, | ||
{file = "sentencepiece-0.2.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:d7b67e724bead13f18db6e1d10b6bbdc454af574d70efbb36f27d90387be1ca3"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I haven't verified the hashes; in my workflow I'm ignoring poetry's lock file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove poetry.lock
and add a ignore rule to .gitignore
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove poetry.lock and add a ignore rule to .gitignore?
I don't have to. What I'm saying is that pypi/poetry are a good place for a supply chain attack because it's infeasible to review these diffs without respective automation. This doesn't concern me personally, because I'm only relying on the toml file, not on the lock file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That's always been the case with packaging.
I ran the following:
poetry remove sentencepiece
poetry add 'sentencepiece@^0.2.0'
I personally prefer requirements.txt
, but poetry
does make distribution less painful.
Regardless, seems outside scope. Still good to be aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That's always been the case with packaging. I ran the following:
poetry remove sentencepiece poetry add 'sentencepiece@^0.2.0'
I personally prefer
requirements.txt
, butpoetry
does make distribution less painful.Regardless, seems outside scope. Still good to be aware of.
Forgive me if this is a stupid question, but why not conda (mamba)? Works great for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conda is orthogonal to this. Pyproject is the format consumed by the majority of python-related tools including and beyond pip (e.g. Nixpkgs' buildPythonPackage
). The lock file is just there because of the pyproject back-end/build-system we'd (arbitrarily) chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I reviewed the pyproject part)
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Sometimes I need to inspect the models chat templates and I created a script awhile back to do this. This is a updated and modified version of the same script.
It's useful for debugging and comprehending how the model creator might have intended the chat template to be rendered. I like being able to visualize these things and this script helps me do that.
Example usage:
This isn't a high priority, I just thought it might be useful.