Update README to include chat templating #1372
Changes from 7 commits
c48f260
fc01f25
0df6215
d816749
d43feb1
e9ea22d
d69fe3e
960e723
1a49501
6b4b107
6fa1446
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -65,23 +65,44 @@ is to ensure that there is a well-defined interface between their output and | |
user-defined code. **Outlines** provides ways to control the generation of | ||
language models to make their output more predictable. | ||
|
||
The following methods of structured generation are supported: | ||
|
||
- [Multiple choices](#multiple-choices) | ||
- [Type constraints](#type-constraint) | ||
- [Efficient regex-structured generation](#efficient-regex-structured-generation) | ||
- [Efficient JSON generation following a Pydantic model](#efficient-json-generation-following-a-pydantic-model) | ||
- [Using context-free grammars to guide generation](#using-context-free-grammars-to-guide-generation) | ||
- [Open functions](#open-functions) | ||
|
||
### Multiple choices | ||
|
||
You can reduce the completion to a choice between multiple possibilities: | ||
|
||
``` python | ||
import outlines | ||
|
||
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct") | ||
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct" | ||
model = outlines.models.transformers(model_name) | ||
|
||
prompt = """You are a sentiment-labelling assistant. | ||
Is the following review positive or negative? | ||
# You must apply the chat template tokens to the prompt! | ||
# See below for an example. | ||
prompt = """ | ||
<|im_start|>system | ||
You extract information from text. | ||
<|im_end|> | ||
|
||
Review: This restaurant is just awesome! | ||
<|im_start|>user | ||
What food does the following text describe? | ||
|
||
Text: I really really really want pizza. | ||
<|im_end|> | ||
<|im_start|>assistant | ||
""" | ||
|
||
generator = outlines.generate.choice(model, ["Positive", "Negative"]) | ||
generator = outlines.generate.choice(model, ["Pizza", "Pasta", "Salad", "Dessert"]) | ||
answer = generator(prompt) | ||
|
||
# Likely answer: Pizza | ||
``` | ||
|
||
You can also pass these choices through an enum: | ||
|
@@ -107,7 +128,7 @@ generator = outlines.generate.choice(model, Sentiment) | |
answer = generator(prompt) | ||
```` | ||
|
||
### Type constraint | ||
### Type constraints | ||
|
||
You can instruct the model to only return integers or floats: | ||
|
||
|
@@ -140,43 +161,49 @@ import outlines | |
|
||
model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct") | ||
|
||
prompt = "What is the IP address of the Google DNS servers? " | ||
prompt = """ | ||
<|im_start|>system | ||
You are a helpful assistant. | ||
<|im_end|> | ||
|
||
<|im_start|>user | ||
What is an IP address of the Google DNS servers? | ||
<|im_end|> | ||
<|im_start|>assistant | ||
The IP address of a Google DNS server is | ||
|
||
""" | ||
|
||
generator = outlines.generate.text(model) | ||
unstructured = generator(prompt, max_tokens=30) | ||
|
||
generator = outlines.generate.regex( | ||
model, | ||
r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)", | ||
sampler=outlines.samplers.greedy(), | ||
) | ||
structured = generator(prompt, max_tokens=30) | ||
|
||
print(unstructured) | ||
# What is the IP address of the Google DNS servers? | ||
# 8.8.8.8 | ||
# | ||
# Passive DNS servers are at DNS servers that are private. | ||
# In other words, both IP servers are private. The database | ||
# does not contain Chelsea Manning | ||
# <|im_end|> | ||
|
||
print(structured) | ||
# What is the IP address of the Google DNS servers? | ||
# 2.2.6.1 | ||
# 8.8.8.8 | ||
``` | ||
|
||
Unlike other libraries, regex-structured generation in Outlines is almost as fast | ||
as non-structured generation. | ||
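To make the guarantee concrete, here is a small dependency-free sketch that checks candidate strings against the same regular expression using Python's `re` module. It does not call Outlines or a model; the helper name `is_ipv4_like` and the sample strings are assumptions chosen for illustration. Both structured outputs quoted in the diff (`2.2.6.1` and `8.8.8.8`) satisfy the pattern, which suggests the constraint fixes the format of the answer, not its factual correctness.

```python
import re

# Same pattern as the one passed to outlines.generate.regex in the diff above.
IP_PATTERN = re.compile(
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
)


def is_ipv4_like(text: str) -> bool:
    """Hypothetical helper: True when the whole string matches the pattern."""
    return IP_PATTERN.fullmatch(text) is not None


# Illustrative inputs: two well-formed addresses, one out-of-range octet,
# and one string with too few octets.
samples = ["8.8.8.8", "2.2.6.1", "256.1.1.1", "8.8.8"]
results = {sample: is_ipv4_like(sample) for sample in samples}
print(results)
```

A check like this can serve as a quick sanity test on a pattern before handing it to a generator.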
|
||
### Efficient JSON generation following a Pydantic model | ||
|
||
Outlines allows to guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/): | ||
Outlines users can guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/): | ||
|
||
```python | ||
from enum import Enum | ||
from pydantic import BaseModel, constr | ||
|
||
import outlines | ||
import torch | ||
|
||
|
||
class Weapon(str, Enum): | ||
sword = "sword" | ||
|
@@ -386,6 +413,39 @@ prompt = labelling("Just awesome", examples) | |
answer = outlines.generate.text(model)(prompt, max_tokens=100) | ||
``` | ||
|
||
### Chat template tokens | ||
|
||
Outlines does not manage chat templating tokens when using instruct models. You must apply the chat template tokens to the prompt yourself. Chat template tokens are not needed for base models. | ||
|
||
You can find the chat template tokens in the model's HuggingFace repo or documentation. As an example, the SmolLM2-360M-Instruct special tokens can be found [here](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct/blob/main/special_tokens_map.json). | ||
|
||
A convenient way to do this is to use the `tokenizer` from the `transformers` library: | ||
|
||
```python | ||
from transformers import AutoTokenizer | ||
|
||
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct") | ||
prompt = tokenizer.apply_chat_template( | ||
[ | ||
{"role": "system", "content": "You extract information from text."}, | ||
{"role": "user", "content": "What food does the following text describe?"}, | ||
], | ||
tokenize=False, | ||
add_bos=True, | ||
add_generation_prompt=True, | ||
) | ||
``` | ||
|
||
yields | ||
|
||
``` | ||
<|im_start|>system | ||
You extract information from text.<|im_end|> | ||
<|im_start|>user | ||
What food does the following text describe?<|im_end|> | ||
<|im_start|>assistant | ||
``` | ||
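For readers who want to see what the template does without downloading a tokenizer, the layout shown above can be approximated by hand. This is only a sketch under the assumption that the model uses the plain ChatML layout from the output above; `to_chatml` is a hypothetical helper, not part of Outlines or `transformers`, and a real tokenizer's template may add details such as a default system prompt, so `apply_chat_template` remains the safer choice.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Hypothetical helper: render messages in the ChatML layout shown above."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You extract information from text."},
    {"role": "user", "content": "What food does the following text describe?"},
]
prompt = to_chatml(messages)
print(prompt)
```

The resulting string could then be passed as the prompt to a generator, in the same way as the hand-written prompts in the earlier examples.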
|
||
We can leave the warning but this section should be in the documentation and we can add a link from here.
I would go as far as having this warning before the first example instead of having a separate section.
Agreed. I'm not sure what the best place is for it. We could add it to the prompting reference page or a new page. Not sure which is better. It wouldn't be too hard to cook up a new page for the chat templating issue, as we can provide lots of little bits of context.
My 5 cents here is that since chat templating issue kind of grows all over the examples and at the same time has nothing to do with structure-generating showcases it could be properly stated in the description before all the sections, and considering how many nuances could be there, probably, deserves its own page with a reference from the description.
||
## Join us | ||
|
||
- 💡 **Have an idea?** Come chat with us on [Discord][discord] | ||
|
Seems this part also needs to be adjusted, but maybe we can show only the difference:
IMO every example in a README should be completely copy-able with no slice-and-dice on the user's part. This is of course personal preference, so up to y'all.
If one day we generate documentation with `mdbook`, it offers a feature for hiding code lines from the user. I realized there isn't a similar feature in GitHub Flavored Markdown (yet?). The closest thing I'm aware of is collapsed sections...
@yvan-sraka yeah, sadly collapse sections doesn't work in the code snippets. Would be nice if visually repetitive code would be hidden by collapsing, but then it would copy everything including the hidden code, but without tags. This way "copy-paste and it works" magic would not be lost and "what you see is what you get" predictability will also be served.
In `mdbook` hiding code lines nicely works for running doc tests for example, but on "copying the code" side it still copies just what's visible, which might not work as it is without hidden code parts. Kind of on the same side of things there is also html comments, but also completely not helpful in copying the code snippets.
@cpfiffer fair point!
But this enum example still needs to be updated, considering that we're just showing different ways of multiple choice, maybe we can just extend original section with `Enum` in the first place and list as an alternative, wdyt?:
Awesome, I like the updated example. I'll add it. In general it sounds like we'll need to just overhaul the docs, which I suspect is probably more fruitful after the 1.0 release.
I opted into collapsing these into one section as @torymur suggested. The second code block in this section is not copy-pastable, but I think the arrangement of code blocks makes it clear that the `Enum` variant is an extension of the first block.