Update README to include chat templating #1372

Merged: 11 commits, Jan 23, 2025
88 changes: 57 additions & 31 deletions README.md
@@ -65,49 +65,69 @@ is to ensure that there is a well-defined interface between their output and
user-defined code. **Outlines** provides ways to control the generation of
language models to make their output more predictable.

The following methods of structured generation are supported:

- [Multiple choices](#multiple-choices)
- [Type constraints](#type-constraint)
- [Efficient regex-structured generation](#efficient-regex-structured-generation)
- [Efficient JSON generation following a Pydantic model](#efficient-json-generation-following-a-pydantic-model)
- [Using context-free grammars to guide generation](#using-context-free-grammars-to-guide-generation)
- [Open functions](#open-functions)

### Chat template tokens

Outlines does not manage chat templating tokens when using instruct models. You must apply the chat template tokens to the prompt yourself. Chat template tokens are not needed for base models.

Please see [the documentation](https://dottxt-ai.github.io/outlines/latest/reference/chat_templating) on chat templating for more.

### Multiple choices

You can reduce the completion to a choice between multiple possibilities:

``` python
import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = outlines.models.transformers(model_name)

# You must apply the chat template tokens to the prompt!
# See below for an example.
prompt = """
<|im_start|>system
You extract information from text.
<|im_end|>

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?
<|im_start|>user
What food does the following text describe?

Review: This restaurant is just awesome!
Text: I really really really want pizza.
<|im_end|>
<|im_start|>assistant
"""

generator = outlines.generate.choice(model, ["Positive", "Negative"])
generator = outlines.generate.choice(model, ["Pizza", "Pasta", "Salad", "Dessert"])
answer = generator(prompt)

# Likely answer: Pizza
```

You can also pass these choices through en enum:
You can also pass in choices with an `Enum`:

````python
from enum import Enum

import outlines

class Sentiment(str, Enum):
    positive = "Positive"
    negative = "Negative"
class Food(str, Enum):
    pizza = "Pizza"
    pasta = "Pasta"
    salad = "Salad"
    dessert = "Dessert"

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant is just awesome!
"""

generator = outlines.generate.choice(model, Sentiment)
generator = outlines.generate.choice(model, Food)
answer = generator(prompt)
# Likely answer: Pizza
````
Contributor

Seems this part also needs to be adjusted, but maybe we can show only the difference:

from enum import Enum

class Food(str, Enum):
      pizza = "Pizza"
      pasta = "Pasta"
      salad = "Salad"
      dessert = "Dessert"

...

generator = outlines.generate.choice(model, Food)

Contributor Author

IMO every example in a README should be completely copy-able with no slice-and-dice on the user's part. This is of course personal preference, so up to y'all.

Contributor

If one day we generate documentation with mdbook, it offers a feature for hiding code lines from the user. I realized there isn't a similar feature in GitHub Flavored Markdown (yet?). The closest thing I'm aware of is collapsed sections...

Contributor

@yvan-sraka yeah, sadly collapsed sections don't work in code snippets. It would be nice if visually repetitive code could be hidden by collapsing while copying still grabbed everything, including the hidden code, without the tags. That way the "copy-paste and it works" magic would not be lost, and "what you see is what you get" predictability would also be served.

In mdbook, hiding code lines works nicely for running doc tests, for example, but copying the code still copies only what's visible, which might not work without the hidden parts. In the same vein there are also HTML comments, but they are no help when copying code snippets either.

@cpfiffer fair point!

But this enum example still needs to be updated. Considering that we're just showing different ways of doing multiple choice, maybe we can extend the original section to use the Enum in the first place and mention the list as an alternative, wdyt?:

import outlines
from enum import Enum

model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = outlines.models.transformers(model_name)

# You must apply the chat template tokens to the prompt!
# See below for an example.
prompt = """
<|im_start|>system
You extract information from text.
<|im_end|>

<|im_start|>user
What food does the following text describe?

Text: I really really really want pizza.
<|im_end|>
<|im_start|>assistant
"""

class Food(str, Enum):
      pizza = "Pizza"
      pasta = "Pasta"
      salad = "Salad"
      dessert = "Dessert"

generator = outlines.generate.choice(model, Food)

# You can also pass these choices simply as a list:
# generator = outlines.generate.choice(model, ["Pizza", "Pasta", "Salad", "Dessert"])

answer = generator(prompt)
# Likely answer: Pizza

Contributor Author

Awesome, I like the updated example. I'll add it. In general it sounds like we'll need to just overhaul the docs, which I suspect is probably more fruitful after the 1.0 release.

Contributor Author

I opted to collapse these into one section, as @torymur suggested. The second code block in this section is not copy-pastable, but I think the arrangement of code blocks makes it clear that the Enum variant is an extension of the first block.


### Type constraint
### Type constraints

You can instruct the model to only return integers or floats:

@@ -140,43 +160,49 @@ import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

prompt = "What is the IP address of the Google DNS servers? "
prompt = """
<|im_start|>system You are a helpful assistant.
<|im_end|>

<|im_start|>user
What is an IP address of the Google DNS servers?
<|im_end|>
<|im_start|>assistant
The IP address of a Google DNS server is

"""

generator = outlines.generate.text(model)
unstructured = generator(prompt, max_tokens=30)

generator = outlines.generate.regex(
    model,
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
    sampler=outlines.samplers.greedy(),
)
structured = generator(prompt, max_tokens=30)

print(unstructured)
# What is the IP address of the Google DNS servers?
# 8.8.8.8
#
# Passive DNS servers are at DNS servers that are private.
# In other words, both IP servers are private. The database
# does not contain Chelsea Manning
# <|im_end|>

print(structured)
# What is the IP address of the Google DNS servers?
# 2.2.6.1
# 8.8.8.8
```

Unlike other libraries, regex-structured generation in Outlines is almost as fast
as non-structured generation.
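
To see which strings the pattern in the example can accept, independent of any model, it can be checked with Python's standard `re` module. This is a minimal sketch that only exercises the regular expression; it does not call Outlines, and the `is_ipv4` helper and the sample strings are illustrative assumptions rather than part of the README.

```python
import re

# The same pattern that is passed to outlines.generate.regex in the example.
IPV4_PATTERN = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
ipv4 = re.compile(IPV4_PATTERN)


def is_ipv4(text):
    """Return True if the whole string matches the IPv4 pattern (hypothetical helper)."""
    return ipv4.fullmatch(text) is not None


# Illustrative inputs: two well-formed addresses, one octet out of range,
# and one address with too few octets.
samples = ["8.8.8.8", "2.2.6.1", "256.1.1.1", "8.8.8"]
results = {s: is_ipv4(s) for s in samples}
print(results)
```

Both `2.2.6.1` and `8.8.8.8`, which appear in the example's output comments, satisfy the pattern: the constraint guarantees the format of the output, not that the answer is factually right.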

### Efficient JSON generation following a Pydantic model

Outlines allows to guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/):
Outlines users can guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/):

```python
from enum import Enum
from pydantic import BaseModel, constr

import outlines
import torch


class Weapon(str, Enum):
    sword = "sword"
38 changes: 38 additions & 0 deletions docs/reference/chat_templating.md
@@ -0,0 +1,38 @@
# Chat templating

Instruction-tuned language models use "special tokens" to indicate different parts of text, such as the system prompt, the user prompt, any images, and the assistant's response. A [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating) is how different types of input are composited together into a single, machine-readable string.

Outlines does not manage chat templating tokens when using instruct models. You must apply the chat template tokens to the prompt yourself -- if you do not apply chat templating on instruction-tuned models, you will often get nonsensical output from the model.

Chat template tokens are not needed for base models.

You can find the chat template tokens in the model's HuggingFace repo or documentation. As an example, the `SmolLM2-360M-Instruct` special tokens can be found [here](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct/blob/main/special_tokens_map.json).

However, looking up a model's special tokens manually is slow, and special tokens vary by model. If you have hard-coded special tokens, your prompts may break when you change the model.

If you need a convenient tool to apply chat templating for you, you should use the `tokenizer` from the `transformers` library:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You extract information from text."},
        {"role": "user", "content": "What food does the following text describe?"},
    ],
    tokenize=False,
    add_bos=True,
    add_generation_prompt=True,
)
```

yields

```
<|im_start|>system
You extract information from text.<|im_end|>
<|im_start|>user
What food does the following text describe?<|im_end|>
<|im_start|>assistant
```
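
To make it clearer what the template is doing, here is a minimal sketch that assembles a string in the same layout by hand. `to_chatml` is a hypothetical helper, not part of Outlines or `transformers`, and it assumes the `<|im_start|>`/`<|im_end|>` layout shown above. Other models use different tokens, so the tokenizer's own `apply_chat_template` remains the safer choice.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Join role/content messages into a ChatML-style prompt string.

    Hypothetical helper for illustration only; it hard-codes the
    <|im_start|>/<|im_end|> tokens and will not suit other templates.
    """
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You extract information from text."},
    {"role": "user", "content": "What food does the following text describe?"},
]
prompt = to_chatml(messages)
print(prompt)
```

Hard-coding the tokens like this is exactly what breaks when the model changes, which is why the sketch is useful for understanding the format rather than for production prompts.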
97 changes: 48 additions & 49 deletions mkdocs.yml
@@ -4,7 +4,6 @@ site_author: The Outlines developers
site_description: >-
Structured text generation with LLMs


# Repository
repo_name: dottxt-ai/outlines
repo_url: https://github.com/dottxt-ai/outlines
@@ -76,7 +75,6 @@ markdown_extensions:
emoji_generator: !!python/name:material.extensions.emoji.to_svg
- pymdownx.snippets:


extra_css:
- stylesheets/extra.css

@@ -131,53 +129,54 @@ nav:
- Cerebrium: cookbook/deploy-using-cerebrium.md
- Modal: cookbook/deploy-using-modal.md
- Docs:
- reference/index.md
- Generation:
- Overview: reference/generation/generation.md
- Text: reference/text.md
- Samplers: reference/samplers.md
- Structured generation:
- How does it work?: reference/generation/structured_generation_explanation.md
- Classification: reference/generation/choices.md
- Regex: reference/generation/regex.md
- Type constraints: reference/generation/format.md
- JSON (function calling): reference/generation/json.md
- Grammar: reference/generation/cfg.md
- Creating Grammars: reference/generation/creating_grammars.md
- Custom FSM operations: reference/generation/custom_fsm_ops.md
- Utilities:
- Serve with vLLM: reference/serve/vllm.md
- Serve with LM Studio: reference/serve/lmstudio.md
- Custom types: reference/generation/types.md
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- Overview: reference/models/models.md
- Open source:
- Transformers: reference/models/transformers.md
- Transformers Vision: reference/models/transformers_vision.md
- Llama.cpp: reference/models/llamacpp.md
- vLLM: reference/models/vllm.md
- TGI: reference/models/tgi.md
- ExllamaV2: reference/models/exllamav2.md
- MLX: reference/models/mlxlm.md
- Mamba: reference/models/transformers/#mamba
- API:
- OpenAI: reference/models/openai.md
- reference/index.md
- Generation:
- Overview: reference/generation/generation.md
- Chat templating: reference/chat_templating.md
- Text: reference/text.md
- Samplers: reference/samplers.md
- Structured generation:
- How does it work?: reference/generation/structured_generation_explanation.md
- Classification: reference/generation/choices.md
- Regex: reference/generation/regex.md
- Type constraints: reference/generation/format.md
- JSON (function calling): reference/generation/json.md
- Grammar: reference/generation/cfg.md
- Creating Grammars: reference/generation/creating_grammars.md
- Custom FSM operations: reference/generation/custom_fsm_ops.md
- Utilities:
- Serve with vLLM: reference/serve/vllm.md
- Serve with LM Studio: reference/serve/lmstudio.md
- Custom types: reference/generation/types.md
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- Overview: reference/models/models.md
- Open source:
- Transformers: reference/models/transformers.md
- Transformers Vision: reference/models/transformers_vision.md
- Llama.cpp: reference/models/llamacpp.md
- vLLM: reference/models/vllm.md
- TGI: reference/models/tgi.md
- ExllamaV2: reference/models/exllamav2.md
- MLX: reference/models/mlxlm.md
- Mamba: reference/models/transformers/#mamba
- API:
- OpenAI: reference/models/openai.md
- API Reference:
- api/index.md
- api/models.md
- api/prompts.md
- api/json_schema.md
- api/guide.md
- api/parsing.md
- api/regex.md
- api/samplers.md
- api/index.md
- api/models.md
- api/prompts.md
- api/json_schema.md
- api/guide.md
- api/parsing.md
- api/regex.md
- api/samplers.md
- Community:
- community/index.md
- Feedback 🫶: community/feedback.md
- Chat with us ☕: https://discord.com/invite/R9DSu34mGd
- How to contribute 🏗️: community/contribute.md
- Your projects 👏: community/examples.md
- Versioning Guide 📌: community/versioning.md
- community/index.md
- Feedback 🫶: community/feedback.md
- Chat with us ☕: https://discord.com/invite/R9DSu34mGd
- How to contribute 🏗️: community/contribute.md
- Your projects 👏: community/examples.md
- Versioning Guide 📌: community/versioning.md
- Blog: blog/index.md