Example is underwhelming #1347

Open
TimZaman opened this issue Dec 19, 2024 · 3 comments

@TimZaman
Contributor

TimZaman commented Dec 19, 2024

Describe the issue as clearly as possible:

I ran the example from README.md but changed the review to "Review: This restaurant is bad!" (instead of "good"), and also tried similar negative phrasings. The answer returned is almost exclusively "Positive".

I understand this is the model's behaviour, but maybe either:

  1. this model isn't up to this task
  2. the prompt is too hard for this example.

Suggestion: we could consider a more illustrative example.

Steps/code to reproduce the bug:

import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant is bad!
"""

generator = outlines.generate.choice(model, ["Positive", "Negative"])
answer = generator(prompt)
print(f'{answer=}')

Expected result:

"Negative"

Error message:

"Positive"

Outlines/Python version information:

outlines v 0.1.11
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:51:49) [Clang 16.0.6 ]

Context for the issue:

first time user trying out the example
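
To put a number on "almost exclusively", one option is to tally the labels returned over several negative reviews. The following is a minimal sketch, not part of the original report: `tally_answers`, `build_prompt`, and the two extra reviews are hypothetical, and `always_positive` is a stand-in so the snippet runs without downloading a model.

```python
from collections import Counter
from typing import Callable, Iterable


def build_prompt(review: str) -> str:
    """Build the same prompt text used in the reproduction above."""
    return (
        "You are a sentiment-labelling assistant.\n"
        "Is the following review positive or negative?\n\n"
        f"Review: {review}\n"
    )


def tally_answers(generate: Callable[[str], str], prompts: Iterable[str]) -> Counter:
    """Count how often each label is returned across the prompts."""
    return Counter(generate(p) for p in prompts)


# The first review is from the report; the other two are made up for illustration.
negative_reviews = [
    "This restaurant is bad!",
    "The food was cold and the service was slow.",
    "I would not come back.",
]
prompts = [build_prompt(r) for r in negative_reviews]


def always_positive(prompt: str) -> str:
    """Stand-in for the real generator; mimics the reported behaviour."""
    return "Positive"


counts = tally_answers(always_positive, prompts)
print(counts)
```

With the real model, passing the `generator` from the reproduction in place of `always_positive` should give the actual label counts.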

@TimZaman TimZaman added the bug label Dec 19, 2024
@denadai2
Contributor

What happens without outlines?

@cpfiffer
Contributor

cpfiffer commented Jan 2, 2025

This might be a result of not using the proper chat templating in the example. This works better:

import outlines
from transformers import AutoTokenizer

model_name = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = outlines.models.transformers(model_name)

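def template(prompt: str) -> str:
    """Wrap a plain prompt in the model's chat template and return it as a string."""
    return tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,  # return text rather than token ids
        add_bos=True,
        add_generation_prompt=True,  # end with the opening of the assistant turn
    )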

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant is bad!
"""

generator = outlines.generate.choice(model, ["Positive", "Negative"])
answer = generator(template(prompt))
print(f'{answer=}')
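
For readers who want to see roughly what the templating step above produces, here is a hand-written sketch. It assumes the Phi-3 instruct layout of `<|user|>`, `<|end|>`, and `<|assistant|>` markers; the exact string returned by `apply_chat_template` may differ (for example in BOS handling or whitespace), so treat the tokenizer's output as authoritative. `phi3_user_prompt` is a hypothetical helper name.

```python
def phi3_user_prompt(prompt: str) -> str:
    """Wrap a single user message in (assumed) Phi-3 instruct special tokens."""
    # User turn, end-of-turn marker, then an open assistant turn for the model to fill.
    return f"<|user|>\n{prompt}<|end|>\n<|assistant|>\n"


prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant is bad!
"""

templated = phi3_user_prompt(prompt)
print(templated)
```

The point is that the instruct model sees the review inside a user turn and is then asked to continue as the assistant, rather than continuing raw text.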

@cpfiffer
Contributor

cpfiffer commented Jan 2, 2025

This is a general issue with the documentation and with how outlines is used more broadly. We have a few outstanding issues for this, like #987 and #756.

There's a PR #1019 which might address this, though currently it seems to be in limbo.

cpfiffer added a commit to cpfiffer/outlines that referenced this issue Jan 10, 2025
The existing README has underwhelming or incorrect results (Example is underwhelming dottxt-ai#1347) due to lack of templating for instruct models.

This adds special tokens for each instruct model call, as well as provide comments on how to obtain/produce special tokens.
rlouf pushed a commit that referenced this issue Jan 23, 2025
The existing README has underwhelming or incorrect results (Example is
underwhelming #1347) due to lack of templating for instruct models.

This adds special tokens for each instruct model call, as well as
provide comments on how to obtain/produce special tokens.

---------

Co-authored-by: Victoria Terenina <torymur@gmail.com>

3 participants