
[Bug] Failing to output non-EN #109

Open

NanoCode012 opened this issue May 25, 2024 · 7 comments

@NanoCode012
Hey! Thank you for the nice tool and integrations. I've been trying this out with English JSON parsing using vllm, and it works great!

However, when I tried it with Japanese models (like the recently released Aya from Cohere and Llama 3 fine-tunes), I received cut-off outputs.

result = json.loads(result)

Failed parsing output: {
"Input": "ミ

Do you perhaps know why this is occurring? My initial guess after looking at the repo is that it can't build a character tree for these Unicode characters, which causes early stopping.
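For illustration, here is a minimal sketch of the failure mode I suspect (my assumption, not verified against the lm-format-enforcer internals): a byte-level BPE tokenizer can split one multi-byte character across tokens, and neither fragment is a character the parser's tree could match.

text = "ミ"                              # one character, U+30DF
raw = text.encode("utf-8")               # b'\xe3\x83\x9f', three bytes
# A token boundary may fall inside those three bytes.
head, tail = raw[:2], raw[2:]
print(head.decode("utf-8", errors="replace"))  # '�' - not a valid character
print(tail.decode("utf-8", errors="replace"))  # '�' - not a valid character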

I checked the other issues, and those involve keys that are non-EN; in this case, it's the content itself. I've tried the models without lm-format-enforcer, and they output fine without cutting off early (though, as expected, they can't output JSON consistently).

Env: vllm==0.4.1, lm-format-enforcer==0.9.8

@noamgat
Owner

noamgat commented May 31, 2024

Hi! Can you please share the model+schema+prompt that you are trying to use? If this reproduces on a 7B (or less) model it will be much easier to debug.

@rdlwicked

The formatter seems unable to proceed with generation whenever it produces a Roman numeral.

I am currently generating book names using the qwen1.5-110b-32k model, and I found that every time a book name containing a Roman numeral is generated, generation just stops.

Here is an example:

{"实体1": "三体系列", "实体2": "三体Ⅱ

(The keys mean "entity 1" and "entity 2"; the values are Chinese book titles, here "Three-Body series" and "Three-Body II".) Generation stops there even though the schema hasn't been completed yet.

This happens every time, so I assume it's related to the formatter, since it doesn't happen when the formatter isn't applied.
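Note that Ⅱ here is the single Roman numeral character U+2161, which is three bytes in UTF-8, so it can be split across tokens exactly like CJK text. A rough, unverified way to gauge how many such partial-UTF-8 tokens a vocabulary contains (the small checkpoint below is just a stand-in; I assume it shares the qwen1.5-110b-32k tokenizer):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
# Tokens that decode to U+FFFD on their own are partial UTF-8 sequences;
# a character-level parser has no character to match them against.
partial = sum(1 for i in range(tok.vocab_size) if "\ufffd" in tok.decode([i]))
print(f"{partial} of {tok.vocab_size} tokens are partial UTF-8 on their own")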

@liqul

liqul commented Jul 2, 2024

I've got this exact problem. Any solution or workaround? I'm using Llama-3-8b-instruct and the HF transformers lib to do generation.
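For reference, a sketch of how lm-format-enforcer is typically wired into a transformers generate call, per the project README (the model name and schema here are placeholders, and details may differ by version):

from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn

class Answer(BaseModel):
    explanation: str

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# LMFE filters the allowed next tokens at every decoding step via this hook.
parser = JsonSchemaParser(Answer.model_json_schema())
prefix_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, parser)

inputs = tokenizer("Describe the bug as JSON:", return_tensors="pt")
output = model.generate(**inputs, prefix_allowed_tokens_fn=prefix_fn, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))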

@ericperfect

The formatter seems unable to proceed with generation whenever it produces a Roman numeral. […]

I got the same problem. Any solution? Thanks.

@jamestwhedbee

Just ran into this myself

@jamestwhedbee

@noamgat here is a minimal example using guided decoding in vllm with LMFE v0.10.6
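(This assumes a local vLLM OpenAI-compatible server is already running, e.g. started with something like python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct.)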

from openai import OpenAI

# Point the OpenAI client at the local vLLM OpenAI-compatible server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

# A math question makes the model want to emit the ∫ symbol.
messages = [{"role": "user", "content": "Find the definite integral of f(x)=x^2 from x=1 to x=3."}]
chat_completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=messages,
    temperature=0.0,
    stream=True,
    # vLLM extension: constrain the output to this JSON schema.
    extra_body={
      "guided_json": {
         "type": "object",
         "properties": {
            "explanation": {
              "description": "make sure to use mathematical notation in your explanation",
              "type": "string"
            }
         },
         "required": ["explanation"]
      }
    }
)

# Print each streamed delta on its own line.
for chunk in chat_completion:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content)

print()

Which outputs the following, one streamed delta per line, before generation halts right after the ∫ character:

{
 
 

"
e
xp
lan
ation
":
 "
The
 definite
 integral
 of
 a
 function
 f
(x
)
 from
 x
=a
 to
 x
=b
 is
 den
oted
 as
 ∫
 


@noamgat
Owner

noamgat commented Sep 3, 2024

Thanks for the reproduction, this is something I hope to tackle in the next major version.
