The output is the same as the input. #81

Open
yz-qiang opened this issue May 31, 2023 · 1 comment

yz-qiang commented May 31, 2023

I have a strange problem. I want to run prediction on a batch of inputs, so I first use tokenizer.encode() to encode the data and use padding to bring everything to a uniform length. The specific code is shown below:

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.resize_token_embeddings(len(tokenizer))
code = '''Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.'''
system_prompt = """# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""
prompt = f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"
source_ids = tokenizer.encode(prompt, max_length=1500, padding='max_length', return_tensors='pt').to("cuda")
source_mask = source_ids.ne(tokenizer.pad_token_id).long()
type_ids = torch.zeros(source_ids.shape, dtype=torch.long).to("cuda")
print(source_mask.shape, '---', source_ids.shape)
preds = model.generate(
        source_ids, attention_mask = source_mask,
        max_new_tokens=512,
        temperature=0.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
      )
top_preds = list(preds.cpu().numpy())
pred_nls = [tokenizer.decode(id, skip_special_tokens=True, clean_up_tokenization_spaces=False) for id in top_preds]
print(pred_nls)

However, the output is the same as the prompt, as follows:

['# StableLM Tuned (Alpha version)\n- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.\n- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.\n- StableLM will refuse to participate in anything that could harm a human.\nJudge whether there is a defect in ["int ff_get_wav_header"]. Respond with \'YES\' or \'NO\' only.']

If I skip the padded tokenizer.encode() step and instead use the tokenizer directly for prediction, I get normal output. That code looks like this:

code = '''Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.'''
system_prompt = """# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""
prompt = f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
tokens = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.2,
        do_sample=True,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
      )
print(tokenizer.decode(tokens[0], skip_special_tokens=True))

The output is as follows:

# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.**Yes**.

Why does this happen? Is there something important I'm missing?
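
For reference, the snippets above do not show the surrounding setup; the following is an assumed setup based on the StableLM README (the exact checkpoint and the StopOnTokens definition are assumptions, not part of this report):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList

# Assumed checkpoint; substitute whichever StableLM tuned model is actually in use.
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-tuned-alpha-7b")
model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-tuned-alpha-7b")
model.half().cuda()

# Stop generation on the special dialogue/end tokens, as in the StableLM README example.
class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        stop_ids = [50278, 50279, 50277, 1, 0]
        for stop_id in stop_ids:
            if input_ids[0][-1] == stop_id:
                return True
        return False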


yz-qiang commented May 31, 2023

In addition, I just tried the following approach, but the output is still the same as the input. The code is as follows:

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.resize_token_embeddings(len(tokenizer))
code = '''Judge whether there is a defect in ["int ff_get_wav_header"]. Respond with 'YES' or 'NO' only.'''
system_prompt = """# StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

prompt = f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"
encoded_dict = tokenizer(prompt, max_length=1500, padding="max_length", truncation=True, return_tensors="pt").to("cuda")
print(encoded_dict['input_ids'].shape, '-----', encoded_dict["attention_mask"].shape)
example = {"input_ids":encoded_dict['input_ids'], "attention_mask":encoded_dict["attention_mask"]}
preds = model.generate(
        **example,
        max_new_tokens=512,
        temperature=0.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
        stopping_criteria=StoppingCriteriaList([StopOnTokens()])
      )
top_preds = list(preds.cpu().numpy())
pred_nls = [tokenizer.decode(id, skip_special_tokens=True, clean_up_tokenization_spaces=False) for id in top_preds]
print(pred_nls)

The output is:

['# StableLM Tuned (Alpha version)\n- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.\n- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.\n- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.\n- StableLM will refuse to participate in anything that could harm a human.\nJudge whether there is a defect in ["int ff_get_wav_header"]. Respond with \'YES\' or \'NO\' only.']

So how do I solve this problem?
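
One hedged note: decoder-only models are usually given left-padded batches (tokenizer.padding_side = "left") so that generation begins immediately after the real prompt tokens; with right padding the prompt ends in a run of pad/eos tokens, which can make the model stop straight away. A minimal sketch of that variant, reusing the setup above (an illustration, not a confirmed fix for this issue):

# Sketch (assumption, not a confirmed fix): left-pad the prompts so generation
# starts right after the real tokens instead of after a run of pad/eos tokens.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

prompts = [f"<|SYSTEM|>{system_prompt}<|USER|>{code}<|ASSISTANT|>"]  # batch of prompts
batch = tokenizer(prompts, padding=True, return_tensors="pt").to("cuda")

preds = model.generate(
    **batch,
    max_new_tokens=512,
    temperature=0.2,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    stopping_criteria=StoppingCriteriaList([StopOnTokens()]),
)

# Decode only the newly generated tokens, skipping the (left-padded) prompt.
new_tokens = preds[:, batch["input_ids"].shape[1]:]
print(tokenizer.batch_decode(new_tokens, skip_special_tokens=True))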
