
[Bug]: Mistral Large Instruct 2407 tool calling leakage #8301

Closed
1 task done
dsingal0 opened this issue Sep 9, 2024 · 8 comments · Fixed by #8515
Labels
bug Something isn't working

Comments


dsingal0 commented Sep 9, 2024

Your current environment

When using vLLM 0.6.0, the Mistral tool call parser does not work as expected for Mistral Large 2407 (https://huggingface.co/mistralai/Mistral-Large-Instruct-2407). cc @K-Mistele

🐛 Describe the bug

It used to work fine when using AutoTokenizer to instantiate the tokenizer, but it does not with MistralTokenizer from mistral_common.
Basically, when you run the model and give it tools, the model thinks it is the tool. For example, if I give Mistral Large 2407 a tool for looking up movie information on IMDB, the model responds to "Who are you?" with "I am a movie database lookup bot" instead of "I am an AI trained by Mistral AI".
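A minimal reproduction sketch against a vLLM OpenAI-compatible server (the base URL, served model name, and tool schema below are illustrative assumptions, not taken from the reporter's setup):

    # Assumes a server started with something like:
    #   vllm serve mistralai/Mistral-Large-Instruct-2407 \
    #       --enable-auto-tool-choice --tool-call-parser mistral
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    tools = [{
        "type": "function",
        "function": {
            "name": "lookup_movie",  # hypothetical IMDB lookup tool
            "description": "Look up movie information on IMDB",
            "parameters": {
                "type": "object",
                "properties": {"title": {"type": "string"}},
                "required": ["title"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="mistralai/Mistral-Large-Instruct-2407",
        messages=[{"role": "user", "content": "Who are you?"}],
        tools=tools,
        temperature=0,
    )
    # Buggy behavior described above: the reply identifies as the tool
    # ("I am a movie database lookup bot") rather than a Mistral AI assistant.
    print(resp.choices[0].message.content)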

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
dsingal0 added the bug label Sep 9, 2024
K-Mistele (Contributor) commented

I originally built and tested tool calling with the chat template provided in the repo and AutoTokenizer on Mistral 7B Instruct v0.3 for the purposes of #5649. I confirmed per-token parity between AutoTokenizer and the tokenizer in mistral_common. I did not have issues with tool calls at temperature=0, which is what Mistral recommends/requires (the 7B model's tool calling behavior, and possibly the larger models' as well, is very brittle compared to e.g. Hermes; it does NOT work well above near-zero temperatures).

Adding MistralTokenizer happened during/after the PR, so I'm not sure that behavior has been tested as thoroughly. I recently tried Mistral 7B and noticed the MistralTokenizer was used. I did have some issues with it then: the model didn't want to use tools (I actually tested it on the same thing, lol: giving it a SQL tool to query IMDB; for me it would just generate the SQL in markdown instead of calling the tool). But at temperature=0 it does pass the test cases defined in CI, although those are the most naive tool-use examples possible (get_current_weather is the provided tool, lol).

Is there a way to force the use of AutoTokenizer or to toggle it somehow? If so, that's probably preferable. I'm not able to debug Mistral-Large since I don't have the VRAM for it (I'm on a V100 32GB), but I can dig into the AutoTokenizer vs. MistralTokenizer issue some more.

K-Mistele (Contributor) commented

For what it's worth, I have found that Mistral's tool calling behavior is also VERY sensitive to system prompts. Generally, I have to give it very explicit instructions about what the tools are and when it should/shouldn't use them, and I have to tell it that it's an AI agent that can call tools OR generate a text response, etc.

You can see this in examples/tool_chat_template_mistral_parallel.jinja: Mistral's function calling format supports parallel tool calling, but it doesn't work out of the box at all, even with temperature=0 and using Mistral's recommended inference code from the model card on Hugging Face (instead of vLLM). The only way I was able to get parallel tool calls generated correctly was with that chat template's system prompt. Generally, I found the model's tool calling quality to be poor, even with token-level parity between vLLM and the mistral_common package's tokenizer.

dsingal0 (Author) commented Sep 9, 2024

I'll see if I can get a token comparison between the prompts produced by applying the chat template with AutoTokenizer and with mistral_common on Mistral-Large.
Do you know why MistralTokenizer doesn't support tools? From vllm/transformers_utils/tokenizers/mistral.py:

    def apply_chat_template(self,
                            conversation: List["ConversationMessage"],
                            tools: Optional[Dict[str, Any]] = None,
                            **kwargs) -> List[int]:
        assert tools is None, "`tools` are not yet supported."

        request = ChatCompletionRequest(
            messages=conversation)  # type: ignore[type-var]
        encoded = self.mistral.encode_chat_completion(request)

        # encode-decode to get clean prompt
        return encoded.tokens

It sounds like we need to compare vLLM's AnyTokenizer to HF's AutoTokenizer.
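A rough sketch of that comparison, assuming transformers and mistral_common are installed (Mistral Large 2407 uses the v3 tokenizer; import paths below follow mistral_common's documented layout):

    from transformers import AutoTokenizer
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

    model_id = "mistralai/Mistral-Large-Instruct-2407"

    # HF path: render the repo's chat template, then tokenize.
    hf_tok = AutoTokenizer.from_pretrained(model_id)
    hf_ids = hf_tok.apply_chat_template(
        [{"role": "user", "content": "Who are you?"}], tokenize=True)

    # mistral_common path: encode the same conversation directly.
    mc_tok = MistralTokenizer.v3()
    encoded = mc_tok.encode_chat_completion(
        ChatCompletionRequest(messages=[UserMessage(content="Who are you?")]))

    print("parity:", hf_ids == encoded.tokens)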

K-Mistele (Contributor) commented

So AnyTokenizer is a Union[PreTrainedTokenizer, PreTrainedTokenizerFast, MistralTokenizer].

The MistralTokenizer member of that union was added recently.
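For reference, a sketch of that alias as described (treating the exact module location as an assumption; the MistralTokenizer here is vLLM's wrapper from vllm/transformers_utils/tokenizers/mistral.py, not the mistral_common class directly):

    from typing import Union

    from transformers import PreTrainedTokenizer, PreTrainedTokenizerFast
    from vllm.transformers_utils.tokenizers.mistral import MistralTokenizer

    AnyTokenizer = Union[PreTrainedTokenizer, PreTrainedTokenizerFast,
                         MistralTokenizer]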

K-Mistele (Contributor) commented Sep 9, 2024

Looks like #7739 is the source of this change. Can you try using --tokenizer-mode auto? The PR indicates that auto should be the default (vs. --tokenizer-mode mistral), but that may not be the case. It might be worth trying both modes and seeing which gives better results.
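For an offline A/B check of the two modes, the same switch is exposed as the tokenizer_mode argument on vLLM's LLM class (a sketch, assuming a vLLM version around 0.6.0 where LLM.chat and tokenizer_mode="mistral" are both available):

    from vllm import LLM, SamplingParams

    params = SamplingParams(temperature=0, max_tokens=64)
    messages = [{"role": "user", "content": "Who are you?"}]

    # Run once per mode (in separate processes; a 123B model won't fit twice):
    # "auto" uses HF AutoTokenizer, "mistral" uses mistral_common.
    llm = LLM(model="mistralai/Mistral-Large-Instruct-2407",
              tokenizer_mode="auto")  # or tokenizer_mode="mistral"
    out = llm.chat(messages, params)
    print(out[0].outputs[0].text)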

mgoin (Collaborator) commented Sep 9, 2024

@patrickvonplaten, could you possibly help out here? It would be nice if tool calling in vLLM worked easily with the Mistral tokenizer too.

patrickvonplaten (Contributor) commented

Thanks for flagging - I'll look into this!

patrickvonplaten (Contributor) commented

The PR to enable function calling for "mistral"-formatted models is here: #8515 (it should actually even work for Pixtral!)
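For context, with tools accepted on the mistral_common path, a tool-bearing request would look roughly like this (a sketch using mistral_common's documented Tool/Function classes, reusing the hypothetical IMDB tool from the original report):

    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.protocol.instruct.tool_calls import Function, Tool
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

    tokenizer = MistralTokenizer.v3()
    request = ChatCompletionRequest(
        tools=[Tool(function=Function(
            name="lookup_movie",
            description="Look up movie information on IMDB",
            parameters={
                "type": "object",
                "properties": {"title": {"type": "string"}},
                "required": ["title"],
            },
        ))],
        messages=[UserMessage(content="Who directed Alien?")],
    )
    encoded = tokenizer.encode_chat_completion(request)
    print(encoded.text)  # tool definitions now appear in the rendered prompt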
