feat(mistral): allow Mistral transforms for local/custom models #5053
Conversation
Can you share your opencode.json? So you are using a Mistral model, but "mistral" isn't in the model name at all? I think it may be cleaner to just rename the model in your config rather than introduce new config variations that complicate things further. Maybe I'm missing something.
Hi, thanks for your kind reply! True, I don't always have "mistral" in the model name, and most of the time that's fixable on the server side, 100%. However, it took me some time to realize the model name has to contain "mistral". So yes, it's easily fixable on the server side once you know there's a mechanism and the conditions under which it applies ;). What motivates my PR is letting the user control the application of that critical correction mechanism; it's currently applied without the user being involved at all. This silent interjection is truly helpful for most users, no doubt, but if you happen to dig a little deeper, like fine-tuning or swapping models from different sources/quants, it can add to the trickiness of the situation. See my funny rant below.

Debugging this kind of situation is tricky, at best. You end up attributing the issues to…

I'm sure you know very well what I'm talking about. There are so many ways these things can break. It's fascinating, but sometimes just a bit of control really helps stabilize things and debug. I'm not happy that my PR changes the config template; it's very visible, whereas you had tried to make this elegant and silent. Do you think another config approach would be better? Somewhere other than in the model provider? At the end of the day, I'll understand if you feel this is too much added complexity for no real benefit: the user still must know about this parameter 🤷‍♂️
For your use case it sounds like it makes more sense as a plugin... I wonder if we should add a hook here:
What do you think?
Brilliant!
Only thing is `normalizeMessages` has them in the AI SDK format, and we'd need the plugin to expose our format.
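To make the idea concrete, here is one hypothetical shape such a hook could take; the hook name `messages.transform`, the `Message` type, and the signature are all my illustration, not opencode's actual plugin API:

```typescript
// Hypothetical sketch only, not opencode's real plugin API: a hook that
// lets a plugin rewrite the outgoing messages, in the project's own
// message format, right before they are handed to the provider.
interface Message {
  role: "system" | "user" | "assistant" | "tool"
  content: string
}

interface Plugin {
  "messages.transform"?: (
    input: { providerID: string; modelID: string },
    output: { messages: Message[] }, // plugin mutates this in place
  ) => Promise<void>
}

// A plugin could then apply Mistral-style fixes for any provider it likes,
// without opencode having to guess from the model name.
```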
Thanks, I'll try to wrap my head around plugins and suggest something. I think I'll close this PR for now. WDYT?
Someone started working on a plugin hook for this, actually: I've been discussing with them on Discord.
Thanks @rekram1-node! I'm happy to close this one. Cheers |
Thanks for this wonderful project!
This PR addresses an issue when running Mistral-family models (codestral, devstral, etc.) through custom providers like local vLLM/sglang/llama.cpp or ollama setups.
The problem I have
Mistral models have specific requirements for the message format they accept.
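One concrete example of such a requirement (my illustration; the PR changes when the transform applies, not what it does): Mistral's API rejects tool-call IDs that are not exactly 9 alphanumeric characters, while OpenAI-style providers emit longer IDs like `call_abc123xyz`. A transform therefore has to rewrite each ID consistently across the assistant tool call and the matching tool result, roughly like this sketch:

```typescript
// Illustrative sketch, not opencode's actual transform code.
// Mistral requires tool-call IDs matching /^[a-zA-Z0-9]{9}$/, so map each
// original ID to a stable 9-character alphanumeric ID.
const idMap = new Map<string, string>()

function toMistralToolCallId(id: string): string {
  let mapped = idMap.get(id)
  if (!mapped) {
    const clean = id.replace(/[^a-zA-Z0-9]/g, "")
    mapped = clean.padEnd(9, "0").slice(0, 9)
    idMap.set(id, mapped) // reuse the same ID for the tool result message
  }
  return mapped
}

// e.g. toMistralToolCallId("call_8f2-k1") === "call8f2k1"
```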
The existing transform logic only kicked in when `providerID === "mistral"` or the model name contained `mistral`. This is smart, but it means custom providers serving codestral/devstral would fail with cryptic API errors, typically (for me) after tool calls.
The fix I suggest
Two changes:

1. Added a `transforms` config option at the provider level, so you can explicitly opt in:

```json
{
  "provider": {
    "my-local-llm": {
      "api": "http://localhost:8080/v1",
      "npm": "@ai-sdk/openai-compatible",
      "transforms": "mistral",
      "models": { ... }
    }
  }
}
```

2. Extended pattern matching to also catch `codestral`, `devstral`, `ministral`, and `pixtral` model names automatically.

The detection now works as: explicit config > providerID match > model name pattern.
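A minimal sketch of that precedence, assuming a string-valued `transforms` option as in the config example above (names like `needsMistralTransform` are illustrative, not the PR's actual code):

```typescript
// Illustrative precedence check: explicit config > providerID > model name.
const MISTRAL_PATTERNS = ["mistral", "codestral", "devstral", "ministral", "pixtral"]

function needsMistralTransform(opts: {
  transforms?: string // provider-level config, e.g. "mistral"
  providerID: string
  modelID: string
}): boolean {
  // 1. Explicit config wins, whether it opts in or out.
  if (opts.transforms !== undefined) return opts.transforms === "mistral"
  // 2. Otherwise, fall back to the provider ID.
  if (opts.providerID === "mistral") return true
  // 3. Finally, fall back to model-name patterns.
  const model = opts.modelID.toLowerCase()
  return MISTRAL_PATTERNS.some((p) => model.includes(p))
}
```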
Also updated the DeepSeek detection to follow the same pattern for consistency (it supports `transforms: "deepseek"` now too).
Testing
Added tests covering the new detection behavior described above.
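For flavor, a hypothetical test along those lines, written against the `needsMistralTransform` sketch above (using `bun:test`; the cases are illustrative, not the PR's actual test file):

```typescript
import { describe, expect, test } from "bun:test"

// Assumes the illustrative needsMistralTransform sketch above is in scope.
describe("mistral transform detection", () => {
  test("explicit provider config wins over the model name", () => {
    expect(
      needsMistralTransform({ transforms: "mistral", providerID: "my-local-llm", modelID: "some-model" }),
    ).toBe(true)
  })

  test("codestral is caught by the model-name pattern", () => {
    expect(needsMistralTransform({ providerID: "my-local-llm", modelID: "codestral-22b-q4" })).toBe(true)
  })

  test("unrelated models are left untouched", () => {
    expect(needsMistralTransform({ providerID: "my-local-llm", modelID: "llama-3.1-8b" })).toBe(false)
  })
})
```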
Happy to adjust the approach if there's a better way to handle this!