updated mistral3 model card #1
Conversation
stevhliu left a comment
Thanks!
docs/source/en/model_doc/mistral3.md
Outdated
```md
</div>
</div>

# Mistral3
```
```diff
- # Mistral3
+ # Mistral 3
```
docs/source/en/model_doc/mistral3.md
Outdated
```md
</hfoptions>

This example demonstrates how to perform inference on a single image with the Mistral3 models using chat templates.

## Notes
```
Let's format the ## Notes section like this (make sure to close the code blocks):

## Notes
- Mistral 3 supports text-only generation.
  ```py
  code example
  ```
- Mistral 3 accepts batched image and text inputs.
  ```py
  code example
  ```
- Mistral 3 also supports batched image and text inputs with a different number of images for each text. The example below quantizes the model with bitsandbytes.
  ```py
  code example
  ```
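As an illustration of the third bullet (batched inputs with a different number of images per text), here is a rough sketch of how the chat messages could be structured. The URLs and prompts are placeholders, and the message schema follows the general Transformers chat-template convention; this is not the final doc example:

```py
# Illustrative sketch: two conversations batched together, with a
# different number of images in each. Each conversation is a list of
# messages, and each message's "content" mixes image and text parts.
batched_messages = [
    [  # first conversation: two images, one question
        {"role": "user", "content": [
            {"type": "image", "url": "https://example.com/cat.png"},
            {"type": "image", "url": "https://example.com/dog.png"},
            {"type": "text", "text": "What do these two images have in common?"},
        ]},
    ],
    [  # second conversation: a single image
        {"role": "user", "content": [
            {"type": "image", "url": "https://example.com/city.png"},
            {"type": "text", "text": "Describe this image."},
        ]},
    ],
]
```

A structure like this would then be passed to the processor's chat-template machinery before calling `generate`; the quantization detail (bitsandbytes) only affects how the model itself is loaded, not the message format.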
docs/source/en/model_doc/mistral3.md
Outdated
```md
[[autodoc]] Mistral3ForConditionalGeneration
    - forward

## Resources
```
Don't need this section
done!
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
stevhliu left a comment
Thanks, just a few more minor changes!
Remember to open your PR against the main Transformers branch!
docs/source/en/model_doc/mistral3.md
Outdated
```md
[Mistral 3](https://mistral.ai/news/mistral-small-3) is a latency optimized model with a lot fewer layers to reduce the time per forward pass. This model adds vision understanding and supports long context lengths of up to 128K tokens without compromising performance.

You can find the original Mistral 3 checkpoints under the [Mistral AI](https://huggingface.co/mistralai/models?search=mistral-small-3) organization.

This model was contributed by [cyrilvallez](https://huggingface.co/cyrilvallez) and [yonigozlan](https://huggingface.co/yonigozlan).
```
Remove duplicate here
done!
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
What does this PR do?
Updates the Mistral3 model card as per huggingface#36979.
Who can review?
@stevhliu Please check the PR and see if it's alright 😄
The examples still need to be run to double-check them, if that's still ok. I will try to run them myself for future contributions.