Conversation

@cassiasamp (Owner) commented Jul 17, 2025

What does this PR do?

Updates the Mistral3 model card as per huggingface#36979.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@stevhliu Please check the PR and see if it's alright 😄

The examples still need to be run to double-check them, if that's ok. I'll try to run them myself for future contributions.

@stevhliu left a comment

Thanks!

# Mistral3

Suggested change: `# Mistral3` → `# Mistral 3`

</hfoptions>

This example demonstrates how to perform inference on a single image with the Mistral3 models using chat templates.
## Notes
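
For reference, a minimal sketch of that single-image chat-template inference might look roughly like the code below. This is only an illustration, not the PR's actual example: the checkpoint name, image URL, dtype, and generation settings are all assumptions.

```py
# Illustrative sketch only; the checkpoint, image URL, and settings are assumptions.
import torch
from transformers import AutoProcessor, Mistral3ForConditionalGeneration

model_id = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Mistral3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A single user turn containing one image and a text prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# The chat template tokenizes the text and preprocesses the image in one call.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

generated_ids = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt.
print(processor.batch_decode(generated_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```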

Let's format the ## Notes section like this (make sure to close the code blocks):

## Notes

- Mistral 3 supports text-only generation.

   ```py
   code example
   ```

- Mistral 3 accepts batched image and text inputs.

   ```py
   code example
   ```

- Mistral 3 also supports batched image and text inputs with a different number of images for each text. The example below quantizes the model with bitsandbytes.

   ```py
   code example
   ```
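
As a purely illustrative filler for the third placeholder above (batched inputs with a different number of images per text, quantized with bitsandbytes), something along these lines could work. The checkpoint name, prompts, and quantization settings are assumptions, not the reviewer's or the model card's actual code.

```py
# Illustrative sketch only; checkpoint, prompts, and settings are assumptions.
# Requires the bitsandbytes package for 4-bit quantization.
import torch
from transformers import AutoProcessor, BitsAndBytesConfig, Mistral3ForConditionalGeneration

model_id = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"  # assumed checkpoint
quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
processor = AutoProcessor.from_pretrained(model_id)
model = Mistral3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, quantization_config=quantization_config, device_map="auto"
)

# Two conversations: the first is text-only, the second contains one image.
messages = [
    [
        {"role": "user", "content": [{"type": "text", "text": "Write a haiku about the ocean."}]},
    ],
    [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
                {"type": "text", "text": "What do you see in this image?"},
            ],
        },
    ],
]

# Batched chat templating pads the shorter prompt so both fit in one tensor.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    padding=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generated_ids = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens for each sequence in the batch.
print(processor.batch_decode(generated_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```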

[[autodoc]] Mistral3ForConditionalGeneration
- forward
## Resources

Don't need this section

@cassiasamp (Owner, Author)

done!

cassiasamp and others added 2 commits July 17, 2025 17:48
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

@stevhliu left a comment

Thanks, just a few more minor changes!

Remember to open your PR against the main Transformers branch!

[Mistral 3](https://mistral.ai/news/mistral-small-3) is a latency-optimized model with a lot fewer layers to reduce the time per forward pass. This model adds vision understanding and supports long context lengths of up to 128K tokens without compromising performance.
You can find the original Mistral 3 checkpoints under the [Mistral AI](https://huggingface.co/mistralai/models?search=mistral-small-3) organization.

This model was contributed by [cyrilvallez](https://huggingface.co/cyrilvallez) and [yonigozlan](https://huggingface.co/yonigozlan).

Remove duplicate here

@cassiasamp (Owner, Author)

done!

cassiasamp and others added 2 commits July 20, 2025 01:37
@cassiasamp merged commit 6de6123 into main Jul 20, 2025
6 of 8 checks passed