AutoTokenizer: clear ImportError when loading Voxtral without mistral-common + unit test #41718
Summary: Replace the confusing TypeError raised by SentencePiece when loading the Voxtral tokenizer without mistral-common with a clear ImportError. Fixes #41553.
Rationale: Without mistral-common, loading via AutoTokenizer ends in a low-level TypeError inside sentencepiece. Users should see a direct, actionable message.
What’s changed
AutoTokenizer guard:
In AutoTokenizer.from_pretrained, after resolving config and before class selection, raise a clear error if config.model_type == "voxtral" and mistral-common is missing.
Message:
"The Voxtral tokenizer requires the 'mistral-common' package. Please install it using pip install mistral-common."
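The guard can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the diff: the helper name check_voxtral_dependency and its mistral_common_available parameter are hypothetical, and in the PR the availability check is done through is_mistral_common_available inside AutoTokenizer.from_pretrained.

```python
from types import SimpleNamespace

# Error message as quoted in this PR description.
VOXTRAL_MISSING_DEP_MSG = (
    "The Voxtral tokenizer requires the 'mistral-common' package. "
    "Please install it using pip install mistral-common."
)


def check_voxtral_dependency(config, mistral_common_available):
    """Hypothetical helper: fail early if Voxtral is requested without mistral-common.

    `config` is any object exposing a `model_type` attribute;
    `mistral_common_available` stands in for the result of the real
    availability check.
    """
    model_type = getattr(config, "model_type", None)
    if model_type == "voxtral" and not mistral_common_available:
        raise ImportError(VOXTRAL_MISSING_DEP_MSG)


# Usage: only the Voxtral + missing-dependency combination raises.
voxtral_config = SimpleNamespace(model_type="voxtral")
check_voxtral_dependency(voxtral_config, mistral_common_available=True)  # no error
try:
    check_voxtral_dependency(voxtral_config, mistral_common_available=False)
except ImportError as err:
    guard_message = str(err)
```

Running the check before class selection means the user sees the ImportError instead of the later TypeError from sentencepiece.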
Tests:
Added tests/models/voxtral/test_tokenization_voxtral.py
Mocks is_mistral_common_available to return False and get_tokenizer_config to avoid network access, then asserts that an ImportError mentioning "mistral-common" is raised.
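The mocking pattern used by the test can be sketched like this. To keep the example runnable without transformers or network access, it patches a small stand-in namespace (fake_auto) rather than the real module; the stand-in, its from_pretrained, and the repository id "dummy/voxtral" are assumptions made for illustration, and the actual test in the PR patches the corresponding names in transformers.

```python
from types import SimpleNamespace
from unittest import mock


def _unpatched_get_tokenizer_config(name):
    # Stand-in for a function that would normally hit the network.
    raise RuntimeError("network access attempted")


def _from_pretrained(name):
    # Stand-in loader: resolve the config, then apply the Voxtral guard.
    config = fake_auto.get_tokenizer_config(name)
    if config.get("model_type") == "voxtral" and not fake_auto.is_mistral_common_available():
        raise ImportError(
            "The Voxtral tokenizer requires the 'mistral-common' package. "
            "Please install it using pip install mistral-common."
        )
    return "tokenizer"


# Hypothetical namespace playing the role of the tokenization module.
fake_auto = SimpleNamespace(
    is_mistral_common_available=lambda: True,
    get_tokenizer_config=_unpatched_get_tokenizer_config,
    from_pretrained=_from_pretrained,
)


def test_voxtral_without_mistral_common_raises():
    # Pretend mistral-common is missing and serve a canned config offline.
    with mock.patch.object(fake_auto, "is_mistral_common_available", return_value=False):
        with mock.patch.object(
            fake_auto, "get_tokenizer_config", return_value={"model_type": "voxtral"}
        ):
            try:
                fake_auto.from_pretrained("dummy/voxtral")
            except ImportError as err:
                assert "mistral-common" in str(err)
            else:
                raise AssertionError("expected ImportError")
```

Patching both names keeps the test offline and makes the outcome independent of whether mistral-common is installed on the machine running it.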
Why this approach
Keeps mapping logic intact; avoids unexpected fallbacks.
Ensures the user sees a clear, actionable message as early as possible in the loading path.
Testing
Targeted test: pytest tests/models/voxtral/test_tokenization_voxtral.py -q → passes.
No network calls thanks to test monkeypatching.
Backward compatibility
No behavior change when mistral-common is installed.
Only affects Voxtral when the dependency is missing.