What does this PR do?
Fixes runtime errors like "inputs are on different devices" when Qwen2 Audio runs on devices like "mps". This problem occurred when I tried to run the model on my Mac using the `mps` device.
Tests in `transformers/tests/models/qwen2_audio` have passed, and I have also tested the change with the official demo from Qwen 2 Audio, with a few modifications to run it on an MPS device (see below).
Problem and Code References
transformers/src/transformers/models/qwen2_audio/processing_qwen2_audio.py, line 94 in 342e3f9
Here it calls
transformers/src/transformers/models/whisper/feature_extraction_whisper.py, line 180 in 342e3f9
which has an optional argument `device` that defaults to "cpu". So, the output of the Whisper feature extractor will by default be on the CPU.
But we can't just pass `device="mps"` when calling `Qwen2AudioProcessor.__call__`, because that causes another runtime error saying that `self.tokenizer()` does not have a `device` argument.
Who can review?
Probably @faychu @ylacombe can take a look because of #32137?
Modified Demo Code
Modified from https://github.com/QwenLM/Qwen2-Audio?tab=readme-ov-file#audio-analysis-inference
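
As a rough illustration of the device mismatch described above, here is a minimal sketch (not the code from this PR) of one way to move every processor output to the target device before generation. The helper name `move_inputs_to_device` is hypothetical, and the usage in the trailing comment assumes the `processor` and `model` objects from the upstream demo.

```python
from typing import Any, Dict, Mapping


def move_inputs_to_device(inputs: Mapping[str, Any], device: str) -> Dict[str, Any]:
    """Return a new dict in which every tensor-like value is moved to `device`.

    A value counts as tensor-like here if it has a callable `.to` method,
    as `torch.Tensor` does; everything else is passed through unchanged.
    """
    moved = {}
    for name, value in inputs.items():
        to = getattr(value, "to", None)
        moved[name] = to(device) if callable(to) else value
    return moved


# Hypothetical usage with the demo (not run here; requires torch and transformers):
#   inputs = processor(text=text, audios=audios, return_tensors="pt", padding=True)
#   inputs = move_inputs_to_device(inputs, "mps")
#   generate_ids = model.generate(**inputs, max_length=256)
```

The helper is duck-typed so that it does not import `torch` itself. With real tensors, moving all entries rather than only `input_ids` is what keeps the audio features and the token IDs on the same device.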