Hello!
I'm using this on an Apple Silicon device (M1 Max) and noticed that it fails to use MPS. The issue seems to come from the amp.autocast call, which is hardcoded to CUDA here. There is no AMP support for M1 yet, but changing the device type to "cpu" in this spot seems to fix it: torch then runs on MPS instead of falling back to CPU on Apple Silicon.
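For reference, here is a rough sketch of how the device type could be chosen at runtime instead of being hardcoded. The helper names `autocast_device_type` and `autocast_for` are hypothetical and not part of this repo. It assumes a PyTorch version where `torch.autocast(device_type=...)` accepts "cpu", and it mirrors the workaround above by treating MPS like CPU for autocast purposes.

```python
def autocast_device_type(device) -> str:
    """Map a device such as "cuda:0", "mps", or "cpu" to an autocast device_type.

    Hypothetical helper: anything that is not CUDA (including MPS) falls back
    to "cpu", matching the workaround described in this PR.
    """
    kind = str(device).split(":")[0]
    return "cuda" if kind == "cuda" else "cpu"


def autocast_for(device):
    """Return an autocast context manager suited to the given device."""
    # Imported here so the pure helper above can be used without torch.
    import torch

    return torch.autocast(device_type=autocast_device_type(device))
```

The call site would then use `with autocast_for(device):` in place of the CUDA-only autocast call, so CUDA users keep the current behavior and everything else takes the "cpu" path.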
If this is of interest to merge, let me know if there's anything else you'd like changed! I'm just tinkering at this point and will eventually move to a Linux-based production deployment anyway, so this isn't really mission-critical.