
support mps backend. #59

Merged: 1 commit merged into michaelfeil:main on Jan 15, 2024
Conversation

ninehills (Contributor)

No description provided.

ninehills mentioned this pull request on Jan 15, 2024
michaelfeil (Owner) commented on Jan 15, 2024

Thanks @ninehills. Short question: does MPS really conflict with BetterTransformer, or what's the motivation for deactivating it?

ninehills (Contributor, Author)

> Short question: does MPS really conflict with BetterTransformer, or what's the motivation for deactivating it?

The following is the error message when using MPS and BetterTransformer:

  File "/Users/xxx/src/github.com/ninehills/infinity/libs/infinity_emb/.venv/lib/python3.11/site-packages/optimum/bettertransformer/models/encoder_models.py", line 301, in forward
    hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: The operator 'aten::_nested_tensor_from_mask' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

I tested different torch versions; all have the same problem (a minimal reproduction sketch follows this list):

  • torch==2.0.0
  • torch==2.1.2
  • torch nightly (torch==2.3.0.dev20240110)
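For reference, here is a minimal reproduction sketch (my own illustration, not code from this repository), assuming transformers and optimum are installed and the machine has an Apple Silicon GPU:

  # Minimal reproduction sketch (illustrative, not code from this PR):
  # convert an encoder model with optimum's BetterTransformer, then run it on mps.
  import torch
  from transformers import AutoModel, AutoTokenizer
  from optimum.bettertransformer import BetterTransformer

  model_name = "bert-base-uncased"  # illustrative model choice
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModel.from_pretrained(model_name)
  model = BetterTransformer.transform(model)  # swaps in the fused encoder layers
  model = model.to("mps").eval()

  batch = tokenizer(["hello world"], return_tensors="pt").to("mps")
  with torch.no_grad():
      out = model(**batch)  # fails with NotImplementedError for aten::_nested_tensor_from_mask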

michaelfeil (Owner) left a review comment


I would merge the branch as is, provided the tests all pass. Thanks for submitting!

michaelfeil merged commit 959b462 into michaelfeil:main on Jan 15, 2024. 5 checks passed.
ninehills (Contributor, Author)

> NotImplementedError: The operator 'aten::_nested_tensor_from_mask' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on pytorch/pytorch#77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

If we set PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op, it also fails with an error:

  File "/Users/xxx/src/github.com/ninehills/infinity/libs/infinity_emb/.venv/lib/python3.11/site-packages/optimum/bettertransformer/models/encoder_models.py", line 301, in forward
    hidden_states = torch._nested_tensor_from_mask(hidden_states, ~attention_mask)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::_to_copy' with arguments from the 'NestedTensorMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_to_copy' is only available for these backends: [CPU, CUDA, HIP, XLA, MPS, IPU, XPU, HPU, VE, Lazy, MTIA, PrivateUse1, PrivateUse2, PrivateUse3, Meta, FPGA, ORT, Vulkan, Metal, QuantizedCPU, QuantizedCUDA, QuantizedHIP, QuantizedXLA, QuantizedMPS, QuantizedIPU, QuantizedXPU, QuantizedHPU, QuantizedVE, QuantizedLazy, QuantizedMTIA, QuantizedPrivateUse1, QuantizedPrivateUse2, QuantizedPrivateUse3, QuantizedMeta, CustomRNGKeyId, MkldnnCPU, SparseCPU, SparseCUDA, SparseHIP, SparseXLA, SparseMPS, SparseIPU, SparseXPU, SparseHPU, SparseVE, SparseLazy, SparseMTIA, SparsePrivateUse1, SparsePrivateUse2, SparsePrivateUse3, SparseMeta, SparseCsrCPU, SparseCsrCUDA, NestedTensorCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

michaelfeil (Owner)

Okay, got it. There is no nested_tensor support for PyTorch's MPS backend, so we should skip the BetterTransformer implementation on MPS for now.
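In code terms, the guard could look roughly like the sketch below; the helper name is hypothetical, and this is not the actual diff of this PR:

  # Sketch only (hypothetical helper, not the actual change in this PR):
  # attempt the BetterTransformer conversion only when the target device is not mps.
  from optimum.bettertransformer import BetterTransformer

  def to_device_with_optional_bettertransformer(model, device: str):
      # aten::_nested_tensor_from_mask is not implemented for the MPS backend,
      # so skip the BetterTransformer path there and use the vanilla model.
      if device != "mps":
          try:
              model = BetterTransformer.transform(model)
          except Exception:
              pass  # keep the vanilla model if conversion is unsupported
      return model.to(device)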
