Put back MPS builds #8478

Closed
NicolasHug opened this issue Jun 7, 2024 · 6 comments

Comments

@NicolasHug
Member

(this is a follow-up and more up-to-date version of #8456)

The M1 CI jobs were broken for ~1 week (#8456) and it turns out the problem was caused by the MPS build. We deactivated the MPS builds in #8472 and the M1 jobs (all using macos-m1-stable) are now green.

We have to put back the MPS build before the release though, otherwise torchvision won't provide MPS-compatible custom ops.

In #8476 (macos-m1-stable), #8473 (macos-m1-13) and #8477 (macos-m1-14) I'm trying to add back those MPS builds, but they all fail with the same error as previously seen back in #8456:

  File "/Users/ec2-user/runner/_work/vision/vision/pytorch/vision/test/smoke_test.py", line 7, in <module>
    import torchvision
  File "/Users/ec2-user/runner/_work/vision/vision/pytorch/vision/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
  File "/Users/ec2-user/runner/_work/vision/vision/pytorch/vision/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/ci/lib/python3.10/site-packages/torch/library.py", line 653, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/ci/lib/python3.10/site-packages/torch/library.py", line 153, in _register_fake
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/ci/lib/python3.10/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist

CC @malfet @huydhn
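
For anyone reproducing this locally: `operator torchvision::nms does not exist` is what you would expect to see if torchvision's compiled C++ extension was never loaded, so a first sanity check is whether the binary was built and installed at all. Below is a minimal sketch that looks for it without importing torchvision (which is the step that crashes). The helper name `find_extension_binaries` is hypothetical, and the `_C` file stem and the list of suffixes are assumptions about how the extension is packaged, not something taken from the CI logs.

```python
import importlib.util
from pathlib import Path
from typing import List


def find_extension_binaries(package: str = "torchvision", stem: str = "_C") -> List[Path]:
    """List compiled files named `<stem>*` with a shared-library suffix in `package`.

    find_spec() on a top-level name only locates the package; it does not
    execute its __init__.py, so this works even when importing it raises.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package
    suffixes = (".so", ".pyd", ".dylib")  # assumed extension suffixes
    found = []
    for location in spec.submodule_search_locations:
        for path in Path(location).glob(f"{stem}*"):
            if path.suffix in suffixes:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    binaries = find_extension_binaries()
    if binaries:
        for path in binaries:
            print("found:", path)
    else:
        print("no compiled extension found in the installed package")
```

If the file is present but the operator is still missing, the binary exists and the failure is happening when it is loaded, which points at the build or link step and not at the packaging.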

@NicolasHug
Member Author

From #8456 (comment)

Though it fails on my Mac even with MPS disabled, so I have no idea what is going on...

@malfet are the tests failing on your Mac with the same error message on main? 🤔

@huydhn
Contributor

huydhn commented Jun 8, 2024

I could reproduce the issue locally and can confirm that disabling MPS makes import torchvision work again. However, I think the recent upgrade to macOS 14 might be a red herring, because using macos-m1-13 still fails in the same way, see https://github.com/pytorch/vision/actions/runs/9425764020/job/25967868583?pr=8485

@huydhn
Contributor

huydhn commented Jun 9, 2024

This issue seems to have been introduced by a PyTorch commit between https://hud.pytorch.org/hud/pytorch/pytorch/d66f12674cfe0151a86dc10b8de216f83bf42e6e (failed) and https://hud.pytorch.org/hud/pytorch/pytorch/0ff2f8b52248323bbe25108b64e706c43390cb72 (success). I need to narrow this range down a bit.
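
Narrowing a range like this is a binary search over the commit history, which is what `git bisect` automates. Here is a minimal sketch of the idea, assuming the commits are ordered oldest to newest, the first one is known good, the last one is known bad, and the failure persists once it appears. The function name `first_bad_commit`, the placeholder commit names, and the `is_bad` predicate are hypothetical; in practice `is_bad` would build PyTorch at that commit and try `import torchvision`.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes len(commits) >= 2, commits[0] is known good, commits[-1] is
    known bad, and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # failure already present: culprit is at mid or earlier
        else:
            lo = mid  # still working: culprit is after mid
    return commits[hi]


# Hypothetical stand-in history; real hashes would come from `git log`.
history = [f"commit-{i}" for i in range(10)]
broken = set(history[6:])  # pretend the regression landed in commit-6
print(first_bad_commit(history, lambda c: c in broken))
```

Each probe halves the remaining range, so even a few hundred commits between the two hashes need only a handful of builds.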

pytorchmergebot added a commit to pytorch/pytorch that referenced this issue Jun 9, 2024
This reverts commit 669560d.

Reverted #127265 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but I suspect that it causes this failure pytorch/vision#8478 on vision where its C++ extension could not be loaded on macOS ([comment](#127265 (comment)))
pytorchmergebot referenced this issue in pytorch/pytorch Jun 9, 2024
Since it can eliminate some linker warnings on MacOS

Pull Request resolved: #127265
Approved by: https://github.com/ezyang, https://github.com/malfet
@huydhn
Contributor

huydhn commented Jun 9, 2024

I think I have found the change responsible for this in pytorch/pytorch#127265. I have reverted it so that we can test it out in the next PyTorch nightly build. Reverting it locally fixes the issue for me.

@huydhn
Contributor

huydhn commented Jun 9, 2024

Testing in #8485, but this is pending the revert https://hud.pytorch.org/hud/pytorch/pytorch/75b0720a97ac5d82e8a7a1a6ae7c5f7a87d7183d getting into the next PyTorch nightly build.

@NicolasHug
Member Author

#8485 worked, thanks a ton @huydhn !
