Port all C++ ops to use the dispatcher #2796

Closed · 6 tasks done
fmassa opened this issue Oct 13, 2020 · 7 comments · Fixed by #2938
Comments

fmassa (Member) commented Oct 13, 2020

🚀 Feature

We currently handle CPU / CUDA / autograd dispatch manually in our wrapper functions. We should instead use the dispatcher from PyTorch, which was built to do exactly that.

The work should closely follow the PR from @ezyang in #2366.
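
For reference, a minimal sketch of what dispatcher-based registration looks like, following the pattern from #2366 (`my_op` and its kernels are placeholders, not torchvision's actual code):

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

// Placeholder backend kernels; the real ops live in torchvision's csrc.
at::Tensor my_op_cpu(const at::Tensor& input) { return input.clone(); }
at::Tensor my_op_cuda(const at::Tensor& input) { return input.clone(); }

// Define the operator schema once, under the "torchvision" namespace.
TORCH_LIBRARY(torchvision, m) {
  m.def("my_op(Tensor input) -> Tensor");
}

// Register one kernel per dispatch key. The dispatcher routes each call to the
// right kernel based on the input tensors, instead of a hand-written
// "if (input.is_cuda()) ... else ..." in a wrapper function.
TORCH_LIBRARY_IMPL(torchvision, CPU, m) {
  m.impl("my_op", my_op_cpu);
}

TORCH_LIBRARY_IMPL(torchvision, CUDA, m) {
  m.impl("my_op", my_op_cuda);
}
```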

Motivation

The dispatcher is a new mechanism in PyTorch that selects which kernel to run based on the properties of the input tensors. It is thus a centralized place where CPU / CUDA / autograd / autocast / quantized / XLA / etc. dispatch is handled.

One thing to keep an eye on is that we currently need to duplicate the input checks in both the CPU and CUDA functions. This is something @ezyang is working on in pytorch/pytorch#45277.

Current support:

  • nms
  • roi_align
  • deform_conv2d
  • roi_pool
  • ps_roi_align
  • ps_roi_pool

Question for @ezyang: following our discussion in https://github.com/pytorch/vision/pull/2366/files#r447547554, do you think we should provide a fallback in PyTorch for registering ops without double backwards?

bmanga (Contributor) commented Oct 13, 2020

With current master, it seems that the torchvision ops are no longer registered when building/using the C++ side of torchvision. I think it's related to these dispatcher changes. Is this expected?

My current fix is to manually add torch::RegisterOperators().op(...) back into my client code.

EDIT: Seems to be the same problem as #2736, #2762.
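
For C++ users hitting this before a proper fix lands, the workaround described above would look roughly like the following (a hedged sketch; `my_op` and the fallback kernel are placeholders, not torchvision's real schema):

```cpp
#include <ATen/ATen.h>
#include <torch/script.h>

// Placeholder kernel; in practice you would re-register torchvision's real op
// with its real implementation.
at::Tensor my_op_fallback(const at::Tensor& input) { return input.clone(); }

// Legacy registration API: a static registrar object in client code forces the
// op to be registered even if libtorchvision's own TORCH_LIBRARY registration
// was dropped by the linker.
static auto registry =
    torch::RegisterOperators().op("torchvision::my_op", &my_op_fallback);
```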

fmassa (Member, Author) commented Oct 13, 2020

Hi @bmanga

I was just chatting with @vfdev-5 about this. Indeed, it seems that once we started using the dispatcher for NMS and RoIAlign, libtorchvision is not linked anymore.

I don't think this is expected, and we should fix it.

@ezyang it looks like, since we added the dispatcher to RoIAlign, C++ users of torchvision have started having issues using RoIAlign, as it is no longer linked into their binaries. Do you know what the cause might be?

bmanga (Contributor) commented Oct 13, 2020

@fmassa the problem seems to be that the macro
TORCH_LIBRARY(torchvision, m) {...}
creates a static variable TORCH_LIBRARY_static_init_torchvision, which, as usual for an unreferenced static, gets discarded by the linker :/ (rough expansion sketched at the end of this comment)
I have a solution that seems to work for me, based on my old weak-symbols approach (PR #2253); I can put it up as a PR if you'd like.
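
For context, a simplified sketch of what that macro roughly expands to (not the exact torch/library.h definition), and why the linker can drop it:

```cpp
#include <torch/library.h>

// TORCH_LIBRARY(torchvision, m) { ... } expands, roughly, to:
static void TORCH_LIBRARY_init_torchvision(torch::Library&);
static const torch::detail::TorchLibraryInit TORCH_LIBRARY_static_init_torchvision(
    torch::Library::DEF, &TORCH_LIBRARY_init_torchvision,
    "torchvision", c10::nullopt, __FILE__, __LINE__);
void TORCH_LIBRARY_init_torchvision(torch::Library& m) {
  // ... op definitions go here ...
}

// Registration happens as a side effect of constructing the static object
// above. Since nothing outside this translation unit references it, a linker
// pulling objects out of a static library (or garbage-collecting unused
// sections) may drop the whole object file, and the ops never get registered.
```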

fmassa (Member, Author) commented Oct 13, 2020

A PR for this would be great!

I would also love for @ezyang and @vfdev-5 to review this, as they have more experience with this than I do.

vfdev-5 (Collaborator) commented Oct 13, 2020

@bmanga could you please detail your solution? Does it still use the TORCH_LIBRARY and TORCH_LIBRARY_IMPL approach?
In your PR you are using RegisterOperators to register the ops...

bmanga (Contributor) commented Oct 13, 2020

@vfdev-5 yes, those are kept. I just make sure that TORCH_LIBRARY_static_init_torchvision is referenced in the code.
Admittedly, that breaks the TORCH_LIBRARY macro abstraction, but at least the C++ side is working again.

That PR was from before RegisterOperators was replaced with TORCH_LIBRARY.
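
As an illustration of the general idea (a hypothetical sketch, not the actual patch): export a symbol from the translation unit that contains the TORCH_LIBRARY block and reference it from client code, so the linker has to keep that object file and its static registration runs.

```cpp
#include <cstdint>

// In the torchvision translation unit that contains
// TORCH_LIBRARY(torchvision, m) { ... }, export a (hypothetical) anchor:
//
//   int64_t torchvision_ops_anchor() { return 0; }

// In client code, reference the anchor. This forces the linker to pull in that
// object file, which in turn runs its static initializers, including
// TORCH_LIBRARY_static_init_torchvision.
extern int64_t torchvision_ops_anchor();
static const int64_t kForceTorchvisionOps = torchvision_ops_anchor();
```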

ezyang (Contributor) commented Oct 13, 2020

To reference the issue from last time directly, it was #2134.
