Adding ROIAlign backwards for CPU #504

xssChauhan · 2018-05-17T00:41:24Z

Adding ROIAlign backwards implementation for CPU.

Implemented using vision's CUDA implementation for the same purpose.

Currently the layers branch is not compiling.

fmassa · 2018-05-17T15:05:25Z

Thanks for the PR!

I believe we still need to modify this file in order for the CPU dispatch to work?

xssChauhan · 2018-05-17T16:42:50Z

@fmassa My bad! Fixing this.

xssChauhan · 2018-05-17T17:11:40Z

@fmassa Fixed the python interface

sampepose · 2018-05-19T18:29:48Z

Can you make sure flake8 runs successfully on your code? CI is failing since there are some linter errors.

fmassa · 2018-05-19T22:58:37Z

@sampepose I believe the flake8 issues are on my end. I need to fix them before merging the layers branch into master.
And we don't currently have a linter for C++ in torchvision, so this should be fine.

fmassa · 2018-05-19T23:01:31Z

@xssChauhan could you please write a small python file that tests that the gradients are indeed computed correctly?
For that, you can use PyTorch's torch.autograd.gradcheck. Also, when running the code, make sure that you are using double tensors. If you use float tensors, you'll run into problems, because the finite-difference differentiation that gradcheck uses to compare the gradients needs more precision than floats provide.
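To make the idea behind such a check concrete, here is a minimal pure-Python sketch, not the code from this PR. It assumes a single bilinear sample on a tiny feature map (the basic operation that ROIAlign averages over) and compares a hand-written backward pass against central finite differences. The helper names (bilinear_weights, numeric_grad, to_f32) are hypothetical and only for illustration; torch.autograd.gradcheck performs a comparison of this kind on real tensors.

```python
import math
import struct


def bilinear_weights(y, x):
    """Return four (row, col, weight) triples for a sample at (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    ly, lx = y - y0, x - x0
    hy, hx = 1.0 - ly, 1.0 - lx
    return [(y0, x0, hy * hx), (y0, x0 + 1, hy * lx),
            (y0 + 1, x0, ly * hx), (y0 + 1, x0 + 1, ly * lx)]


def forward(feat, y, x):
    """Bilinearly interpolate feat at the fractional position (y, x)."""
    return sum(w * feat[r][c] for r, c, w in bilinear_weights(y, x))


def backward(shape, y, x, grad_out=1.0):
    """Analytic gradient of forward() with respect to every pixel of feat."""
    rows, cols = shape
    grad = [[0.0] * cols for _ in range(rows)]
    for r, c, w in bilinear_weights(y, x):
        grad[r][c] += w * grad_out
    return grad


def to_f32(value):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def numeric_grad(feat, y, x, eps=1e-6, cast=float):
    """Central finite differences; cast=to_f32 emulates float storage."""
    grad = [[0.0] * len(feat[0]) for _ in feat]
    for r in range(len(feat)):
        for c in range(len(feat[0])):
            orig = feat[r][c]
            feat[r][c] = cast(orig + eps)
            f_plus = cast(forward(feat, y, x))
            feat[r][c] = cast(orig - eps)
            f_minus = cast(forward(feat, y, x))
            feat[r][c] = orig  # restore the pixel before moving on
            grad[r][c] = (f_plus - f_minus) / (2 * eps)
    return grad


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


feat = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
analytic = backward((3, 3), 0.4, 1.3)
print("double error: ", max_abs_diff(analytic, numeric_grad(feat, 0.4, 1.3)))
print("float32 error:", max_abs_diff(analytic, numeric_grad(feat, 0.4, 1.3, cast=to_f32)))
```

With a step of 1e-6, rounding the perturbed values to single precision discards most of the perturbation, which is why the check is normally run on double tensors.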

Once we know that the gradcheck is passing, I'll merge this patch.

Thanks!

xssChauhan · 2018-05-20T11:30:28Z

@fmassa Will do so

xssChauhan · 2018-05-22T17:36:58Z

@fmassa layers branch is currently not compiling.

Here's the error:

$ python setup.py install                                                                                                                                                                                   
running install                                                                                                                                                                                             
running bdist_egg                                                                                                                                                                                           
running egg_info                                                                                                                                                                                            
writing torchvision.egg-info/PKG-INFO                                                                                                                                                                       
writing dependency_links to torchvision.egg-info/dependency_links.txt                                                                                                                                       
writing requirements to torchvision.egg-info/requires.txt 
writing top-level names to torchvision.egg-info/top_level.txt                                         
reading manifest file 'torchvision.egg-info/SOURCES.txt'                                              
reading manifest template 'MANIFEST.in'            
warning: no previously-included files matching '__pycache__' found under directory '*'                
warning: no previously-included files matching '*.py[co]' found under directory '*'                   
writing manifest file 'torchvision.egg-info/SOURCES.txt'                                              
installing library code to build/bdist.linux-x86_64/egg                                               
running install_lib                                
running build_py                                   
running build_ext                                  
building 'torchvision._C' extension                
gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/home/schauhan/vision/torchvision/csrc -I/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include -I/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/TH -I/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/include/python3.6m -c /home/schauhan/vision/torchvision/csrc/vision.cpp -o build/temp.linux-x86_64-3.6/home/schauhan/vision/torchvision/csrc/vision.o -DTORCH_EXTENSION_NAME=torchvision._C -std=c++11                                                  
In file included from /home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/pytypes.h:12:0,                                                                                      
                 from /home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:13,                                                                                           
                 from /home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/attr.h:13,                                                                                           
                 from /home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:43,                                                                                       
                 from /home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/torch/torch.h:6,                                                                                              
                 from /home/schauhan/vision/torchvision/csrc/cpu/vision.h:2,                          
                 from /home/schauhan/vision/torchvision/csrc/nms.h:2,                                 
                 from /home/schauhan/vision/torchvision/csrc/vision.cpp:1:                            
<command-line>:0:33: error: expected initializer before ‘.’ token
/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/detail/common.h:212:47: note: in definition of macro ‘PYBIND11_CONCAT’
 #define PYBIND11_CONCAT(first, second) first##second
                                               ^~~~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:1: note: in expansion of macro ‘PYBIND11_MODULE’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
 ^~~~~~~~~~~~~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:17: note: in expansion of macro ‘TORCH_EXTENSION_NAME’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
                 ^~~~~~~~~~~~~~~~~~~~
<command-line>:0:33: error: expected initializer before ‘.’ token
/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/detail/common.h:171:51: note: in definition of macro ‘PYBIND11_PLUGIN_IMPL’
     extern "C" PYBIND11_EXPORT PyObject *PyInit_##name()
                                                   ^~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:1: note: in expansion of macro ‘PYBIND11_MODULE’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
 ^~~~~~~~~~~~~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:17: note: in expansion of macro ‘TORCH_EXTENSION_NAME’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
                 ^~~~~~~~~~~~~~~~~~~~
<command-line>:0:33: error: expected initializer before ‘.’ token
/home/schauhan/vision/env/lib/python3.6/site-packages/torch/lib/include/pybind11/detail/common.h:212:47: note: in definition of macro ‘PYBIND11_CONCAT’
 #define PYBIND11_CONCAT(first, second) first##second
                                               ^~~~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:1: note: in expansion of macro ‘PYBIND11_MODULE’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
 ^~~~~~~~~~~~~~~
/home/schauhan/vision/torchvision/csrc/vision.cpp:6:17: note: in expansion of macro ‘TORCH_EXTENSION_NAME’
 PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
                 ^~~~~~~~~~~~~~~~~~~~
error: command 'gcc' failed with exit status 1

This is stopping me from building the extension, so I cannot access roi_align_backwards from the Python interface.
How can I fix this?

fmassa · 2018-05-22T18:27:18Z

This is the error you get before even applying your patch, is that right?

xssChauhan · 2018-05-22T18:33:59Z

Yes. Initially thought that it was introduced by me. Then tried on the layers branch, and found the same issue.

fmassa · 2018-05-22T18:36:45Z

Weird. Let me try compiling it (last time I checked, it was working).

fmassa · 2018-05-22T18:38:39Z

Ok, I know what's going on.

You need to have a source install of PyTorch. The bug you are facing was fixed in pytorch/pytorch#6986
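As a hedged illustration of why the first build fails: the failing log passes -DTORCH_EXTENSION_NAME=torchvision._C, while the later log in this thread passes -DTORCH_EXTENSION_NAME=_C, and the pybind11 macro shown in the error pastes that name onto PyInit_. The Python sketch below only mimics that token pasting with string concatenation; init_symbol is a hypothetical helper, and str.isidentifier() is used as a stand-in for the C++ identifier rules, which agree with it for ASCII names like these.

```python
def init_symbol(extension_name: str) -> str:
    """Mimic the PyInit_##name token pasting seen in the compiler output."""
    return "PyInit_" + extension_name


for name in ("torchvision._C", "_C"):
    symbol = init_symbol(name)
    # A dot cannot appear inside an identifier, so the dotted name
    # cannot be pasted into a valid function name.
    print(f"{name!r}: {symbol!r} valid identifier = {symbol.isidentifier()}")
```

This is consistent with the "expected initializer before ‘.’ token" message above, though the exact change made in pytorch/pytorch#6986 is not shown in this thread.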

xssChauhan · 2018-05-22T18:43:11Z

Understood. I will install PyTorch from source.
I was using the pip installation until now. Thank you :)

xssChauhan · 2018-05-23T17:42:06Z

@fmassa I installed PyTorch from source and then tried compiling the layers branch.
Here's the error that I got:

running install
running bdist_egg
running egg_info
writing torchvision.egg-info/PKG-INFO
writing dependency_links to torchvision.egg-info/dependency_links.txt
writing requirements to torchvision.egg-info/requires.txt
writing top-level names to torchvision.egg-info/top_level.txt
reading manifest file 'torchvision.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no previously-included files matching '*.py[co]' found under directory '*'
writing manifest file 'torchvision.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'torchvision._C' extension
gcc -pthread -B /home/shikhar/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/shikhar/Documents/vision-org/torchvision/csrc -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include/TH -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include/THC -I/home/shikhar/anaconda3/include/python3.6m -c /home/shikhar/Documents/vision-org/torchvision/csrc/vision.cpp -o build/temp.linux-x86_64-3.6/home/shikhar/Documents/vision-org/torchvision/csrc/vision.o -DTORCH_EXTENSION_NAME=_C -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
gcc -pthread -B /home/shikhar/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/shikhar/Documents/vision-org/torchvision/csrc -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include/TH -I/home/shikhar/anaconda3/lib/python3.6/site-packages/torch/lib/include/THC -I/home/shikhar/anaconda3/include/python3.6m -c /home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.6/home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.o -DTORCH_EXTENSION_NAME=_C -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.cpp:226:66: error: macro "AT_ASSERT" passed 2 arguments, but takes just 1
   AT_ASSERT(!input.type().is_cuda(), "input must be a CPU tensor");
                                                                  ^
/home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.cpp:227:64: error: macro "AT_ASSERT" passed 2 arguments, but takes just 1
   AT_ASSERT(!rois.type().is_cuda(), "rois must be a CPU tensor");
                                                                ^
/home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.cpp: In function ‘at::Tensor ROIAlign_forward_cpu(const at::Tensor&, const at::Tensor&, float, int, int, int)’:
/home/shikhar/Documents/vision-org/torchvision/csrc/cpu/ROIAlign_cpu.cpp:226:3: error: ‘AT_ASSERT’ was not declared in this scope
   AT_ASSERT(!input.type().is_cuda(), "input must be a CPU tensor");
   ^
error: command 'gcc' failed with exit status 1

fmassa · 2018-05-24T08:13:41Z

Ok, this is due to a recent change in PyTorch that modified the behavior of AT_ASSERT as you can see in pytorch/pytorch#7104

I believe you can replace AT_ASSERT with AT_CHECK and it should compile.

And sorry for the troubles getting this branch to compile, as you can see PyTorch is evolving quite fast!
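The rename suggested above is mechanical, so it can be scripted. The following is a small sketch rather than an official migration tool: migrate_asserts is a hypothetical helper that rewrites only calls to AT_ASSERT and leaves longer identifiers that merely contain that name alone. The sample lines are the two calls quoted in the compiler errors.

```python
import re

# Match the macro name only when it is a whole identifier followed by "(",
# so a longer name that merely contains AT_ASSERT is left untouched.
PATTERN = re.compile(r"\bAT_ASSERT(?=\s*\()")


def migrate_asserts(source: str) -> str:
    """Return source with every AT_ASSERT(...) call renamed to AT_CHECK(...)."""
    return PATTERN.sub("AT_CHECK", source)


snippet = '''\
  AT_ASSERT(!input.type().is_cuda(), "input must be a CPU tensor");
  AT_ASSERT(!rois.type().is_cuda(), "rois must be a CPU tensor");
'''
print(migrate_asserts(snippet))
```

To apply it to a file, one could read the text with pathlib.Path.read_text, pass it through migrate_asserts, and write it back with write_text, after reviewing the diff.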

xssChauhan · 2018-05-24T17:36:06Z

Thank you. I'll make the changes.
Quite exciting to see PyTorch grow so fast and to learn from it!

xssChauhan · 2018-05-26T12:44:30Z

Hey @fmassa
torchvision is now compiling successfully, but here's the issue that I am facing now:

Even the successful compilation does not seem to add layers or _C modules to torchvision.

Here are the steps that I took:

1. Install PyTorch from source using the master branch, without CUDA support.
2. Replace all occurrences of AT_ASSERT with AT_CHECK in nms_cpu.cpp and ROIAlign_cpu.cpp.
3. Install torchvision from source using the layers branch.

The build process throws no error.
In the shell:

Here are a couple of implementation details:

I am using a conda environment.
The build folder seems to have the compiled extension and the layers module.
I don't know if it is relevant, but installing PyTorch and vision from source has different effects.
I have tried manually copying the contents of the build folder to the environment. The _C extension is thus present in the environment package, but it still does not appear in the terminal.

How can I fix this? I would be grateful for the help.

fmassa · 2018-05-28T20:22:32Z

Hi,

So, I believe copying the _C file should fix all (almost all?) of the issues.
I've installed torchvision using python setup.py build develop, so that we have a symlink to the torchvision directory.

Can you try doing

from torchvision import layers

If that doesn't work, one possibility is that you need to uninstall your previous torchvision installation before installing the new one using python setup.py install.

Could you try doing conda uninstall torchvision, check in your distribution that torchvision is not in the site-packages folder, and then try installing it again using python setup.py build develop?
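One way to see which installation Python actually picks up is to ask the import system where each module would be loaded from. This is a minimal sketch using only the standard library; locate is a hypothetical helper, and the module names are the ones discussed in this thread. Note that resolving a dotted name imports its parent package first.

```python
import importlib.util


def locate(module_name):
    """Return the file a module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return None
    return spec.origin if spec is not None else None


for name in ("torchvision", "torchvision.layers", "torchvision._C"):
    print(f"{name}: {locate(name)}")
```

If the reported path for torchvision points into site-packages rather than the source checkout, an older installation is probably shadowing the new build.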

xssChauhan · 2018-05-31T10:37:17Z

@fmassa My bad. from torchvision import pytorch seemed to do the trick. Thank you.

fmassa · 2018-06-19T11:28:22Z

@xssChauhan is this ready for review? Did you manage to perform the gradcheck?

xssChauhan · 2018-06-20T06:42:54Z

@fmassa The branch is currently not passing the gradcheck. I've been working on finding the issue.

My code is based on Caffe2's and vision's implementations of the same operation.
I recently stumbled upon this implementation that references Caffe2 as well. I am using the same parameters as in the link to run the gradcheck.

varunagrawal · 2018-10-16T22:08:23Z

Since this PR seems to be abandoned, I have taken the liberty to add the backwards pass for CPU in #630.

fmassa · 2018-10-17T09:46:08Z

Closing in favor of #630

Adding ROIAlign backwards for CPU

32e59ae

xssChauhan added 2 commits May 17, 2018 22:36

Adding Batch Size argument

b60fcae

Registering Python interface for ROIAlign_backward_cpu

373b087

Replacing AT_ASSERT with AT_CHECK

763842b

ahirner mentioned this pull request Jul 2, 2018

Assertion macros compatible with pytorch master #540

Merged

fmassa closed this Oct 17, 2018
