Crop layer for automatically aligning computations #1976

Closed
wants to merge 5 commits into from

shelhamer
Member

master edition of #1639 -- thanks to a rebase by @philkr. After #1974 and #1975.

Existing layers shift and warp coordinate space: translation by padding (or lack thereof), contraction by strided convolution or pooling, and expansion by strided deconvolution (#1615). Often one wants to align two blobs, e.g., to establish a correspondence between input and output, or to fuse two different paths of computation. Counting conv/deconv strides to ensure that blob coordinates have the same scale is generally straightforward. Computing the offset between two blobs that results from intermediate padding and kernel sizes is trickier.
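The scale and offset bookkeeping described above can be sketched as composition of 1-D affine maps. This is only an illustration, not the code from this PR: the helper names (`conv_map`, `deconv_map`, `inverse`, `compose`) and the convention that a map takes a top coordinate to the centre of its receptive field in the bottom are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical convention: a map is (scale, offset), meaning
#   x_bottom = scale * x_top + offset
# where x_bottom is the centre of the receptive field of x_top.

def conv_map(kernel, stride=1, pad=0):
    """Top -> bottom map for a convolution (pooling has the same geometry)."""
    return (Fraction(stride), Fraction(kernel - 1, 2) - pad)

def inverse(m):
    """Invert x -> scale * x + offset."""
    scale, offset = m
    return (1 / scale, -offset / scale)

def deconv_map(kernel, stride=1, pad=0):
    """A deconvolution is treated as the inverse of the matching convolution."""
    return inverse(conv_map(kernel, stride, pad))

def compose(outer, inner):
    """Return the map x -> outer(inner(x))."""
    a1, b1 = outer
    a2, b2 = inner
    return (a1 * a2, a1 * b2 + b1)

# Example chain: data -> conv (k=5) -> pool (k=2, s=2) -> deconv (k=4, s=2).
down = compose(conv_map(5), conv_map(2, 2))   # pool top -> data
total = compose(down, deconv_map(4, 2))       # deconv top -> data
print(total)
```

In this example the strides cancel (scale 1) and the remaining offset is a whole number, which is the situation in which cropping one blob to the other makes sense.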

This layer takes two bottom blobs and produces one top, which is a copy of the first bottom cropped to the size of the second so that coordinates exactly correspond, i.e., it makes sense to fuse or compare the top blob with the second bottom, regardless of whatever padding or other shenanigans took place between their computation.

This is done by computing the coordinate mapping between the two bottom blobs, as provided by #1637 and made accessible by #1638. If that mapping is a simple translation, and has the right sign to allow the first blob to be "cropped to" the second, the layer simply performs the copy. If the mapping is not an integer translation, or the translation has the wrong sign, an error is thrown, and the net may be rearranged to allow sensible fusion.
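A minimal sketch of that check, assuming the mapping along one axis is given as a `(scale, offset)` pair with `x_first = scale * x_second + offset`. The function names and the use of `ValueError` are hypothetical and only stand in for whatever the layer actually does when it rejects a mapping.

```python
from fractions import Fraction

def crop_offset(scale, offset, first_size, second_size):
    """Return the integer crop start along one axis, or raise if cropping is impossible.

    Assumes x_first = scale * x_second + offset (hypothetical convention).
    """
    scale, offset = Fraction(scale), Fraction(offset)
    if scale != 1:
        raise ValueError("blobs differ in scale; mapping is not a translation")
    if offset.denominator != 1:
        raise ValueError("translation is not an integer number of pixels")
    start = int(offset)
    if start < 0 or start + second_size > first_size:
        raise ValueError("translation has the wrong sign or the crop runs out of bounds")
    return start

def crop_1d(values, start, size):
    """Copy `size` elements of the first blob, beginning at `start`."""
    return values[start:start + size]

# A first blob of length 12 cropped to a second blob of length 10, offset 1.
start = crop_offset(1, 1, first_size=12, second_size=10)
cropped = crop_1d(list(range(12)), start, 10)
```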

The implementation of LayerSetUp amounts to some simple graph traversal to find the path connecting the two bottoms. Currently Net does not provide great facilities for traversing the layer graph, so it's a bit cumbersome; maybe this can be improved in the future.
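The traversal can be pictured roughly as follows. This is a sketch under assumptions, not the LayerSetUp code: the net is represented by a hypothetical dictionary `bottoms_of` that maps each blob name to the blobs it was computed from, and the two bottoms are connected through their nearest common ancestor.

```python
from collections import deque

def ancestors_with_paths(blob, bottoms_of):
    """Breadth-first walk from `blob` toward the inputs.

    Returns {ancestor: path from `blob` up to that ancestor}, `blob` included.
    """
    paths = {blob: [blob]}
    queue = deque([blob])
    while queue:
        current = queue.popleft()
        for parent in bottoms_of.get(current, []):
            if parent not in paths:
                paths[parent] = paths[current] + [parent]
                queue.append(parent)
    return paths

def connecting_path(a, b, bottoms_of):
    """Return (path from a up to the meeting blob, path from b up to it)."""
    up_a = ancestors_with_paths(a, bottoms_of)
    up_b = ancestors_with_paths(b, bottoms_of)
    common = [blob for blob in up_a if blob in up_b]
    if not common:
        raise ValueError("blobs %r and %r share no ancestor" % (a, b))
    # Pick the shared ancestor that gives the shortest combined path.
    meet = min(common, key=lambda blob: len(up_a[blob]) + len(up_b[blob]))
    return up_a[meet], up_b[meet]

# Hypothetical toy net: data -> conv1 -> pool1 -> upscore.
bottoms_of = {"conv1": ["data"], "pool1": ["conv1"], "upscore": ["pool1"]}
path_a, path_b = connecting_path("upscore", "data", bottoms_of)
```

Composing the per-layer coordinate maps along the two returned paths would then give the mapping between the two bottoms.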

There is a bit of engineering involved in these three PRs, but the result is pretty convenient: what was before a tricky offline calculation becomes a trivial layer specification.

Another way to implement this, without #1974, would be to remove the graph traversal from CropLayer, giving it a simple parameter instead, and provide some other mechanism for automatically filling in that parameter.

Currently CPU and GPU implementations are provided (the GPU one trivially), but tests and documentation are not.

shelhamer added a commit to shelhamer/caffe that referenced this pull request Feb 26, 2015
Crop layer for automatically aligning computations

* shelhamer/crop-layer:
  add CropLayer for cropping one blob to another using induced coordinates
  layers get a pointer back to their owning Net
  implement coord_map for all applicable layers
  add FilterMap for the coord mapping used by (de)conv and pooling layers
  add util/coords.hpp for coordinate mapping functions
@shelhamer shelhamer mentioned this pull request Mar 10, 2015
longjon added a commit to longjon/caffe that referenced this pull request Mar 10, 2015
Crop layer for automatically aligning computations
weiliu89 added a commit to weiliu89/caffe that referenced this pull request Apr 1, 2015
Crop layer for automatically aligning computations
elleryrussell pushed a commit to elleryrussell/caffe that referenced this pull request May 1, 2015
Crop layer for automatically aligning computations
elleryrussell pushed a commit to elleryrussell/caffe that referenced this pull request Jul 3, 2015
Crop layer for automatically aligning computations

Conflicts:
	include/caffe/vision_layers.hpp
twerdster pushed a commit to twerdster/caffe that referenced this pull request Jul 19, 2015
Crop layer for automatically aligning computations
@ctensmeyer

This is a great thing. I have one suggestion about specifying the operation. The actual data of one of the bottom blobs is not needed; you only need its shape and the coordinate mapping so you can perform the crop. Instead of specifying the "crop like this" blob as a bottom blob, it could instead be referenced by name in another field in a new CropLayerParameter protobuf message. This way, we avoid introducing a split layer (and the additional data copying) that would occur with making it a bottom blob. This reduces GPU memory usage and could allow larger networks to use this.

@longjon
Contributor

longjon commented Sep 26, 2015

For those watching, this is due for an update with a less intrusive version (like the "another way" above) that takes advantage of net spec, after which I think it'll be ready for merge.

@waldol1, yes, that's a good point about the extra allocation. Unfortunately you can't really specify a layer name in a parameter without some extra mechanism, since that breaks the layer abstraction: a layer has no way to access other layers except through its top and bottom links. (The backpointer to its net is present here, but it is removed in the "less intrusive version" above.) Let's think more about the right way to address that!

@mohomran
Contributor

@longjon: Does the less intrusive version already exist in some form? I'm willing to pitch in either way to make this ready for merge.

LowikC pushed a commit to LowikC/caffe that referenced this pull request Dec 31, 2015
Crop layer for automatically aligning computations

# Conflicts:
#	include/caffe/common_layers.hpp
#	include/caffe/layer.hpp
#	include/caffe/neuron_layers.hpp
#	include/caffe/vision_layers.hpp
#	src/caffe/net.cpp
@BlGene
Contributor

BlGene commented Jan 15, 2016

@longjon Do you know when the netspec version of this PR can be committed?

In lieu of this, I did a naive rebase of this PR here, which segfaults for some reason 😬, probably due to the now-different sharing of layers from root.

I didn't look at this too closely, but for the less intrusive version I guess the idea is for the crop layer to have access to the other layers through the net's layer-sharing functionality (what's the right name for this?). Then the only hurdle would be that the variables that determine the transformation are stored in different formats (kernel_shape_ as a Blob in conv and deconv, vs. kernel_h_, kernel_w_ in pool). This would mean either (a) storing all transformations in the same format, or (b) special code in the crop layer to know where it should look. This is pure conjecture.

BR, Max

@longjon
Contributor

longjon commented Jan 15, 2016

Sorry for the wait folks; I hope to have an update on this next week.

You may want to take a look at my most recently rebased version (still fairly out-of-date, but less than this PR) at https://github.com/longjon/caffe/tree/future.

@BlGene: not really, the new way I prefer to do this is:

  1. All magic is removed from the Crop layer, which takes crop values as normal parameters.
  2. The crop amounts are computed ahead-of-time by Python code that reads the net graph.

This moves the magic from inside a layer to outside the net, preserving the layers of abstraction without hindering future functionality; as a bonus it makes the crop values discoverable to the user, and can be adapted for other features that rely on computing the coordinate maps. More details to come!
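To make the split concrete, here is a rough sketch of the "no magic" half: a crop that takes its offsets and output size as plain parameters. It uses nested Python lists in N x C x H x W order purely for illustration; the function name and argument names are hypothetical and are not the layer's actual parameters.

```python
def crop_nchw(blob, offset_h, offset_w, out_h, out_w):
    """Crop the spatial axes of an N x C x H x W nested list.

    The offsets are ordinary parameters, assumed to have been computed
    ahead of time by code that reads the net graph.
    """
    in_h = len(blob[0][0])
    in_w = len(blob[0][0][0])
    if offset_h < 0 or offset_w < 0:
        raise ValueError("offsets must be non-negative")
    if offset_h + out_h > in_h or offset_w + out_w > in_w:
        raise ValueError("crop window extends past the blob")
    return [
        [
            [row[offset_w:offset_w + out_w]
             for row in channel[offset_h:offset_h + out_h]]
            for channel in image
        ]
        for image in blob
    ]

# One image, one channel, 4 x 4 pixels holding the values 0..15 row by row.
blob = [[[[r * 4 + c for c in range(4)] for r in range(4)]]]
centre = crop_nchw(blob, offset_h=1, offset_w=1, out_h=2, out_w=2)
```

Because the offsets are explicit, a user can read them straight off the layer definition instead of having to re-derive them.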

@BlGene
Contributor

BlGene commented Jan 18, 2016

@longjon

Yes, that would be even less invasive. I updated my rebase so that the crop layer should now work this way. The layer only works for 4D blobs, though in theory one could extend it.

The next step would be to write a demo Python script that calculates appropriate crop offset parameters, for example for the FCN net.

BR, Max

@shelhamer
Member Author

Replaced by N-D crop in #3570.

@shelhamer shelhamer closed this Feb 27, 2016