
Add ScalarLayer to multiply two Blobs with broadcasting #3021

Closed

Conversation

jeffdonahue
Contributor

This adds ScalarLayer, which takes two Blobs and (in effect) multiplies them elementwise, after broadcasting the axes of the second Blob to match the first as necessary.

For example, if bottom[0] has shape (2, 3, 4, 5) and bottom[1] has shape (3, 4) and axis == 1, then the computation of this layer is equivalent to reshaping bottom[1] to (1, 3, 4, 1), then tiling it to (2, 3, 4, 5), then multiplying the result elementwise with bottom[0].
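For concreteness, here is a rough NumPy sketch of that forward semantics (illustrative only, not the layer's actual C++/CUDA code; the function name and interface are made up):

```python
import numpy as np

# Illustrative sketch of ScalarLayer's forward semantics: pad bottom[1]'s
# shape with singleton axes so it lines up with bottom[0] starting at
# `axis`, then let broadcasting do the "tiling" before the elementwise
# multiply.
def scalar_forward(bottom0, bottom1, axis):
    shape = [1] * bottom0.ndim
    shape[axis:axis + bottom1.ndim] = bottom1.shape
    return bottom0 * bottom1.reshape(shape)

b0 = np.random.randn(2, 3, 4, 5)
b1 = np.random.randn(3, 4)
top = scalar_forward(b0, b1, axis=1)  # same result as reshape to (1, 3, 4, 1), tile, multiply
```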

In the most general case, Backward to bottom[1] is accomplished with elementwise multiplication followed by 2 gemvs. For special cases (when bottom[1]'s shape corresponds to the beginning or end of bottom[0]'s shape, e.g. if it were instead shape (2, 3) with axis == 0 or shape (4, 5) with axis == 2), one or both of the gemvs are skipped (or replaced with a dot product).
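In the same illustrative NumPy style (again a sketch, not the layer's implementation), the gradient with respect to bottom[1] is the elementwise product of the top diff and bottom[0], summed over the broadcast axes; the gemvs and dot products are just efficient ways of performing those reductions:

```python
import numpy as np

# Sketch of Backward to bottom[1]: multiply top_diff by bottom[0]
# elementwise, then reduce over the leading axes (before `axis`) and the
# trailing axes (after axis + bottom1.ndim) that were broadcast in Forward.
def scalar_backward_bottom1(top_diff, bottom0, bottom1_ndim, axis):
    prod = top_diff * bottom0
    reduce_axes = tuple(range(axis)) + tuple(range(axis + bottom1_ndim, bottom0.ndim))
    return prod.sum(axis=reduce_axes)

b0 = np.random.randn(2, 3, 4, 5)
top_diff = np.random.randn(2, 3, 4, 5)
grad_b1 = scalar_backward_bottom1(top_diff, b0, bottom1_ndim=2, axis=1)  # shape (3, 4)
```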

My use case for this comes from #2033 -- I am replacing the hacky coeff_blob I added to Eltwise to perform the binary multiplications with this layer. It could also replace the channel-wise scalar in PReLU (I think this backward implementation is faster), or be used to learn a channel-wise scalar after batch normalization.

Thanks to @longjon for the name for this layer and initial implementation of a previous version.

@jeffdonahue
Contributor Author

I've updated this with some cleanup in the original commit, and two extra commits. The first (0437545) makes the layer learn the scalar as a parameter if only one bottom is specified. The second (4199be6) makes it support in-place computation, which necessitates(?) adding a temp storage blob of the input size and copying the input in forward for use in backward. Happy to squash if desired (though the commits are cleanly separated).
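For what it's worth, a minimal NumPy-style sketch of the in-place case (assumed from the description above, not the actual implementation): forward has to stash a copy of the input because the in-place multiply overwrites the very values that Backward to the scalar needs:

```python
import numpy as np

def forward_inplace(data, scalar, axis, temp):
    # Save the input first: the in-place multiply below destroys the
    # original bottom values that backward_scalar_inplace relies on.
    temp[...] = data
    shape = [1] * data.ndim
    shape[axis:axis + scalar.ndim] = scalar.shape
    data *= scalar.reshape(shape)

def backward_scalar_inplace(top_diff, temp, scalar, axis):
    # Same reduction as before, but over the saved copy of the input.
    reduce_axes = tuple(range(axis)) + tuple(range(axis + scalar.ndim, temp.ndim))
    return (top_diff * temp).sum(axis=reduce_axes)
```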

@jeffdonahue force-pushed the scalar-layer branch 2 times, most recently from aa023e6 to 6fb602a on December 27, 2015.
@jeffdonahue
Contributor Author

Replaced by #3591
