Make the pointwise pooling code more modular #87
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
While trying to write a Fourier max pooling layer, I realized that a lot of the general pooling functionality I needed was already implemented by the various pointwise pooling layers, but unfortunately not in a way that could be used by outside classes. To make the pooling code easier to extend, and to reduce code duplication, I made the following changes:
I moved the Gaussian blurring functionality (used by the "antialiased" layers) into it's own standalone module. This eliminates a significant amount of existing code duplication between the
PointwiseAvgPoolAntialiased2D
,PointwiseAvgPoolAntialiased3D
, and_PointwiseMaxPoolAntialiasedND
classes, and makes it easier to create new pooling layers that involve Gaussian blurring (which is important for maintaining translational equivariance).I consolidated the logic in the pointwise
evaluate_output_shape()
methods into a single standalone function. This again eliminates a significant amount code duplication, this time between pretty much every pointwise pooling class, and again makes it easier to create new pooling layers. 1I made an effort to implement
forward()
using the exact same object returned byexport()
, to the extent possible. This just helps ensure that the exported model doesn't behave differently from the original model. I also implemented theexport()
method for the antialiased pooling layers.I reduced the number of pointwise pooling base classes to three:
_PointwisePoolND
: for directly applying a pytorch pooling layer to aGeometricTensor
._PointwiseAvgPoolAntialiasedND
: for applying a Gaussian blur._PointwiseMaxPoolAntialiasedND
: for applying a max pool followed by a Gaussian blur.This makes the inheritance hierarchy easier to understand, and eliminates some code duplication between mostly redundant base classes. Note that each of these new base classes pools in a fundamentally different way. A key to being able to architect the code like this is the use of the standalone modules/functions described above, rather than inheritance, to share functionality between the base classes.
To guarantee that this PR does not change any behavior of any of the refactored pooling layers, I added a number of additional tests. These tests fall into two categories:
Starting with a known input tensor, make sure that every element of the output tensor is exactly what it should be. In other words, check that the pooling operation is calculated correctly. These tests are very rigorous, but were tedious to write, because for each I had to calculate the expected result by hand. As a result, I don't test large tensors (i.e. batch and channel dimensions of 1, spatial dimensions of ≈5) and I only test one reasonable/representative set of input parameters for each type of pointwise pooling layer.
Check that every input parameter has the expected result on the shape of the output tensor. While not as rigorous as the above tests, these tests are still good, because the primary effect of most of the input parameters is to change the size of the output. Plus, it's easy to manually calculate expected output sizes for many input parameter variations.
I confirmed that the new tests (and the old tests, which mostly cover equivariance) pass both for the existing code and for this PR. This makes me confident that the proposed changes do not change any existing behavior, which is important.
Minor comments:
ceil_mode
argument that doesn't do anything. This argument affects whether some integer expression divided by the stride is rounded up or down when calculating the output size for a pooling layer. In the antialiased pooling layers, the actual pooling is always done with stride=1, in which case the above quotient is always an integer and rounding has no effect. In the name of backwards compatibility, I kept the argument and added a warning if the user actually specifies it. Eventually, I think this argument should be removed.Footnotes
Although as an aside, I don't really like that every equivariant module is forced to calculate an output size. This isn't always easy to do, and as far as I can tell nothing in the code base actually uses this information, so it always feels like a waste of time. ↩