Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the pointwise pooling code more modular #87

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

kalekundert
Copy link
Contributor

While trying to write a Fourier max pooling layer, I realized that a lot of the general pooling functionality I needed was already implemented by the various pointwise pooling layers, but unfortunately not in a way that could be used by outside classes. To make the pooling code easier to extend, and to reduce code duplication, I made the following changes:

  • I moved the Gaussian blurring functionality (used by the "antialiased" layers) into it's own standalone module. This eliminates a significant amount of existing code duplication between the PointwiseAvgPoolAntialiased2D, PointwiseAvgPoolAntialiased3D, and _PointwiseMaxPoolAntialiasedND classes, and makes it easier to create new pooling layers that involve Gaussian blurring (which is important for maintaining translational equivariance).

  • I consolidated the logic in the pointwise evaluate_output_shape() methods into a single standalone function. This again eliminates a significant amount code duplication, this time between pretty much every pointwise pooling class, and again makes it easier to create new pooling layers. 1

  • I made an effort to implement forward() using the exact same object returned by export(), to the extent possible. This just helps ensure that the exported model doesn't behave differently from the original model. I also implemented the export() method for the antialiased pooling layers.

  • I reduced the number of pointwise pooling base classes to three:

    • _PointwisePoolND: for directly applying a pytorch pooling layer to a GeometricTensor.
    • _PointwiseAvgPoolAntialiasedND: for applying a Gaussian blur.
    • _PointwiseMaxPoolAntialiasedND: for applying a max pool followed by a Gaussian blur.

    This makes the inheritance hierarchy easier to understand, and eliminates some code duplication between mostly redundant base classes. Note that each of these new base classes pools in a fundamentally different way. A key to being able to architect the code like this is the use of the standalone modules/functions described above, rather than inheritance, to share functionality between the base classes.

To guarantee that this PR does not change any behavior of any of the refactored pooling layers, I added a number of additional tests. These tests fall into two categories:

  • Starting with a known input tensor, make sure that every element of the output tensor is exactly what it should be. In other words, check that the pooling operation is calculated correctly. These tests are very rigorous, but were tedious to write, because for each I had to calculate the expected result by hand. As a result, I don't test large tensors (i.e. batch and channel dimensions of 1, spatial dimensions of ≈5) and I only test one reasonable/representative set of input parameters for each type of pointwise pooling layer.

  • Check that every input parameter has the expected result on the shape of the output tensor. While not as rigorous as the above tests, these tests are still good, because the primary effect of most of the input parameters is to change the size of the output. Plus, it's easy to manually calculate expected output sizes for many input parameter variations.

I confirmed that the new tests (and the old tests, which mostly cover equivariance) pass both for the existing code and for this PR. This makes me confident that the proposed changes do not change any existing behavior, which is important.

Minor comments:

  • The antialiased max pooling layers have a ceil_mode argument that doesn't do anything. This argument affects whether some integer expression divided by the stride is rounded up or down when calculating the output size for a pooling layer. In the antialiased pooling layers, the actual pooling is always done with stride=1, in which case the above quotient is always an integer and rounding has no effect. In the name of backwards compatibility, I kept the argument and added a warning if the user actually specifies it. Eventually, I think this argument should be removed.

Footnotes

  1. Although as an aside, I don't really like that every equivariant module is forced to calculate an output size. This isn't always easy to do, and as far as I can tell nothing in the code base actually uses this information, so it always feels like a waste of time.

- Move the Gaussian blurring functionality (used by the "antialiased"
  layers) into it's own module.

- Consolidate the logic in all of the pointwise `evaluate_output_shape()`
  methods into a single standalone function.

- Implement `forward()` using the exact same object returned by
  `export()`, to the extent possible.

Signed-off-by: Kale Kundert <kale@thekunderts.net>
Signed-off-by: Kale Kundert <kale@thekunderts.net>
Signed-off-by: Kale Kundert <kale@thekunderts.net>
Signed-off-by: Kale Kundert <kale@thekunderts.net>
@kalekundert kalekundert force-pushed the feat-pool branch 2 times, most recently from 33a5d8f to 0d230db Compare November 21, 2023 22:58
Signed-off-by: Kale Kundert <kale@thekunderts.net>
Signed-off-by: Kale Kundert <kale@thekunderts.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant