
Conversation

@ProGamerGov (Contributor) commented Dec 27, 2020

  • Optionally replace nonlinear MaxPool2d layers with their linear AvgPool2d equivalents. By removing nonlinear operations from models, the expanded weights can be used for direction objectives and other purposes. I also added a few sentences to the tutorial about removing nonlinear operations.

  • Optionally replace nonlinear ReLU / RedirectedReLU layers with empty layers that do nothing (see the layer-swapping sketch after this list).

  • Added info on how to visualize the spatial positions of expanded weights in the weight vis tutorial.

  • Added info to the tutorial on how to find and visualize top neuron connections using expanded weights.

  • Improved some of the weight vis tutorial descriptions per the review in Optim wip - General fixes, SharedImage, & Weight Visualization #543.

  • I also added an expanded weights test which checks that removing nonlinear layers has the correct effect on expanded weights. This also lets me assert the output tensor values.
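As a rough illustration of the layer swapping described above (a minimal sketch only - the replace_layers helper, its signature, and the toy model are hypothetical, not the PR's actual API):

import torch.nn as nn

def replace_layers(model: nn.Module, old_type, new_layer_fn) -> None:
    # Recursively walk child modules, swapping any instance of old_type
    # for the module that new_layer_fn builds from the old instance.
    for name, child in model.named_children():
        if isinstance(child, old_type):
            setattr(model, name, new_layer_fn(child))
        else:
            replace_layers(child, old_type, new_layer_fn)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.MaxPool2d(2))

# Nonlinear MaxPool2d -> linear AvgPool2d with matching settings.
replace_layers(
    model,
    nn.MaxPool2d,
    lambda m: nn.AvgPool2d(m.kernel_size, m.stride, m.padding, ceil_mode=m.ceil_mode),
)
# Nonlinear ReLU -> a passthrough layer that does nothing.
replace_layers(model, nn.ReLU, lambda m: nn.Identity())
print(model)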

* Optionally replace non-linear MaxPool2d layers with their linear AvgPool2d equivalents.

* Added info for how to visualize expanded weights spatial positions in expanded weights / weight vis tutorial.
@ProGamerGov (Contributor, Author)

@NarineK I tried to improve the descriptions in the weight vis tutorial, so it should hopefully be easier to understand and follow now. Let me know if any areas still need improvements!

@ProGamerGov (Contributor, Author) commented Jan 5, 2021

@NarineK So, I was testing the crop function on expanded weights and I discovered that it has an issue with an output shape of (5, 5) and an input shape of (14, 14). This is a bit of a problem, as the mixed4a, b, c, d, & e layers have a size of (14, 14), which becomes (5, 5) once the padding is cropped away. The center crop seems to work with every other input size and output shape combo I tried, so I'm not sure what to do.

import torch
import torch.nn as nn
import torch.nn.functional as F

# x is a 5x5 tensor of ones, inferred from the padded shapes printed below.
x = torch.ones(5, 5)

def crop_x(x, crop_list, padding_offset=True):  # padding_offset is unused in this snippet
    h, w = x.shape
    h_crop = h - int(round((h - crop_list[0]) / 2.0))
    w_crop = w - int(round((w - crop_list[1]) / 2.0))
    return x[h_crop - crop_list[0] : h_crop, w_crop - crop_list[1] : w_crop]

px = F.pad(x, (3, 4, 3, 4), value=float("-inf"))
print(px.shape)
x_out = crop_x(px, [5, 5])
print(x_out)

px = F.pad(x, (4, 5, 4, 5), value=float("-inf"))
print(px.shape)
x_out = crop_x(px, [5, 5])
print(x_out)

px = F.pad(x, (5, 6, 5, 6), value=float("-inf"))
print(px.shape)
x_out = crop_x(px, [5, 5])
print(x_out)

px = F.pad(x, (11, 12, 11, 12), value=float("-inf"))
print(px.shape)
x_out = crop_x(px, [5, 5])
print(x_out)


# Weights and sizing normally wouldn't match this I think,
# so we could ignore this combo.
px = F.pad(x, (5, 4, 4, 5), value=float("-inf"))
print(px.shape)
x_out = crop_x(px, [5, 5])
print(x_out)

And the output of the above code:

torch.Size([12, 12])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
torch.Size([14, 14])
tensor([[1., 1., 1., 1., -inf],
        [1., 1., 1., 1., -inf],
        [1., 1., 1., 1., -inf],
        [1., 1., 1., 1., -inf],
        [-inf, -inf, -inf, -inf, -inf]])
torch.Size([16, 16])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
torch.Size([28, 28])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
torch.Size([14, 14])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [-inf, -inf, -inf, -inf, -inf]])

The issue is evident when visualizing the heatmap of the expanded weights between mixed4 layers and a crop shape of (5,5).

@NarineK (Contributor) commented Jan 5, 2021

@ProGamerGov, that's interesting!
We can fix the first problem if we use math.ceil instead of round:

    h_crop = h - int(math.ceil((h - crop_list[0]) / 2.0))
    w_crop = w - int(math.ceil((w - crop_list[1]) / 2.0))

Center cropping can be challenging in terms of which side to include the endpoint on. We could potentially make that an argument to the center crop function - such as centering towards the left or the right.
What do you think?
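One possible shape for such an argument (a minimal sketch - the offset_left name and the floor/ceil split are assumptions, not the code that ended up in the PR):

import math
import torch
import torch.nn.functional as F

def center_crop(x, crop_list, offset_left=False):
    # When the total amount to crop away is odd, offset_left picks which
    # side the crop window shifts towards; when it is even, ceil == floor,
    # so no unnecessary +/- 1 is applied to either side.
    h, w = x.shape
    if offset_left:
        h_crop = h - int(math.ceil((h - crop_list[0]) / 2.0))
        w_crop = w - int(math.ceil((w - crop_list[1]) / 2.0))
    else:
        h_crop = h - (h - crop_list[0]) // 2
        w_crop = w - (w - crop_list[1]) // 2
    return x[h_crop - crop_list[0] : h_crop, w_crop - crop_list[1] : w_crop]

x = torch.ones(5, 5)
px = F.pad(x, (4, 5, 4, 5), value=float("-inf"))
print(center_crop(px, [5, 5], offset_left=True))  # recovers the (5, 5) block of ones

With the failing (14, 14) case above, offset_left=True recovers the original block, while offset_left=False reproduces the off-by-one crop into the padding.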

@ProGamerGov (Contributor, Author)

@NarineK Thanks for figuring out the fix! And yeah, I definitely think that we should make it so that users can choose which side to crop to! I think that would be as simple as optionally subtracting or adding one to the existing indices? Or were you thinking of a better way of doing it?

@NarineK (Contributor) commented Jan 6, 2021


Yes, if the cropped sides are not equal. If they are equal and the crop falls evenly across both sides, then we want to make sure that we don't apply an unnecessary +/- 1 to one side or the other.

@ProGamerGov (Contributor, Author)

@NarineK I've added the new center crop parameter for dealing with unequal sides!

@ProGamerGov (Contributor, Author) commented Jan 8, 2021

@NarineK I added these lines to the tutorial to prevent the PyTorch user warnings from appearing:

import warnings
warnings.filterwarnings("ignore", category=UserWarning, module='torch')

Those lines should make it so that we don't have to remove the warnings every time we update a notebook.

@NarineK (Contributor) commented Jan 9, 2021


I see - it would be good not to always hide the warnings when we run the notebook, in case there are important warnings that we need to be aware of. It would make sense to add that just before we push the code into the codebase.

@ProGamerGov (Contributor, Author) commented Jan 9, 2021

@NarineK Ah, okay! I've removed the import. I'll re-add it and then remove it before uploading notebook updates in the future.

@ProGamerGov (Contributor, Author)

@NarineK The multi-GPU test seems to be broken currently, as @greentfrapp ran into the same errors.

@NarineK (Contributor) left a comment

Thank you for working on this PR, @ProGamerGov! Looks great! Here are a couple of suggestions.

  1. It would be good to add a bit more description at the beginning of the tutorial. The first 1-2 sentences from the Extracting expanded weights section could be a good fit there too, in addition to describing a bit about NMF, etc.
  2. visualize_activations and vis_multi share code that I think could be combined. Also, vis_multi is not a very clear name; maybe we can give it a clearer one. Some of the args have type hints and some do not - it would be good to be consistent and have them everywhere or remove them everywhere.
  3. get_expanded_weights sounds a bit like a getter function, but that function does more than a get. We could probably call it extract_expanded_weights or another name of your choice.
  4. Here you might also want to describe in the text what y and x are: [target2 output channels, target1 output channels, y, x]. You can also describe that the circuits are the grads of target2's center activation w.r.t. target1, etc. (see the gradient sketch after this list).
  5. vis_neuron_direction seems to be a copy of vis_multi for a different loss, where channel is replaced with vec. We can combine these functions into one visualize_activations function, or at least keep the transformation as a shared definition.
  6. For the section Visualizing the spatial positions of expanded weights, I think you can probably visualize show(W_3a_3b_hm) next to it for easy comparison. Are we getting results identical to what OpenAI has in their implementation?
  7. Do you have documentation or a citation on weight banding?
  8. In terms of weight banding, I see that you access the conv layer but not pooling. Don't we need to call circuits for W_5b_c5x5 and the pooling layers? You get the pooling-layer-related circuits in the cell below (W_p2_3a = optimviz ...), but they are not used in the cell above. I might be missing something here.
  9. I think that we can abstract this piece out into a function and reuse it:

     A = []
     for i in highlow_units:
         x_out = vis_multi(model, model.conv3, i)
         A.append(x_out.detach())

     grid_img = torchvision.utils.make_grid(torch.cat(A), nrow=5)
     show(grid_img)
  10. nit: Can we, please, also describe where we are getting the highlow_units and bw indices from?
  11. reducer.components - are we setting the components in ChannelReducer? https://github.com/pytorch/captum/blob/ba076856d14a4c85f5903e371b272ac20b293c07/captum/optim/_utils/reducer.py#L17
  12. In the section Multiple related neurons with a small number of factors, I'd add more documentation describing the components and the semantics, plus a couple more sentences about the highlights of the visualizations.
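To make point 4 concrete, here is a toy sketch of the circuits-as-grads idea (an illustration with made-up layers and shapes, not the actual get_expanded_weights implementation):

import torch
import torch.nn as nn

# Two adjacent conv layers standing in for target1 and target2.
target1 = nn.Conv2d(3, 4, 3, padding=1)
target2 = nn.Conv2d(4, 6, 3, padding=1)

x = torch.randn(1, 3, 8, 8)
a1 = target1(x)
a2 = target2(a1)

# The gradient of each target2 channel's center activation w.r.t. target1's
# activations is one slice of the expanded weight tensor.
cy, cx = a2.shape[2] // 2, a2.shape[3] // 2
W = torch.stack([
    torch.autograd.grad(a2[0, ch, cy, cx], a1, retain_graph=True)[0][0]
    for ch in range(a2.shape[1])
])
# Shape: [target2 output channels, target1 output channels, height, width]
print(W.shape)  # torch.Size([6, 4, 8, 8])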

* Also added a missing type hint & updated citation.
@ProGamerGov (Contributor, Author) commented Jan 25, 2021

@NarineK

  1. Currently working on improving the descriptions based on your feedback, though it might be better to do it in a future PR, as I can't think of how to word it at the moment.

  2. visualize_activations (now called vis_neuron_large) only shares two lines of code plus two transforms with vis_multi (now known as vis_neuron), so it could be better to just leave them separate. I've also added type hints!

  3. The Lucid version of get_expanded_weights is also called get_expanded_weights, but I can change the name if you think that a different name would be better.

  4. I've replaced y & x with height & width. Also, can you elaborate more on that explanation of circuits?

  5. I've merged vis_neuron_direction and vis_multi into a new function called vis_neuron.

  6. I've added show(W_3a_3b_hm) to the spatial visualization cell. And the weights should be the same as the Lucid / OpenAI version.

  7. I've added info about the upcoming weight banding paper.

  8. For weight banding we are looking directly at the non-expanded weights. I've added some words to the tutorial that clarify this.

  9. Okay, I've turned it into a function that's reused across the tutorial!

  10. I've added links to where the neurons come from.

  11. The reducer.components variable is the factorization matrix created during decomposition when the channel reducer is run (see the sketch after this list): https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

  12. I've improved the documentation, so hopefully everything makes more sense now!
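For item 11 above, a minimal sketch of what reducer.components corresponds to, using sklearn's NMF directly (the fake activation data, and the assumption that ChannelReducer wraps the decomposition this way, are illustrative only):

import numpy as np
from sklearn.decomposition import NMF

# Fake nonnegative activations: (samples, channels).
activations = np.abs(np.random.randn(100, 32))

nmf = NMF(n_components=6, init="random", random_state=0)
factors = nmf.fit_transform(activations)  # (100, 6) per-sample factors
components = nmf.components_  # (6, 32) factorization matrix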

@ProGamerGov (Contributor, Author) commented Jan 25, 2021

@NarineK If there are no more major issues with the notebook then you can merge this PR whenever you want!

@NarineK (Contributor) commented Jan 26, 2021


Thank you for addressing all the comments, @ProGamerGov!

  1. Regarding 1: Sure, we can do it in a separate PR. I was thinking that you could summarize what you are doing in the notebook at the top of the notebook - something like an abstract.
  2. I understand that you wanted to be consistent with Lucid, but I think that from an engineering perspective extract_expanded_weights or compute_expanded_weights is clearer. Unless you want to keep it identical to Lucid - in that case I'm fine with it too.

@ProGamerGov (Contributor, Author) commented Jan 26, 2021

@NarineK

  1. I've added a basic summary of what we are doing in the notebook!

  2. Okay, I've changed get_expanded_weights to extract_expanded_weights!

  3. I've added the improved output printing for the top channels!

  4. I'm not quite sure, but the current version of the test seems to work? We can change it in a future PR if need be.

@NarineK (Contributor) commented Jan 26, 2021


Thank you for addressing the comments, @ProGamerGov! Regarding 4: did you try adding the return statement, and it failed? I think there is currently a bug in the recursion; it might be working now, but if we want to use the function in a more general context we will see issues.

@ProGamerGov (Contributor, Author)

@NarineK I'm not quite sure what you mean? There are two return statements, and the recursion stops when it finds the target instance type:

def _check_layer_in_model(
    self,
    model: torch.nn.Module,
    layer: Type[torch.nn.Module],
) -> None:
    def check_for_layer_in_model(model, layer) -> bool:
        for name, child in model._modules.items():
            if child is not None:
                if isinstance(child, layer):
                    return True
                check_for_layer_in_model(child, layer)
        return False

    self.assertTrue(check_for_layer_in_model(model, layer))

@NarineK (Contributor) commented Jan 26, 2021


@ProGamerGov, I meant adding a return in front of the recursive call, i.e. return check_for_layer_in_model(child, layer, in_model).

@ProGamerGov (Contributor, Author) commented Jan 26, 2021

@NarineK Ah, thank you! I made the change and the return causes the test to fail.

tests/optim/models/test_models.py:30: in _check_layer_in_model
    self.assertTrue(check_for_layer_in_model(model, layer))
E   AssertionError: False is not true
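For what it's worth, a bare return check_for_layer_in_model(child, layer) ends the loop at the first child whether or not anything matched, which would explain the failure above. A sketch of a recursion that propagates the result while still scanning the remaining children (an illustration, not the code that landed in the PR):

def check_for_layer_in_model(model, layer) -> bool:
    # Return True as soon as any descendant matches; keep scanning
    # sibling modules when a subtree comes back False.
    for child in model._modules.values():
        if child is not None:
            if isinstance(child, layer):
                return True
            if check_for_layer_in_model(child, layer):
                return True
    return False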

@ProGamerGov (Contributor, Author)

@NarineK I added the GPU test fix, and now the GPU test is working again!

@NarineK merged commit 99f872c into meta-pytorch:optim-wip on Jan 26, 2021
@ProGamerGov (Contributor, Author) commented Jan 29, 2021

@NarineK I tested the _check_layer_in_model function, and it does go through every layer in the model in its current form. So it looks like it's able to reach the inner modules.

@NarineK (Contributor) commented Jan 30, 2021


Thank you for testing it, @ProGamerGov!
It fails, for example, in this use case:

class SubModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.r = nn.ReLU()

    def forward(self, x):
        return self.r(x)

class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(3, 4)
        self.sub = SubModule()

    def forward(self, x):
        return self.l1(self.sub(x))

Here, _check_layer_in_model(self, MyModel(), nn.ReLU) fails.
This will happen whenever we have custom submodules.
