
Optim-wip: Improve loss objective testing coverage #951


Open · wants to merge 184 commits into base: optim-wip

Conversation


@ProGamerGov ProGamerGov commented May 21, 2022

This PR does the following:

  • Ensures loss testing coverage is as high as possible.
  • Simplifies loss composition code with the new rmodule_op function (see the sketch after this list).
  • Removes the NumPy import from loss testing.
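
The PR description does not spell out the `rmodule_op` signature; the following is a hypothetical sketch of how a single reflected-operator helper can replace per-operator composition code. The `Loss` and `ComposedLoss` classes and argument names below are illustrative assumptions, not Captum's actual implementation:

```python
import operator
from typing import Callable, Union

import torch


class Loss:
    """Minimal stand-in for a composable loss objective (hypothetical)."""

    def __call__(self, targets_to_values: dict) -> torch.Tensor:
        raise NotImplementedError

    # Reflected operators (e.g. `1.0 - loss`) can all route through one helper.
    def __radd__(self, other):
        return rmodule_op(self, other, operator.add)

    def __rsub__(self, other):
        return rmodule_op(self, other, operator.sub)

    def __rmul__(self, other):
        return rmodule_op(self, other, operator.mul)


class ComposedLoss(Loss):
    """Wraps a callable produced by composing a loss with a scalar."""

    def __init__(self, compose_fn: Callable[[dict], torch.Tensor]) -> None:
        self.compose_fn = compose_fn

    def __call__(self, targets_to_values: dict) -> torch.Tensor:
        return self.compose_fn(targets_to_values)


def rmodule_op(loss: Loss, other: Union[int, float], math_op: Callable) -> Loss:
    """Hypothetical helper: build a loss that applies `math_op` with the scalar
    on the left-hand side, e.g. `other - loss(...)` for `__rsub__`."""
    return ComposedLoss(lambda t: math_op(torch.as_tensor(other), loss(t)))
```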

ProGamerGov and others added 16 commits April 26, 2022 13:06
Summary:
This deploys pyfmt with usort 1.0 and the new import merging behavior.

Facebook
This is part of the final rollout, announced here:
https://fb.workplace.com/groups/pyfmt/posts/1011066416197541/

Preemptive SEV: S271899

Hand rolled on devserver and laptops, with binaries hosted on manifold
bucket `pyfi_wheels`.

Couldn't use MSDK bump due to issue with make_par on sandcastle Macs:
https://fb.workplace.com/groups/fbpython/posts/7503431436364825/

pokemon_lift

Reviewed By: zertosh

Differential Revision: D36394396

fbshipit-source-id: 7cee2a05261e3281fe86360cdb2faa62df1d9a4e
Summary:
Applies new import merging and sorting from µsort v1.0.

When merging imports, µsort will make a best-effort to move associated
comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
not produce any dangerous runtime changes, but may require touch-ups to
satisfy linters and other tooling.

Note that µsort uses case-insensitive, lexicographical sorting, which
results in a different ordering compared to isort. This provides a more
consistent sorting order, matching the case-insensitive order used when
sorting import statements by module name, and ensures that "frog", "FROG",
and "Frog" always sort next to each other.

For details on µsort's sorting and merging semantics, see the user guide:
https://usort.readthedocs.io/en/stable/guide.html#sorting

Reviewed By: lisroach

Differential Revision: D36402214

fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
Summary:
This updates SGD linear models to work appropriately with Lime, addressing pytorch#910. In particular, this switches Lime interpretable model inputs / outputs from double to float and enables gradients when necessary. It also adds a unit test to Lime for testing with SGD linear models.
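
A rough usage sketch of the fixed behavior follows; the `SGDLasso` import path is an assumption based on Captum's linear model utilities, so check `captum._utils.models` in your installed version:

```python
import torch
import torch.nn as nn

from captum.attr import Lime
from captum._utils.models.linear_model import SGDLasso  # import path assumed

# Toy model and input purely for illustration.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
inputs = torch.randn(1, 10)

# With this fix, an SGD-trained interpretable model (float inputs/outputs,
# gradients enabled while fitting) can be plugged into Lime directly.
lime = Lime(model, interpretable_model=SGDLasso())
attributions = lime.attribute(inputs, target=0, n_samples=40)
```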

Pull Request resolved: pytorch#938

Reviewed By: NarineK

Differential Revision: D36331146

Pulled By: vivekmig

fbshipit-source-id: 84d7aecf293404f9ba0b14c48e8723e0e489b392
Summary:
With plain string comparison, `"1.8.0" > "1.10.0"` evaluates to True, even though 1.10.0 is a later version than 1.8.0. This PR fixes that issue.
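
A minimal illustration of the problem and the usual `packaging.version` fix (the exact call sites in the codebase are not shown here):

```python
import torch
from packaging import version

# Plain string comparison is lexicographic, so this is (incorrectly) True:
assert "1.8.0" > "1.10.0"

# Parsing the version strings compares them numerically instead:
assert version.parse("1.10.0") > version.parse("1.8.0")

# Typical guarded feature check against the installed torch version:
if version.parse(torch.__version__) >= version.parse("1.10.0"):
    pass  # safe to use the newer API here
```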

Pull Request resolved: pytorch#940

Reviewed By: NarineK

Differential Revision: D36336547

Pulled By: vivekmig

fbshipit-source-id: 84f277eb1e6897a8378ce9eb8c9eab3285ad8494
* Ensure testing coverage is as high as possible.
* Simplify code with the new `rmodule_op` function.
* Remove the NumPy import from loss testing.
* Wrap all remaining `torch.__version__` calls in `version.parse`.
* Remove the unused version check in `typing.py`.
* Expose `MaxPool2dRelaxed` to users so that tutorials using it work.
* Expose the `dataset` module to users.
* Fix the `show` & `save_tensor_as_image` docs.
* Improve efficiency of the `FacetLoss` objective.
vivekmig and others added 29 commits December 9, 2022 18:10
Summary:
This switches usage of full backward hooks to instead apply forward hooks which then add tensor backward hooks, as suggested in pytorch#914. We initially did not choose this approach since it may have limitations with backward hooks on modules with multiple tensors as inputs / outputs (each tensor must be called independently in the hook), but all current use-cases within Captum only require a single tensor input / output.

This change allows us to enable in-place modules as well as remove the limitation on neuron input attribution. DeepLift also no longer needs valid module checks, as these are no longer applicable with usage of tensor hooks.
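
The hook pattern described above looks roughly like the sketch below; the helper name and usage are illustrative, not Captum's internal implementation:

```python
import torch
import torch.nn as nn


def attach_grad_hook(module: nn.Module, grad_fn):
    """Register a forward hook that adds a tensor backward hook to the
    module's single tensor output, instead of using a full backward hook."""

    def forward_hook(mod, inputs, output):
        # register_hook attaches to the output tensor itself, so in-place
        # modules and neuron input attribution no longer need special casing.
        output.register_hook(grad_fn)

    return module.register_forward_hook(forward_hook)


# Usage: observe gradients flowing through an in-place ReLU.
relu = nn.ReLU(inplace=True)
handle = attach_grad_hook(relu, lambda grad: print("grad shape:", grad.shape))

x = torch.randn(2, 3, requires_grad=True)
out = relu(x * 2)  # non-leaf input, so the in-place op is allowed
out.sum().backward()
handle.remove()
```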

Pull Request resolved: pytorch#979

Reviewed By: NarineK

Differential Revision: D41687791

Pulled By: vivekmig

fbshipit-source-id: 2ddc5aac7b9bf70a56ffb3ace3dc026fca7d4bfa
Summary:
Pull Request resolved: pytorch#1073

- For all `TracInCPBase` implementations, this adds an additional `test_loss_fn` initialization argument, which is the loss function to apply to test examples when computing the influence of a training example on a test example. With this change, the influence score is a sum over terms for each checkpoint, where each term is the gradient of `loss_fn` for a given training example, multiplied with the gradient of `test_loss_fn` for a given test example. Before, `test_loss_fn` was assumed to be the same as `loss_fn`. (A usage sketch follows this list.)
- checks regarding the reduction type of both `loss_fn` and `test_loss_fn` are now handled by helper functions `_check_tracincp_loss_fn` and `_check_tracincp_fast_loss_fn`.
- documentation is updated.  one detail: for `TracInCP`, we assume that `sample_wise_grads_per_batch` is applied to both `loss_fn` and `test_loss_fn` (if provided), and this is mentioned in the documentation.
- `test_tracin_regression.test_tracin_regression` is slightly modified - `DataInfluenceConstructor` now can explicitly pass in the same loss function for both `loss_fn` and `test_loss_fn` (done when `duplicate_loss_fn=True`). Doing so would have the same effect as not passing in `test_loss_fn`, so the original tests are also applied to the case when `duplicate_loss_fn=True`, as the expected behavior should be the same as before.
- a new test, `test_tracin_regression.test_tracin_constant_test_loss_fn` is added. For all implementations of `TracInCPBase`, it checks that if `test_loss_fn` is a constant loss function, the influence scores are all 0's. This should be the case, because if `test_loss_fn` is constant, its gradients would all be 0's, so that training examples have 0 influence on test examples.
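
A rough usage sketch of the new argument; the toy model, dataset, and checkpoint handling below are simplified assumptions, so see the `TracInCP` docs for the full argument semantics:

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset

from captum.influence import TracInCP

# Toy classifier, training data, and a single saved checkpoint.
model = nn.Linear(4, 3)
train_dataset = TensorDataset(torch.randn(16, 4), torch.randint(0, 3, (16,)))
torch.save(model.state_dict(), "checkpoint-0.pt")

tracin = TracInCP(
    model,
    train_dataset,
    checkpoints=["checkpoint-0.pt"],
    loss_fn=nn.CrossEntropyLoss(reduction="sum"),       # applied to training examples
    test_loss_fn=nn.CrossEntropyLoss(reduction="sum"),  # new: applied to test examples
    sample_wise_grads_per_batch=True,  # applies to both loss_fn and test_loss_fn
    batch_size=8,
)
```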

Reviewed By: cyrjano

Differential Revision: D41202866

fbshipit-source-id: c2359955081ed88998c9312c744bef6ae3c21e58
Summary:
Pull Request resolved: pytorch#1068

This diff adds the `compute_intermediate_quantities` method to `TracInCP`, which returns influence embeddings such that the influence of one example on another is the dot-product of their respective influence embeddings. In the case of `TracInCP`, its influence embeddings are simply the parameter-gradients for an example, concatenated over different checkpoints.

There is also an `aggregate` option that, if True, returns not the influence embeddings of each example in the given dataset, but instead their *sum*. This is useful for the validation diff workflow (the next diff in the stack), where we want to calculate the influence of a given training example on an entire validation dataset. This can be accomplished by taking the dot-product of the training example's influence embedding with the *sum* of the influence embeddings over the validation dataset (i.e. with `aggregate=True`).

For tests, the tests currently used for `TracInCPFastRandProj.compute_intermediate_quantities` (`test_tracin_intermediate_quantities.test_tracin_intermediate_quantities_api`, `test_tracin_intermediate_quantities.test_tracin_intermediate_quantities_consistent`) are applied to `TracInCP.compute_intermediate_quantities`. In addition, `test_tracin_intermediate_quantities.test_tracin_intermediate_quantities_aggregate` is added to test the `aggregate=True` option, checking that with `aggregate=True`, the returned influence embedding is indeed the sum of the influence embeddings for the given dataset.
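
Continuing the hypothetical `tracin`/`train_dataset` setup sketched earlier, the dot-product relationship and the `aggregate` option would be exercised roughly as follows (shapes assume 16 training and 4 validation examples):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

train_loader = DataLoader(train_dataset, batch_size=8)
val_dataset = TensorDataset(torch.randn(4, 4), torch.randint(0, 3, (4,)))
val_loader = DataLoader(val_dataset, batch_size=4)

# Influence embeddings: one row per example, parameter-gradients concatenated
# over checkpoints.
train_emb = tracin.compute_intermediate_quantities(train_loader)
val_emb = tracin.compute_intermediate_quantities(val_loader)

# The influence of training example j on validation example i is the dot
# product of their embeddings.
pairwise_influence = val_emb @ train_emb.T  # shape: (4, 16)

# With aggregate=True the *sum* of the validation embeddings is returned, so a
# single dot product gives each training example's influence on the whole
# validation set.
val_sum = tracin.compute_intermediate_quantities(val_loader, aggregate=True)
influence_on_val_set = train_emb @ val_sum.T  # shape: (16, 1)
```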

Reviewed By: cyrjano

Differential Revision: D40688327

fbshipit-source-id: a4427affa877bb51130114e23d00179c2f32c600
Summary:
Pull Request resolved: pytorch#1090

Add a new `str` argument `reg_reduction` to Captum STG classes, which specifies how the returned regularization should be reduced. Following the design of PyTorch losses, 3 modes are supported: `sum`, `mean`, and `none` (illustrated in the sketch below). The default is `sum`.
(There may be a need for other modes in the future, like `weighted_sum`. With a customized `mask`, each gate may handle a different number of elements, and an application may want to use as few elements as possible rather than as few gates. For now, such use cases can use the `none` option and perform the reduction themselves.)

Although we previously used `mean`, we decided to change to `sum` as default for 3 reasons:
1. The original paper "LEARNING SPARSE NEURAL NETWORKS THROUGH L0 REGULARIZATION" used `sum` both in its writing and its [implementation](https://github.com/AMLab-Amsterdam/L0_regularization/blob/master/l0_layers.py#L70)
2. L^1 and L^2 regularization also `sum` over each parameter without averaging over the total number of parameters within a model. See [Pytorch's implementation](https://github.com/pytorch/pytorch/blob/df569367ef444dc9831ef0dde3bc611bcabcfbf9/torch/optim/adagrad.py#L268)
3. When there are multiple STG of imbalanced lengths, the results are comparable in `sum` but not `mean`. If the model has 2 STG, where one has 100 gates and the other has one single gate, the regularization of each gate in the 1st STG will be divided by 100 in `mean`, which makes the 1st STG 100 times weaker than the 2nd STG. This is usually unexpected for users.

Using `mean` or `sum` will not impact performance when there is only one BSN layer, because users can tune `reg_weight` to counter the difference. The authors of "Feature selection using Stochastic Gates" mixed `sum` and `mean` in [their implementation](https://github.com/runopti/stg/blob/master/python/stg/models.py#L164-L195)

For backward compatibility, explicitly specified `reg_reduction = "mean"` for all existing usages in Pyper and MVAI.
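
The semantics of the three modes can be shown with a small standalone sketch (this is not Captum's STG implementation, just the reduction behavior described above):

```python
import torch


def reduce_regularization(gate_reg: torch.Tensor, reg_reduction: str = "sum") -> torch.Tensor:
    """Apply the requested reduction to the per-gate regularization terms."""
    if reg_reduction == "sum":
        return gate_reg.sum()
    if reg_reduction == "mean":
        return gate_reg.mean()
    if reg_reduction == "none":
        return gate_reg  # caller performs its own (e.g. weighted) reduction
    raise ValueError(f"Unsupported reg_reduction: {reg_reduction}")


# Two gate groups of very different sizes, as in reason 3 above.
big_group = torch.full((100,), 0.5)  # 100 gates
small_group = torch.full((1,), 0.5)  # a single gate

# "sum": each gate contributes equally across groups (50.0 vs 0.5).
print(reduce_regularization(big_group, "sum"), reduce_regularization(small_group, "sum"))

# "mean": the large group is effectively down-weighted 100x (0.5 vs 0.5).
print(reduce_regularization(big_group, "mean"), reduce_regularization(small_group, "mean"))
```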

Reviewed By: cyrjano, edqwerty10

Differential Revision: D41991741

fbshipit-source-id: 698db938fc373747db0df1b1145c6e9943476142
Summary:
Due to floating point arithmetic inaccuracies and limitations, this switches to double precision for test cases and passes dtype to the projection matrix so that the arithmetic error stays within the accepted range.

Pull Request resolved: pytorch#1081

Reviewed By: 99warriors

Differential Revision: D41919707

Pulled By: NarineK

fbshipit-source-id: f8ef65e751ada7c3d3baddb1231b547c3be1823c
Summary:
`ufmt` (`black + usort`) overlaps with many rules of `flake8`.
`ufmt` can automatically fix many violations flagged by `flake8`, so we should first check whether any errors can be auto-fixed by `ufmt`, to avoid wasting people's time, and only show the remaining `flake8` errors that require manual fixes.

Pull Request resolved: pytorch#1089

Reviewed By: cyrjano

Differential Revision: D42006058

Pulled By: aobo-y

fbshipit-source-id: 711065a35da4816c7fd30068b1c5734de961caa1
Summary:
Add sphinx pages for STG

Pull Request resolved: pytorch#1092

Reviewed By: vivekmig

Differential Revision: D42053425

Pulled By: aobo-y

fbshipit-source-id: 93f098734ecc736c1b1232a8abb80d46c2362940
Summary:
Adds an optional mask argument to the FGSM and PGD adversarial attacks. This mask determines which pixels are affected by the adversarial perturbations. If no mask is specified, then all pixels are affected. This PR resolves pytorch#941.
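
A rough usage sketch of the new argument (the exact broadcasting rules for the mask are an assumption; consult the FGSM/PGD docstrings):

```python
import torch
import torch.nn as nn

from captum.robust import FGSM

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
image = torch.randn(1, 3, 8, 8)

# Only perturb the left half of the image; masked-out pixels stay untouched.
mask = torch.zeros_like(image)
mask[..., :4] = 1

fgsm = FGSM(model, lower_bound=-1.0, upper_bound=1.0)
adv_image = fgsm.perturb(image, epsilon=0.1, target=3, mask=mask)
```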

Pull Request resolved: pytorch#1043

Reviewed By: NarineK

Differential Revision: D40442870

Pulled By: vivekmig

fbshipit-source-id: f9f0688519f61f1520e949e3d7a8cb1cfc0342f3
Summary:
Pull Request resolved: pytorch#1072

This diff changes the API for implementations of `TracInCPBase` as discussed in https://fb.quip.com/JbpnAiWluZmI.  In particular, the arguments representing test data of the `influence` method are changed from `inputs: Tuple, targets: Optional[Tensor]` to `inputs: Union[Tuple[Any], DataLoader]`, which is either a single batch, or a dataloader yielding batches.  In both cases, `model(*batch)` is assumed to produce the predictions for a batch, and `batch[-1]` is assumed to be the labels for a batch. This is the same format assumed of the batches yielded by `train_dataloader`.

We make this change for 2 reasons:
- it unifies the assumptions made of the test data and the assumptions made of the training data
- for some implementations, we want to allow the test data to be represented by a dataloader. With the old API, there was no clean way to allow both a single batch as well as a dataloader to be passed in, since a batch required 2 arguments, but a dataloader only requires 1.

For now, all implementations only allow `inputs` to be a tuple (and not a dataloader).  This is okay due to inheritance rules.  Later on, we will allow some implementations (i.e. `TracInCP`) to accept a dataloader as `inputs`.
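
Reusing the toy `tracin` instance sketched earlier, the new calling convention looks roughly like this:

```python
import torch

# New API: the test data is a single batch tuple whose last element is the
# labels, matching the format of batches yielded by train_dataloader.
test_batch = (torch.randn(2, 4), torch.randint(0, 3, (2,)))
scores = tracin.influence(test_batch)  # shape: (2 test examples, 16 train examples)

# The old form, influence(inputs, targets, unpack_inputs=...), is removed.
```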

Other changes:
- changes to the documentation. For example, documentation in `TracInCPBase.influence` now refers to the "test dataset" instead of the test batch.
- the `unpack_inputs` argument is no longer needed for the `influence` methods, and is removed
- the usage of `influence` in all the tests is changed to match new API.
- signature of helper methods `_influence_batch_tracincp` and `_influence_batch_tracincp_fast` are changed to match new representation of batches.

Reviewed By: cyrjano

Differential Revision: D41324297

fbshipit-source-id: 4fe4f188050a11c7c243750614d29a6ec49607ae
Summary:
Include inherited methods in sphinx and remove dummy wrapper in children

![Screen Shot 2022-12-16 at 6 46 00 PM](https://user-images.githubusercontent.com/5113450/208220030-50d853a9-bea8-4a4d-bdfc-9673c18fc987.png)

Pull Request resolved: pytorch#1095

Reviewed By: NarineK

Differential Revision: D42117580

Pulled By: aobo-y

fbshipit-source-id: 6eace5ff620c5765bff3df511d95a01a9652ff87
Summary:
Correctly validate the `forward_func`'s output shape in `FeatureAblation` (and `FeaturePermutation`)

Abandoned the previous flawed assumption of "aggregation mode", which forbade support for multi-output models (ref pytorch#1047)

The new logic does not check the output shape when `perturbations_per_eval == 1`. Only when `perturbations_per_eval > 1` does it require "non-aggregation mode", defined as the 1st dim of the model's output growing in the same ratio as the input's batch size. This is verified by actually comparing the output shapes of 2 different inputs instead of making any assumption based on other user config:
- The baseline output is from the initial eval with the original inputs which we have to run anyway.
- The expanded output is from the 1st ablated eval whose input batch size has been expanded for more feature perturbation.

This approach does not introduce any extra forward calls.
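
A standalone sketch of the comparison described above (illustrative only, not the actual `FeatureAblation` code):

```python
import torch


def check_output_scales_with_batch(
    baseline_output: torch.Tensor,
    expanded_output: torch.Tensor,
    input_batch_size: int,
    expanded_batch_size: int,
) -> None:
    """Verify the output's 1st dim grows in the same ratio as the input batch
    size, which is required when perturbations_per_eval > 1."""
    expected = baseline_output.shape[0] * expanded_batch_size // input_batch_size
    if expanded_output.shape[0] != expected:
        raise AssertionError(
            "forward_func output does not grow with the input batch size: "
            f"expected 1st dim {expected}, got {expanded_output.shape[0]}"
        )


# A per-example output passes the check...
check_output_scales_with_batch(torch.zeros(4, 10), torch.zeros(8, 10), 4, 8)

# ...while an aggregated (e.g. scalar) output would fail when
# perturbations_per_eval > 1:
# check_output_scales_with_batch(torch.zeros(1), torch.zeros(1), 4, 8)  # raises
```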

Pull Request resolved: pytorch#1091

Reviewed By: vivekmig

Differential Revision: D42027843

Pulled By: aobo-y

fbshipit-source-id: cbcbad64bb1695e7be9c9447ebde6ec3f0cb8a90
Summary:
Circle CI job `test_conda` is failing because of some missing dependencies, like
```
E   ImportError: libtiff.so.5: cannot open shared object file: No such file or directory
```

The cause is that when we install `nodejs`  through conda, it will upgrade some packages, like
```
> conda install -y --no-channel-priority -c conda-forge nodejs=14

...
libtiff                                  4.3.0-hf544144_1 --> 4.5.0-h82bc61c_0
```
The newer version is not compatible with some Python packages.

Downgrading those incompatible packages would fix the issue, but this PR instead just drops the installation of `nodejs` and `yarn` (and some other unused packages like `mypy` and `flake8`). It also stops building `captum-insight`.

I believe `nodejs` & `yarn` are used for building the website and Insight's frontend. However, this script is only used for CircleCI's conda testing, which does not execute anything related to the Insight frontend or build the website, so it is safe to remove them.

(I am not sure why we included them in the first place. Let me know if I missed some use cases of this script.)

Pull Request resolved: pytorch#1097

Reviewed By: vivekmig

Differential Revision: D42201597

Pulled By: aobo-y

fbshipit-source-id: a57a13a76176ffa47e5728d6359410c2f599ba24
Summary: Pull Request resolved: pytorch#1094

Reviewed By: vivekmig

Differential Revision: D42103175

Pulled By: aobo-y

fbshipit-source-id: fbde8a84842a19dea772a3dbadb5f0d1e6af8bc1
Summary:
This allows examination of each channel's contribution. That is useful if channels are something other than standard RGB, for example multi-spectral input, potentially with many spectral channels.

Pull Request resolved: pytorch#1086

Reviewed By: vivekmig

Differential Revision: D42221000

Pulled By: NarineK

fbshipit-source-id: 1b04276d68e4a22a1d7338bd80436b118268d787
Summary:
- correct arg type `optional` based on our convention
- code block highlight
- add example

Pull Request resolved: pytorch#1100

Reviewed By: cyrjano

Differential Revision: D42493054

Pulled By: aobo-y

fbshipit-source-id: 9491d0202a9bcd73ace93482500ffc7ca902c819
Summary:
* Make language more inclusive by removing unnecessarily gendered language

Pull Request resolved: pytorch#1106

Test Plan:
### Before

```
.../captum $ grep -ri "[^a-z]\(she\|he\|her\|him\|his\)[^a-z]" --include=\*.py
captum/attr/_utils/approximation_methods.py:    proposed by [Xue Feng and her intern Hauroun Habeeb]
captum/concept/_utils/classifier.py:                    empty dictionary if she/he decides to not return any performance
```

### After
```
.../captum $ grep -ri "[^a-z]\(she\|he\|her\|him\|his\)[^a-z]" --include=\*.py
captum/attr/_utils/approximation_methods.py:    proposed by [Xue Feng and her intern Hauroun Habeeb]
```

Reviewed By: aobo-y

Differential Revision: D42823268

Pulled By: jessijzhao

fbshipit-source-id: d741a9f57384eaec32d6574081d16506bf14dc7e
Summary:
Building on top of this pull request: pytorch#668

Pull Request resolved: pytorch#1105

Reviewed By: aobo-y

Differential Revision: D42811693

Pulled By: skim9

fbshipit-source-id: 23b07caf6bc161fbac35f225070c4f74ba206743
Summary:
Fix the failing conda CI job, which is caused by dependency conflicts from libmamba-solver
- remove libmamba-solver and use the default solver instead

Also "quietly" execute `conda install` to disable the progress bar, which pollutes the CircleCI logs

Pull Request resolved: pytorch#1108

Reviewed By: skim9

Differential Revision: D43005598

Pulled By: aobo-y

fbshipit-source-id: 3719a1ebc7eac976759a57ed9ab455fe5d5a8908
Summary: Fix failing captum test cases in gradient shap and layer conductance related to timeout

Reviewed By: vivekmig

Differential Revision: D44208585

fbshipit-source-id: 45e989e113b195a2a52aec6ecf831908efe41a29