
Enable analyzing nested input- and output-dicts #212

Merged · 16 commits · Feb 5, 2023
Conversation

@snimu (Contributor) commented Jan 13, 2023

Fixes issue #141.

There was an unexpected complication, as described in my comment on pull request #195: some num_params were negative. I've fixed it by ignoring negative numbers for total_params and trainable_params in ModelStatistics.__init__(...).

I don't know why num_params is sometimes negative, so I cannot say for sure that this solution is correct, only that it produces consistent results. Hopefully this doesn't cause any issues in the future, though at some point I would like to experimentally validate at least the calculated memory consumption.

Speaking of memory consumption: neither LayerInfo.param_bytes nor LayerInfo.output_bytes is ever negative, at least not in the models I've tested, so I haven't included corresponding checks in this pull request.
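The negative-params fix above reduces to a simple guard: negative leftover counts are skipped instead of being subtracted from the running totals. A minimal, self-contained sketch (the helper name is mine, for illustration; in the PR the check is inlined in ModelStatistics.__init__):

```python
def add_ignoring_negative(total: int, count: int) -> int:
    """Accumulate a parameter count, treating negative (spurious) counts as 0."""
    return total + count if count > 0 else total

# A spurious negative leftover count leaves the total unchanged:
total_params = add_ignoring_negative(100, -32)          # still 100
total_params = add_ignoring_negative(total_params, 64)  # now 164
```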

@mert-kurttutan stated in his comment on pull request #195 that it might be interesting to accumulate the sizes of all torch.Tensors in a nested structure. I have chosen not to implement this, because it leads to a few complications:

  1. If the resulting list of sizes is nested (e.g. [[3, 64, 64], [3, 64, 64]]), torchinfo.summary fails (I've tried this; IIRC, raise ValueError("Unknown input type") is triggered in torchinfo.py::forward because of x, though I don't know why). Some rewriting would have to occur.
  2. If the resulting list of sizes is flattened (e.g. [3, 64, 64, 3, 64, 64]), the output of summary becomes very ugly very quickly. It would also be confusing, because it would no longer be clear what the exact input and output structure is.

In both cases, backward compatibility would be broken in some cases (the output of summary would look different for some models and inputs), so a decision would have to be made about whether that is acceptable, or a new parameter (recurse_inputs or something like that) would have to be added to give the user control over this feature.

I didn't want to make those design decisions myself, and I thought that this basic solution, which makes dict work for input_data the way tuple and list already do, is sufficient for the time being.

Thoughts appreciated :)
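For context, a minimal sketch of the kind of model this change targets; the module, names, and shapes are illustrative, not taken from this PR:

```python
import torch
from torchinfo import summary

class DictOutputNet(torch.nn.Module):
    """Toy module whose forward returns a dict of tensors."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x: torch.Tensor) -> dict:
        return {"logits": self.linear(x)}

# With this PR, summary can compute sizes for the dict-valued output.
summary(DictOutputNet(), input_size=(1, 8))
```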

@codecov (bot) commented Jan 13, 2023

Codecov Report

Merging #212 (116202f) into main (8305e1f) will increase coverage by 0.01%.
The diff coverage is 92.30%.

```
@@            Coverage Diff             @@
##             main     #212      +/-   ##
==========================================
+ Coverage   97.39%   97.41%   +0.01%
==========================================
  Files           6        6
  Lines         615      618       +3
==========================================
+ Hits          599      602       +3
  Misses         16       16
```

| Impacted Files | Coverage Δ |
| --- | --- |
| torchinfo/layer_info.py | 96.11% <90.00%> (+0.01%) ⬆️ |
| torchinfo/model_statistics.py | 98.57% <100.00%> (+0.04%) ⬆️ |


@TylerYep (Owner) left a comment

Overall change seems good. My only feedback is that we should try to get some test coverage for the missing lines to make sure they work properly.

```python
    yield `True`, but if "1.7.1" is installed, `torchversion_at_least("1.8")` would
    yield `False`.
    """
    version_installed = torch.__version__.split(".")
```
@TylerYep (Owner) commented:

Let's use the version utils from packaging instead of reimplementing them:

https://stackoverflow.com/questions/11887762/how-do-i-compare-version-numbers-in-python

This also makes my test.yml file a lot simpler, thanks for the suggestion
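A minimal sketch of the suggested approach (the helper name torch_at_least is hypothetical; packaging.version is the real utility being referenced):

```python
import torch
from packaging import version

def torch_at_least(min_version: str) -> bool:
    """True if the installed torch version is at least min_version."""
    return version.parse(torch.__version__) >= version.parse(min_version)

# e.g. under torch 1.7.1, torch_at_least("1.8") is False; under 1.8.0 it is True.
```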

```python
self.trainable_params += layer_info.leftover_trainable_params()
leftover_params = layer_info.leftover_params()
leftover_trainable_params = layer_info.leftover_trainable_params()
self.total_params += leftover_params if leftover_params > 0 else 0
```
@TylerYep (Owner) commented:

This is suspicious; let's see if we can figure out why params can ever be negative. I'll take a look too.

@snimu (Contributor, Author) commented Jan 17, 2023

Alright, I'll increase coverage 🙂

@snimu (Contributor, Author) commented Jan 17, 2023

Increased coverage (though I'm working on more, so don't merge yet) and replaced the custom torch version comparison. I don't quite understand what you mean by this:

> This also makes my test.yml file a lot simpler, thanks for the suggestion

@TylerYep (Owner) commented:

> This also makes my test.yml file a lot simpler, thanks for the suggestion

If you look at this file, there's a really complex matrix of PyTorch versions and Python versions that are supported.

I want to eventually move all of this to a set of function-specific ignore rules to make it easier to manage and to expand test coverage to more supported versions.

- torch_nested has 99.something% test-coverage
- Makes test-coverage for this package much easier
- Increases readability & extensibility
@snimu (Contributor, Author) commented Jan 21, 2023

When writing tests for this PR, I had trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo, so I created torch-nested to externalize the functionality and its tests. It seems to work very well and would have the following advantages:

  1. Make test-coverage easier for torchinfo (torch-nested currently has code-coverage of 99.6%)
  2. Make LayerInfo.calculate_size more readable and maintainable
  3. In the future, make it easier to extend torchinfo to accept more data structures.

I'm actively working on torch-nested, but the basic functionality needed for torchinfo (easy access to tensors in deeply nested structures) will not change; it will only be extended to more data structures.

So far, I've replicated the functionality of LayerInfo.calculate_size one-to-one. I don't quite understand why the last torch.Tensor is used for dict but the first for list and tuple, but I have preserved that behavior, too (sketched below).
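For illustration, a rough, self-contained sketch of that asymmetry (not torchinfo's exact code):

```python
import torch

def pick_reference_tensor(outputs):
    """Return the tensor whose size is reported: first for list/tuple, last for dict."""
    if isinstance(outputs, (list, tuple)):
        return outputs[0]
    if isinstance(outputs, dict):
        return list(outputs.values())[-1]
    raise ValueError("Unknown input type")

pick_reference_tensor([torch.zeros(1, 2), torch.zeros(3, 4)]).shape             # (1, 2)
pick_reference_tensor({"a": torch.zeros(1, 2), "b": torch.zeros(3, 4)}).shape   # (3, 4)
```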

What do you think?

PS: The additional coverage miss comes from the except ImportError statement in line 14 of layer_info.py. Unfortunately, this is necessary for the reason given in the comment:

```python
# Repeated imports: if "import torch" etc. are outside try-except,
#   isort will put them above the "from torch_nested ..."-import,
#   which will cause pylint to throw the following error:
# "C0412: Imports from package torch are not grouped (ungrouped-imports)"
```

@TylerYep (Owner) commented:

Hi, unfortunately I am not willing to accept PRs that move parts of torchinfo to an outside library and use that library instead. As a maintainer, it would add a lot of additional burden in terms of aligning releases, supporting versions, and refactoring without unintentionally breaking existing users' code.

You can bring the necessary parts of the torch-nested library's functionality into torchinfo and I can review that (which I believe was your previous version of this PR), but adding that library as a dependency is a nonstarter.

> trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo

If this is the case, then we should address these exceptions. Please feel free to share the code that created them, and we can investigate and fix those issues.

> I don't quite understand why the last torch.Tensor is used for dict but the first for list and tuple, but I have preserved that behavior, too.

This was arbitrary, and I am willing to make a backwards-incompatible change here, as long as it makes the behavior better. Ideally, we should show all tensor sizes in dicts, but if we can only show one for now, then any tensor is fine; this behavior is not set in stone yet.
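To make the "show all tensor sizes" idea concrete, a hedged sketch of what such a recursive size collection could look like (illustrative only, not part of this PR):

```python
import torch

def all_sizes(obj):
    """Collect every tensor size in a nested structure, preserving the nesting."""
    if isinstance(obj, torch.Tensor):
        return list(obj.size())
    if isinstance(obj, (list, tuple)):
        return [all_sizes(o) for o in obj]
    if isinstance(obj, dict):
        return {k: all_sizes(v) for k, v in obj.items()}
    return []

all_sizes({"a": torch.zeros(3, 64, 64), "b": [torch.zeros(3, 64, 64)]})
# -> {'a': [3, 64, 64], 'b': [[3, 64, 64]]}
```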

…tead

- Fixes issue#141
- Increases test-coverage
- Produces more plausible output for some cases
@snimu (Contributor, Author) commented Jan 21, 2023

> Hi, unfortunately I am not willing to accept PRs that move parts of torchinfo to an outside library and use that library instead. As a maintainer, it would add a lot of additional burden in terms of aligning releases, supporting versions, and refactoring without unintentionally breaking existing users' code.

Alright, I understand.

> Ideally, we should show all tensor sizes in dicts ...

I've played with extracting the full shape of nested structures in LayerInfo.calculate_size, but other parts of torchinfo had problems with the output: specifically, somewhere in forward_pass, an exception occurred with deeply nested structures (it may have been process_input instead; it was a week ago ;)). This is actually why I thought creating the torch-nested package would make sense: to have a generalized way of interacting with nested data structures containing torch.Tensors. However, I fully understand that you don't want the burden of interacting with external packages, so I've fixed the issue without using it now.

> ... but if we can only show one for now, then any tensor is fine; this behavior is not set in stone yet.

I have left it in for now; it might be better to change the behavior in one go, when the switch is made to showing the full data-structure size.

> trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo
>
> If this is the case, then we should address these exceptions. Please feel free to share the code that created them, and we can investigate and fix those issues.

I think I may have mixed up some things: IIRC, I mostly raised exceptions when trying to extract the full size and shape of nested data structures, as described above, not in test cases. Sorry. If I have time, I'll look into it at some point.

@snimu snimu mentioned this pull request Jan 31, 2023
@snimu (Contributor, Author) commented Feb 5, 2023

This now fixes issue #215 in addition to issue #141 (and with good test coverage).

@TylerYep TylerYep linked an issue Feb 5, 2023 that may be closed by this pull request
@TylerYep (Owner) commented Feb 5, 2023

Looks great, thank you for fixing this and cleaning up the code along the way! This code can be really confusing, and it's great that this solution works for many models and external libraries too.

Successfully merging this pull request may close these issues:

conflict with compressai AttributeError: 'tuple' object has no attribute 'size'