
Enable analyzing nested input- and output-dicts #212

Merged · 16 commits · Feb 5, 2023
Conversation

@snimu (Contributor) commented Jan 13, 2023

Fixes issue #141.

There was an unexpected complication, as described in my comment on pull request #195: some num_params were negative. I've fixed it by ignoring negative numbers for total_params and trainable_params in ModelStatistics.__init__(...).

I don't know why num_params is sometimes negative, so I cannot say for sure that this solution is correct, only that it produces consistent results. Hopefully this doesn't cause any issues in the future, though at some point I would like to experimentally validate at least the calculated memory consumption.

Speaking of memory consumption: neither LayerInfo.param_bytes nor LayerInfo.output_bytes is ever negative, at least not in the models I've tested, so I haven't included corresponding checks in this pull request.
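The negative-params fix above reduces to a simple guard: negative leftover counts are skipped instead of being subtracted from the running totals. A minimal, self-contained sketch (the helper name is mine, for illustration; in the PR the check is inlined in ModelStatistics.__init__):

```python
def add_ignoring_negative(total: int, count: int) -> int:
    """Accumulate a parameter count, treating negative (spurious) counts as 0."""
    return total + count if count > 0 else total

# A spurious negative leftover count leaves the total unchanged:
total_params = add_ignoring_negative(100, -32)          # still 100
total_params = add_ignoring_negative(total_params, 64)  # now 164
```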

@mert-kurttutan stated in his comment on pull request #195 that it might be interesting to accumulate the sizes of all torch.Tensors in a nested structure. I have chosen not to implement this, because it leads to a few complications:

  1. If the resulting list of sizes is nested (e.g. [[3, 64, 64], [3, 64, 64]]), torchinfo.summary fails (I've tried this; IIRC, raise ValueError("Unknown input type") is triggered in torchinfo.py::forward because of x, though I don't know why). Some rewriting would have to occur.
  2. If the resulting list of sizes is flattened (e.g. [3, 64, 64, 3, 64, 64]), the output of summary becomes very ugly very quickly. It would also be confusing, because it would no longer be clear what the exact input and output structure is.

In both cases, backward compatibility would be broken in some cases (the output of summary would look different for some models and inputs), so a decision would have to be made about whether that is acceptable, or a new parameter (recurse_inputs or something like that) would have to be added to give the user control over this feature.

I didn't want to make those design decisions myself, and I thought that this basic solution, which makes dict work for input_data the way tuple and list already do, is sufficient for the time being.

Thoughts appreciated :)
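For context, a minimal sketch of the kind of model this change targets; the module, names, and shapes are illustrative, not taken from this PR:

```python
import torch
from torchinfo import summary

class DictOutputNet(torch.nn.Module):
    """Toy module whose forward returns a dict of tensors."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x: torch.Tensor) -> dict:
        return {"logits": self.linear(x)}

# With this PR, summary can compute sizes for the dict-valued output.
summary(DictOutputNet(), input_size=(1, 8))
```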

@codecov (bot) commented Jan 13, 2023

Codecov Report

Merging #212 (116202f) into main (8305e1f) will increase coverage by 0.01%.
The diff coverage is 92.30%.

```
@@            Coverage Diff             @@
##             main     #212      +/-   ##
==========================================
+ Coverage   97.39%   97.41%   +0.01%
==========================================
  Files           6        6
  Lines         615      618       +3
==========================================
+ Hits          599      602       +3
  Misses         16       16
```

| Impacted Files | Coverage Δ |
| --- | --- |
| torchinfo/layer_info.py | 96.11% <90.00%> (+0.01%) ⬆️ |
| torchinfo/model_statistics.py | 98.57% <100.00%> (+0.04%) ⬆️ |


@TylerYep (Owner) left a comment

Overall change seems good. My only feedback is that we should try to get some test coverage for the missing lines to make sure they work properly.

```python
    yield `True`, but if "1.7.1" is installed, `torchversion_at_least("1.8")` would
    yield `False`.
    """
    version_installed = torch.__version__.split(".")
```
@TylerYep (Owner) commented:

Let's use the version utils from packaging instead of reimplementing them:

https://stackoverflow.com/questions/11887762/how-do-i-compare-version-numbers-in-python

This also makes my test.yml file a lot simpler, thanks for the suggestion
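A minimal sketch of the suggested approach (the helper name torch_at_least is hypothetical; packaging.version is the real utility being referenced):

```python
import torch
from packaging import version

def torch_at_least(min_version: str) -> bool:
    """True if the installed torch version is at least min_version."""
    return version.parse(torch.__version__) >= version.parse(min_version)

# e.g. under torch 1.7.1, torch_at_least("1.8") is False; under 1.8.0 it is True.
```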

```python
self.trainable_params += layer_info.leftover_trainable_params()
leftover_params = layer_info.leftover_params()
leftover_trainable_params = layer_info.leftover_trainable_params()
self.total_params += leftover_params if leftover_params > 0 else 0
```
@TylerYep (Owner) commented:

This is suspicious; let's see if we can figure out why params can ever be negative. I'll take a look too.

@snimu (Contributor, Author) commented Jan 17, 2023

Alright, I'll increase coverage 🙂

@snimu (Contributor, Author) commented Jan 17, 2023

Increased coverage (though I'm working on more, so don't merge yet) and replaced the custom torch version comparison. I don't quite understand what you mean by this:

> This also makes my test.yml file a lot simpler, thanks for the suggestion

@TylerYep (Owner) commented:

> This also makes my test.yml file a lot simpler, thanks for the suggestion

If you look at this file, there's a really complex matrix of PyTorch versions and Python versions that are supported.

I want to eventually move all of this to a set of function-specific ignore rules to make it easier to manage and to expand test coverage to more supported versions.

- torch_nested has 99.something% test-coverage
- Makes test-coverage for this package much easier
- Increases readability & extensibility
@snimu (Contributor, Author) commented Jan 21, 2023

When writing tests for this PR, I had trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo, so I created torch-nested to externalize the functionality and its tests. It seems to work very well and would have the following advantages:

  1. Make test-coverage easier for torchinfo (torch-nested currently has code-coverage of 99.6%)
  2. Make LayerInfo.calculate_size more readable and maintainable
  3. In the future, make it easier to extend torchinfo to accept more data structures.

I'm actively working on torch-nested, but the basic functionality needed for torchinfo (easy access to tensors in deeply nested structures) will not change; it will only be extended to more data structures.

So far, I've replicated the functionality of LayerInfo.calculate_size one-to-one. I don't quite understand why the last torch.Tensor is used for dict but the first for list and tuple, but I have preserved that behavior, too (sketched below).
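For illustration, a rough, self-contained sketch of that asymmetry (not torchinfo's exact code):

```python
import torch

def pick_reference_tensor(outputs):
    """Return the tensor whose size is reported: first for list/tuple, last for dict."""
    if isinstance(outputs, (list, tuple)):
        return outputs[0]
    if isinstance(outputs, dict):
        return list(outputs.values())[-1]
    raise ValueError("Unknown input type")

pick_reference_tensor([torch.zeros(1, 2), torch.zeros(3, 4)]).shape             # (1, 2)
pick_reference_tensor({"a": torch.zeros(1, 2), "b": torch.zeros(3, 4)}).shape   # (3, 4)
```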

What do you think?

PS: The additional coverage miss comes from the except ImportError statement in line 14 of layer_info.py. Unfortunately, this is necessary for the reason given in the comment:

```python
# Repeated imports: if "import torch" etc. are outside try-except,
#   isort will put them above the "from torch_nested ..."-import,
#   which will cause pylint to throw the following error:
# "C0412: Imports from package torch are not grouped (ungrouped-imports)"
```

@TylerYep (Owner) commented:

Hi, unfortunately I am not willing to accept PRs that move parts of torchinfo to an outside library and use that library instead. As a maintainer, it would add a lot of additional burden in terms of aligning releases, supporting versions, and refactoring without unintentionally breaking existing users' code.

You can bring the necessary parts of the torch-nested library's functionality into torchinfo and I can review that (which I believe was your previous version of this PR), but adding that library as a dependency is a nonstarter.

> trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo

If this is the case, then we should address these exceptions. Please feel free to share the code that created them, and we can investigate and fix those issues.

> I don't quite understand why the last torch.Tensor is used for dict but the first for list and tuple, but I have preserved that behavior, too.

This was arbitrary, and I am willing to make a backwards-incompatible change here, as long as it makes the behavior better. Ideally, we should show all tensor sizes in dicts, but if we can only show one for now, then any tensor is fine; this behavior is not set in stone yet.
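To make the "show all tensor sizes" idea concrete, a hedged sketch of what such a recursive size collection could look like (illustrative only, not part of this PR):

```python
import torch

def all_sizes(obj):
    """Collect every tensor size in a nested structure, preserving the nesting."""
    if isinstance(obj, torch.Tensor):
        return list(obj.size())
    if isinstance(obj, (list, tuple)):
        return [all_sizes(o) for o in obj]
    if isinstance(obj, dict):
        return {k: all_sizes(v) for k, v in obj.items()}
    return []

all_sizes({"a": torch.zeros(3, 64, 64), "b": [torch.zeros(3, 64, 64)]})
# -> {'a': [3, 64, 64], 'b': [[3, 64, 64]]}
```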

…tead

- Fixes issue#141
- Increases test-coverage
- Produces more plausible output for some cases
@snimu (Contributor, Author) commented Jan 21, 2023

> Hi, unfortunately I am not willing to accept PRs that move parts of torchinfo to an outside library and use that library instead. As a maintainer, it would add a lot of additional burden in terms of aligning releases, supporting versions, and refactoring without unintentionally breaking existing users' code.

Alright, I understand.

> Ideally, we should show all tensor sizes in dicts ...

I've played with extracting the full shape of nested structures in LayerInfo.calculate_size, but other parts of torchinfo had problems with the output: specifically, somewhere in forward_pass, an exception occurred with deeply nested structures (it may have been process_input instead; it was a week ago ;)). This is actually why I thought creating the torch-nested package would make sense: to have a generalized way of interacting with nested data structures containing torch.Tensors. However, I fully understand that you don't want the burden of interacting with external packages, so I've fixed the issue without using it now.

> ... but if we can only show one for now, then any tensor is fine; this behavior is not set in stone yet.

I have left it in for now; it might be better to change the behavior in one go, when the switch is made to showing the full data-structure size.

> trouble creating inputs to LayerInfo.calculate_size that would cover an edge case but not raise an exception somewhere else in torchinfo
>
> If this is the case, then we should address these exceptions. Please feel free to share the code that created them, and we can investigate and fix those issues.

I think I may have mixed up some things: IIRC, I mostly raised exceptions when trying to extract the full size and shape of nested data structures, as described above, not in test cases. Sorry. If I have time, I'll look into it at some point.

@snimu snimu mentioned this pull request Jan 31, 2023
@snimu (Contributor, Author) commented Feb 5, 2023

This now fixes issue #215 in addition to issue #141 (and with good test coverage).

@TylerYep TylerYep linked an issue Feb 5, 2023 that may be closed by this pull request
@TylerYep (Owner) commented Feb 5, 2023

Looks great, thank you for fixing this and cleaning up the code along the way! This code can be really confusing, and it's great that this solution works for many models and external libraries too.

Successfully merging this pull request may close these issues:

conflict with compressai AttributeError: 'tuple' object has no attribute 'size'