-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable analyzing nested input- and output-dicts #212
Conversation
Codecov Report
@@ Coverage Diff @@
## main #212 +/- ##
==========================================
+ Coverage 97.39% 97.41% +0.01%
==========================================
Files 6 6
Lines 615 618 +3
==========================================
+ Hits 599 602 +3
Misses 16 16
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
- `elem_bytes` in `LayerInfo.calculate_size(...)` didn't work for nested dicts
- adapted highly_nested_dict_model.out accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall change seems good. My only feedback is that we should try to get some test coverage for the missing lines to make sure they work properly.
tests/fixtures/torchversion.py
Outdated
yield `True`, but if "1.7.1" is installed, torchversion_at_least("1.8")` would | ||
yield `False`. | ||
""" | ||
version_installed = torch.__version__.split(".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the version utils from packaging instead of reimplementing them:
https://stackoverflow.com/questions/11887762/how-do-i-compare-version-numbers-in-python
This also makes my test.yml file a lot simpler, thanks for the suggestion
self.trainable_params += layer_info.leftover_trainable_params() | ||
leftover_params = layer_info.leftover_params() | ||
leftover_trainable_params = layer_info.leftover_trainable_params() | ||
self.total_params += leftover_params if leftover_params > 0 else 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is suspicious, let's see if we can figure out why params can ever be negative. I'll take a look too
Alright, I'll increase coverage 🙂 |
…cts with `tensor`-attribute - Found error in new testcase that comes with this commit
missing: - not hasattr(inputs, "__getitem__") - last return
Increased coverage (though I'm working on more - don't merge yet); replaced the custom torch version comparison; I don't quite understand what you mean by this:
|
I want to eventually move all of this to a set of function-specific ignore rules to make it easier to manage and to expand test coverage to more supported versions. |
- torch_nested has 99.something% test-coverage - Makes test-coverage for this package much easier - Increases readability & extensibility
When writing tests for this PR, I have had trouble creating inputs to
I'm actively working on torch-nested, but the basic functionality needed for torchinfo—easy accessing of tensors in deeply nested structures—will not change, but only be extended to more data-structures. So far, I've replicated the functionality of What do you think? PS: The additional cov-miss is coming from the
|
Hi, unfortunately I am not willing to accept PRs moving parts of torchinfo to an outside library and using that library instead - as a maintainer it will add a lot of additional burden in terms of aligning releases, version support, and ease of refactoring without breaking existing users' code unintentionally. You can bring the necessary parts of torch-nested library's functionality to torchinfo and I can review (which I believe was your previous version of this PR), but adding that library as a dependency is a nonstarter.
If this is the case then we should address this exceptions, please feel free to share the code that created these exceptions and we can investigate and fix those issues.
This was arbitrary and I am willing to make a backwards incompatible change to this, as long as it makes the behavior better. Ideally we should show all tensor sizes in dicts but if we can only show one for now then any tensor is fine - this behavior is not set in stone yet. |
…tead - Fixes issue#141 - Increases test-coverage - Produces more plausible output for some cases
Alright, I understand.
I've played with extracting the full shape of nested structures in
I have left it in for now. Thought it might be better to change the behavior in one go when the change is made to show the full data-structure-size.
I think I may have mixed up some things: IIRC, I've mostly raised exceptions when trying to extract the full size and shape of nested data-structures as described above, not for test-cases. Sorry. If I have time, I'll look into it at some point. |
Fix [issue#214](#215)
Looks great, thank you for fixing this and cleaning up the code along the way! This code can be really confusing and it's great that this solution can work for many models and external libraries too. |
Fixes issue#141.
There was an unexpected complication, as described in my comment on pull-request#195: some
num_params
were negative. I've fixed it by ignoring negative numbers fortotal_params
andtrainable_params
inModelStatistics.__init__(...)
.I don't know why the
num_params
is sometimes negative, and so cannot say for sure that this solution is correct, only that it produces consistent results. Hopefully, this doesn't cause any issues in the future, though at some point, I would like to experimentally validate at least the calculated memory consumption.Speaking of memory consumption, neither
LayerInfo.param_bytes
norLayerInfo.output_bytes
is ever negative, at least not in the models that I've tested, so I haven't included corresponding checks for this pull-request.@mert-kurttutan stated in his comment to pull-request#195 that it might be interesting to accumulate the sizes of all
torch.Tensor
in a nested structure. I have chosen not to implement this, because it leads to a few complications:[[3, 64, 64], [3, 64, 64]]
),torchinfo.summary
fails (I've tried this; IIRC,raise ValueError("Unknown input type")
is triggered intorchinfo.py::forward
because ofx
, though I don't know why). Some rewriting would have to occur.[3, 64, 64, 3, 64, 64]
), the output ofsummary
becomes very ugly very quickly. Also, this would be confusing, as it is now unclear what the exact input- and output-structure is.In both cases, backward-compatability would be broken in some cases (the output of
summary
would look different for some models & inputs), so a decision would have to be made whether this is acceptable, or a new parameter (recurse_inputs
or something like that) would have to be added to give the user control over this feature.I didn't want to make the design-decisions and thought that this basic solution for making
dict
work as well forinput_data
as it already works fortuple
andlist
is sufficient for the time being.Thoughts appreciated :)