Fix calculate_macs() for Linear layers. #318

andravin · 2024-08-22T03:36:04Z

Fixes #77.

TylerYep · 2024-11-01T04:37:28Z

Could you add a unit test and also explain why this fixes the issue?

Hardcoding Linear is not ideal, but if there is something special about Linear layers that requires this fix I would like to understand it more completely.

andravin · 2024-11-01T21:24:30Z

The documentation for torch.nn.Linear says the input tensor can have "any number of dimensions."

Currently, torchinfo only handles two-dimensional input tensors correctly; any additional dimensions are ignored:

                    self.macs += self.output_size[0] * cur_params

The change accounts for all of the leading dimensions:

                elif "Linear" in self.class_name:
                    self.macs += int(cur_params * prod(self.output_size[:-1]))

You can think of torch.nn.Linear as a tensor-matrix multiplication or n-mode product, with the last dimension of the tensor as the "mode." This is equivalent to matrix multiplication if we first unfold all of the leading dimensions of the tensor into a single dimension. The size of the unfolded dimension is prod(self.output_size[:-1]).

See section A.2.3 N-mode Product in this reference for a detailed explanation.
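As a rough illustration of the difference between the two formulas above, here is a minimal sketch in plain Python. The function names, the example shapes, and the bias-free layer are assumptions made for illustration; `num_params` stands in for `cur_params` and counts only the weight matrix. The brute-force counter unfolds the leading dimensions into rows and tallies one multiply-accumulate per weight per row.

```python
from math import prod


def linear_macs_old(output_size, num_params):
    # Formula before the change: only the first output dimension is used.
    return output_size[0] * num_params


def linear_macs_fixed(output_size, num_params):
    # Formula after the change: every leading dimension is counted.
    return int(num_params * prod(output_size[:-1]))


def linear_macs_by_unfolding(input_shape, out_features):
    # Unfold all leading dimensions into one "rows" dimension, then count
    # the multiply-accumulates of a (rows x in) @ (in x out) product.
    rows = prod(input_shape[:-1])
    in_features = input_shape[-1]
    macs = 0
    for _ in range(rows):
        for _ in range(out_features):
            macs += in_features  # one dot product of length in_features
    return macs


# Hypothetical example: a bias-free linear layer 16 -> 32 applied to a
# 3D input of shape (8, 10, 16), giving an output of shape (8, 10, 32).
in_features, out_features = 16, 32
weight_params = in_features * out_features
input_shape = (8, 10, in_features)
output_size = [8, 10, out_features]

old = linear_macs_old(output_size, weight_params)
fixed = linear_macs_fixed(output_size, weight_params)
brute = linear_macs_by_unfolding(input_shape, out_features)
print(old, fixed, brute)
```

With these assumed shapes the old formula undercounts by a factor of 10 (the ignored middle dimension), while the fixed formula agrees with the brute-force count.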

Fix MACs in lst.out and lstm_half.out.

andravin · 2024-11-01T21:33:37Z

I corrected the output files for the LSTM tests, but the flan_t5 tests are still broken. I am not familiar with that model, so I would have to learn how to calculate the number of operations correctly.

andravin · 2024-11-01T21:40:20Z

The way LayerInfo.calculate_macs checks only for a partial match on the class name (Conv, and now Linear) and ignores the module name seems dicey. Exact matches against fully qualified <module>.<class> names would be more robust.

Also, any unrecognized layer is assumed to be a Linear layer with 2D input, judging by the code in the else block. That also seems error-prone. It might be better to throw an exception for unsupported layers.
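To make the two suggestions above concrete, here is a hedged sketch of what exact-match dispatch with an explicit failure could look like. It is not torchinfo's code: `MAC_RULES`, `macs_for`, and `qualified_name` are hypothetical names, and the key string assumes that the class's `__module__` is `torch.nn.modules.linear`.

```python
from math import prod


def qualified_name(obj):
    # Build "<module>.<class>" from the object's type.
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def linear_macs(output_size, num_params):
    # Same formula as the Linear fix: count every leading dimension.
    return int(num_params * prod(output_size[:-1]))


# Hypothetical rule table keyed by exact, fully qualified class names.
MAC_RULES = {
    "torch.nn.modules.linear.Linear": linear_macs,
}


def macs_for(name, output_size, num_params):
    # Look up an exact match; refuse to guess for unknown layer types.
    rule = MAC_RULES.get(name)
    if rule is None:
        raise NotImplementedError(f"no MAC rule registered for {name}")
    return rule(output_size, num_params)


# A substring check would wrongly treat this made-up class as a Linear layer.
false_positive = "Linear" in "mylib.layers.NonLinearGate"

known = macs_for("torch.nn.modules.linear.Linear", [8, 10, 32], 512)

try:
    macs_for("mylib.layers.NonLinearGate", [8, 10, 32], 512)
    unknown_raised = False
except NotImplementedError:
    unknown_raised = True
```

The trade-off is that every supported layer type must be registered explicitly, so unsupported layers fail loudly instead of silently receiving the 2D Linear formula.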

andravin · 2024-11-02T18:32:43Z

I added a unit test for torch.nn.Linear that uses a 3D input tensor.

TylerYep · 2024-11-02T18:41:40Z

Thanks for the explanation. It seems very reasonable that this error would cause other model outputs to change, so feel free to update those as suggested by the test errors.

I think this specific calculation has a lot of room for improvement, since it was only hardcoded with a few layer types to begin with. It would be great to enhance the <module>.<class> exact match code as well, but feel free to tackle these things in a separate PR.

andravin · 2024-11-02T18:55:00Z

It looks like torch.nn.Embedding layers have the same bug that torch.nn.Linear layers did.

MACs increased from 280.27M to 18.25G because of the Linear layer fix.

andravin · 2024-11-03T18:23:19Z

@TylerYep, I followed your guidance and changed the ground truth in flan_t5_small.out to equal the output of the unit tests with the linear layer MACs fix.

Let me know if you prefer the three commits in this pull request to be squashed into a single commit.

TylerYep · 2024-11-05T02:29:48Z

Looks good. I'll happily accept more PRs expanding this functionality. Thank you for your contributions!

Fix calculate_macs() for Linear layers.

c2da67d

Fix MACs in lst.out and lstm_half.out.

andravin force-pushed the fix-linear-layer-macs branch from 9a483b9 to c2da67d Compare November 1, 2024 21:29

Add test for torch.nn.Linear.

512de2c

Change groud-truth Total mult-adds in flan_t5_small.out.

c0ef108

MACs increased from 280.27M to 18.25G because of the Linear layer fix.

TylerYep merged commit 38ab72b into TylerYep:main Nov 5, 2024
30 checks passed
