Currently, the number of MACs is calculated in LayerInfo.calculate_macs. However, as both issue#77 and PR#193 show, this is more complex than assumed in LayerInfo.
In principle, the code for this could grow quite large if we want it to be accurate for all built-in torch.nn modules. For example, an nn.Linear layer and an nn.Bilinear layer need different calculations, not to mention convolutions and other operations.
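As a rough illustration of that difference, counting one MAC per weight entry per sample and ignoring biases (a counting convention assumed here, not one that torchinfo defines):

$$
\text{MACs}_{\text{Linear}} = N_{\text{in}} \cdot N_{\text{out}},
\qquad
\text{MACs}_{\text{Bilinear}} = N_{\text{in1}} \cdot N_{\text{in2}} \cdot N_{\text{out}}
$$

Under that convention, a Bilinear layer with 64 features on each input and 10 outputs needs 64 times as many MACs as a Linear layer with 64 inputs and 10 outputs (40960 versus 640).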
Moving the calculation of MACs to its own class that is then used by LayerInfo would, I think, make it easier to implement accurate, readable estimations for different layers over time. It would also allow whoever implements the estimation for a layer to add detailed comments explaining what they are doing. For potentially mathematically complex code, that seems like an advantage.
Here's a rough idea of how something like this might look:
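A minimal sketch, under stated assumptions: every name below (`MacsCalculator`, `register`, `calculate`, the per-layer functions) is hypothetical and not existing torchinfo API. It dispatches on the layer's class name, reads only the standard attributes of `nn.Linear`, `nn.Bilinear` and `nn.Conv2d`, uses `math.prod` as a stand-in for the `prod` helper, and ignores bias terms.

```python
from math import prod
from typing import Callable, Dict, Sequence

# Hypothetical contents of a new macs.py; none of these names exist in torchinfo.
MacsFn = Callable[[object, Sequence[int]], int]


class MacsCalculator:
    """Maps a layer's class name to a function that estimates its MACs."""

    _registry: Dict[str, MacsFn] = {}

    @classmethod
    def register(cls, layer_name: str) -> Callable[[MacsFn], MacsFn]:
        """Decorator that registers an estimator for one layer type."""
        def decorator(fn: MacsFn) -> MacsFn:
            cls._registry[layer_name] = fn
            return fn
        return decorator

    @classmethod
    def calculate(cls, module: object, output_size: Sequence[int]) -> int:
        """Estimate MACs for `module`, given its output shape (batch included)."""
        fn = cls._registry.get(type(module).__name__)
        # Assumption: layers without an estimator count as 0 MACs.
        return fn(module, output_size) if fn is not None else 0


@MacsCalculator.register("Linear")
def _linear_macs(module, output_size):
    # Every output element is a dot product over in_features inputs.
    return prod(output_size) * module.in_features


@MacsCalculator.register("Bilinear")
def _bilinear_macs(module, output_size):
    # One MAC per weight entry (in1 * in2) for every output element.
    return prod(output_size) * module.in1_features * module.in2_features


@MacsCalculator.register("Conv2d")
def _conv2d_macs(module, output_size):
    # Each output element sums over (in_channels / groups) * kH * kW inputs.
    per_output = (module.in_channels // module.groups) * prod(module.kernel_size)
    return prod(output_size) * per_output
```

LayerInfo.calculate_macs could then delegate to something like `MacsCalculator.calculate(self.module, self.output_size)`, where those attribute names are again an assumption about what LayerInfo has available.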
Hope I didn't leave a bug in there.
Of course, the prod function from layer_info.py could simply be moved to macs.py.
I'm not sure if this is a good idea or not, so I would welcome feedback :)