Describe the bug
The total number of parameters reported by torchinfo's summary function is different from the parameter count shown on the model card on the Hugging Face website.
To Reproduce
```python
from transformers import AutoModel, AutoModelForSeq2SeqLM
from torchinfo import summary

stos = AutoModelForSeq2SeqLM.from_pretrained('google/flan-t5-small')
summary(stos, row_settings=('var_names',))

"""
Output:
===========================================================================
Layer (type (var_name))                             Param #
===========================================================================
T5ForConditionalGeneration (T5ForConditionalGeneration)  --
├─Embedding (shared)                                16,449,536
├─T5Stack (encoder)                                 16,449,536
│    └─Embedding (embed_tokens)                     (recursive)
│    └─ModuleList (block)                           --
│    │    └─T5Block (0)                             2,360,512
│    │    └─T5Block (1)                             2,360,320
│    │    └─T5Block (2)                             2,360,320
│    │    └─T5Block (3)                             2,360,320
│    │    └─T5Block (4)                             2,360,320
│    │    └─T5Block (5)                             2,360,320
│    │    └─T5Block (6)                             2,360,320
│    │    └─T5Block (7)                             2,360,320
│    └─T5LayerNorm (final_layer_norm)               512
│    └─Dropout (dropout)                            --
├─T5Stack (decoder)                                 16,449,536
│    └─Embedding (embed_tokens)                     (recursive)
│    └─ModuleList (block)                           --
│    │    └─T5Block (0)                             3,147,456
│    │    └─T5Block (1)                             3,147,264
│    │    └─T5Block (2)                             3,147,264
│    │    └─T5Block (3)                             3,147,264
│    │    └─T5Block (4)                             3,147,264
│    │    └─T5Block (5)                             3,147,264
│    │    └─T5Block (6)                             3,147,264
│    │    └─T5Block (7)                             3,147,264
│    └─T5LayerNorm (final_layer_norm)               512
│    └─Dropout (dropout)                            --
├─Linear (lm_head)                                  16,449,536
===========================================================================
Total params: 109,860,224
Trainable params: 109,860,224
Non-trainable params: 0
===========================================================================
"""
```
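Note that the 16,449,536-parameter figure appears four times in the output above (shared, encoder, decoder, lm_head). In the transformers T5 implementation, encoder.embed_tokens and decoder.embed_tokens are literally the shared embedding module (and lm_head may or may not be tied to it, depending on tie_word_embeddings), so a module-by-module walk can count the same tensor more than once: 109,860,224 − 2 × 16,449,536 = 76,961,152, which is in the ~77M range from the model card. The sketch below is only an illustration, assuming the `stos` model from the snippet above and a PyTorch version whose `named_parameters()` accepts `remove_duplicate`; it compares a per-path count with one that deduplicates by storage pointer:

```python
# Sketch (assumes `stos` from the snippet above). Counts each underlying tensor
# once by deduplicating on its storage pointer, so tied/shared weights such as
# T5's `shared` embedding are only counted a single time.
def count_params(model):
    naive, unique, seen = 0, 0, set()
    for _, param in model.named_parameters(remove_duplicate=False):
        naive += param.numel()
        if param.data_ptr() not in seen:
            seen.add(param.data_ptr())
            unique += param.numel()
    return naive, unique

naive, unique = count_params(stos)
print(f"counted per path: {naive:,}")   # shared tensors counted at every module path
print(f"deduplicated:     {unique:,}")  # each underlying tensor counted once
```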
Expected behavior
The total number of parameters should be around 77 million (exactly 77,305,216 when using peft.print_trainable_parameters).
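As a quick cross-check that does not require peft (again a minimal sketch, assuming the `stos` model from the reproduction above): a plain sum over `Module.parameters()`, which by default yields each Parameter object only once even when it is registered under several submodules, should land in the ~77M range rather than at torchinfo's 109,860,224.

```python
# Plain parameter count; parameters() skips duplicate Parameter objects such as
# T5's shared embedding, which is registered under both the encoder and decoder.
total = sum(p.numel() for p in stos.parameters())
trainable = sum(p.numel() for p in stos.parameters() if p.requires_grad)
print(f"total={total:,} trainable={trainable:,}")
```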