Refactor to inherit from nn.Module instead of nn.ModuleList #17501

Merged

amyeroberts
Collaborator

@amyeroberts amyeroberts commented May 31, 2022

What does this PR do?

Refactors classes inheriting from nn.ModuleList to inherit from nn.Module instead. This makes debugging and inspecting layer outputs easier.

See also: #17493
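
As a rough illustration of the kind of change described above (a minimal sketch, not this PR's actual diff), the hypothetical classes below show a container written first as an nn.ModuleList subclass and then as a plain nn.Module. The names PoolingList and PoolingModule, the nn.Linear blocks, and the choice to register each block under its index with add_module are all assumptions made for illustration. Registering under str(i) keeps the state_dict keys identical, so a checkpoint saved from the first form would still load into the second.

```python
import torch
from torch import nn


class PoolingList(nn.ModuleList):
    """Hypothetical 'before' form: the container itself is the list."""

    def __init__(self, num_blocks: int, dim: int):
        super().__init__()
        for _ in range(num_blocks):
            self.append(nn.Linear(dim, dim))

    def forward(self, x):
        return [block(x) for block in self]


class PoolingModule(nn.Module):
    """Hypothetical 'after' form: a plain nn.Module that owns its blocks."""

    def __init__(self, num_blocks: int, dim: int):
        super().__init__()
        self.blocks = []  # plain Python list, used only for iteration
        for i in range(num_blocks):
            block = nn.Linear(dim, dim)
            self.blocks.append(block)
            # Register under the same name nn.ModuleList would use ("0", "1", ...)
            # so parameter names in the state_dict do not change.
            self.add_module(str(i), block)

    def forward(self, x):
        return [block(x) for block in self.blocks]


old = PoolingList(num_blocks=3, dim=4)
new = PoolingModule(num_blocks=3, dim=4)

# Same parameter names, so weights saved from the old form load into the new one.
print(list(old.state_dict().keys()) == list(new.state_dict().keys()))
new.load_state_dict(old.state_dict())
```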

The following was run to check the weight loading in:

from transformers import BeitForImageClassification, Data2VecVisionForImageClassification

print("\nLoading in Data2VecVision model...")
model_checkpoint = "facebook/data2vec-vision-base"
model = Data2VecVisionForImageClassification.from_pretrained(model_checkpoint)

print("\nLoading in BeiT model...")
model_checkpoint = "microsoft/beit-base-patch16-224-pt22k"
model = BeitForImageClassification.from_pretrained(model_checkpoint)

Output:

Loading in Data2VecVision model...
/Users/aroberts/.virtualenvs/tenv/lib/python3.9/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /Users/distiller/project/pytorch/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Some weights of Data2VecVisionForImageClassification were not initialized from the model checkpoint at facebook/data2vec-vision-base and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Loading in BeiT model...
Some weights of the model checkpoint at microsoft/beit-base-patch16-224-pt22k were not used when initializing BeitForImageClassification: ['layernorm.bias', 'layernorm.weight', 'lm_head.weight', 'lm_head.bias']
- This IS expected if you are initializing BeitForImageClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BeitForImageClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BeitForImageClassification were not initialized from the model checkpoint at microsoft/beit-base-patch16-224-pt22k and are newly initialized: ['beit.pooler.layernorm.bias', 'classifier.bias', 'classifier.weight', 'beit.pooler.layernorm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Running on main we see the same weights are newly initialized:

Loading in Data2VecVision model...
/Users/aroberts/.virtualenvs/tenv/lib/python3.9/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /Users/distiller/project/pytorch/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Some weights of Data2VecVisionForImageClassification were not initialized from the model checkpoint at facebook/data2vec-vision-base and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Loading in BeiT model...
Some weights of the model checkpoint at microsoft/beit-base-patch16-224-pt22k were not used when initializing BeitForImageClassification: ['layernorm.bias', 'lm_head.bias', 'lm_head.weight', 'layernorm.weight']
- This IS expected if you are initializing BeitForImageClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BeitForImageClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BeitForImageClassification were not initialized from the model checkpoint at microsoft/beit-base-patch16-224-pt22k and are newly initialized: ['beit.pooler.layernorm.weight', 'classifier.bias', 'classifier.weight', 'beit.pooler.layernorm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
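
The key lists printed by the two runs appear in different orders, so comparing them as sets is a quick way to confirm they match. The sketch below copies the BeiT lists from the output above; the helper name same_keys and the variable names are hypothetical.

```python
def same_keys(a, b):
    """Return True if both lists hold the same keys, ignoring order."""
    return len(a) == len(b) and set(a) == set(b)


# Checkpoint weights not used when initializing BeitForImageClassification
unused_pr = ['layernorm.bias', 'layernorm.weight', 'lm_head.weight', 'lm_head.bias']
unused_main = ['layernorm.bias', 'lm_head.bias', 'lm_head.weight', 'layernorm.weight']

# Weights newly initialized in BeitForImageClassification
new_pr = ['beit.pooler.layernorm.bias', 'classifier.bias',
          'classifier.weight', 'beit.pooler.layernorm.weight']
new_main = ['beit.pooler.layernorm.weight', 'classifier.bias',
            'classifier.weight', 'beit.pooler.layernorm.bias']

print(same_keys(unused_pr, unused_main))
print(same_keys(new_pr, new_main))
```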

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 31, 2022

The documentation is not available anymore as the PR was closed or merged.

@LysandreJik
Member

Let us know when you'd like for us to review! :)

Blender Bot tests are failing (should be unrelated to this PR) but pass locally. I don't have sufficient permissions to re-run the CI workflow (either in full or from the failed jobs).
@github-actions

github-actions bot commented Jul 1, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Member

@LysandreJik LysandreJik left a comment

Verified that both BeiT and Data2VecVision checkpoints load without trouble. Thanks for working on this, @amyeroberts!

@LysandreJik LysandreJik merged commit cf2578a into huggingface:main Jul 4, 2022
viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
…ace#17501)

* Refactor to inherit from nn.Module instead of nn.ModuleList

* Fix typo

* Empty to trigger CI re-run

Blender Bot tests are failing (should be unrelated to this PR) but pass locally. I don't have sufficient permissions to re-run the CI workflow (either in full or from the failed jobs).