
Transformer models do not use their corresponding pretrained poolers #1062

Closed
zphang opened this issue Apr 11, 2020 · 2 comments
Labels: jiant-v1-legacy (Relevant to versions <= v1.3.2), wontfix (This will not be worked on)

Comments

zphang (Collaborator) commented Apr 11, 2020

The way we currently use Transformers models involves taking the base encoder and extracting the full set of hidden activations (across all layers). See link. We then separately pull out only the top layer, and extract the first-token representation if we're doing a single-vector task such as classification.

Because of this workflow, we never end up using the pretrained pooler layers in the respective models, e.g. BERT and ALBERT (RoBERTa's implementation also inherits its pooler from BERT's).
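For concreteness, here is a minimal sketch of the difference using the HuggingFace `transformers` API directly, rather than jiant's actual wiring (the model name and input are illustrative):

```python
# Sketch: jiant-style extraction vs. BERT's pretrained pooler.
# Assumes HuggingFace `transformers`; this is not jiant's actual code path.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

inputs = tokenizer("An example sentence for classification.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# jiant's current workflow (as described above): grab the hidden states from
# all layers, then take the top layer's first ([CLS]) token as the
# single-vector representation, bypassing the pretrained pooler entirely.
all_layers = outputs.hidden_states   # tuple of [batch, seq, hidden]: embeddings + each layer
cls_vec = all_layers[-1][:, 0]       # top-layer [CLS] vector

# The unused pretrained pooler: the same [CLS] vector passed through a
# pretrained dense layer + tanh (BertPooler).
pooled = outputs.pooler_output       # [batch, hidden]
```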

On the other hand, we do not expect this to be a major issue, as we have seen good results from fine-tuning with this format across several works, e.g. https://arxiv.org/abs/1812.10860 and https://arxiv.org/abs/1905.00537.

zphang added the wontfix label on Apr 11, 2020
HaokunLiu (Member) commented

Adding this as an additional reference.

https://arxiv.org/pdf/1903.05987.pdf found that a diagnostic classifier on fine-tuned BERT layers achieves similar performance across layers 9-12 (MRPC) and layers 5-12 (STS-B). See Figure 1 in the linked PDF.
This suggests that the top pretrained layers may not be that helpful for downstream sentence-pair classification tasks.
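A rough sketch of the probing setup that paper describes: a linear diagnostic classifier trained on a frozen intermediate layer of the fine-tuned model. The layer index, dimensions, and helper name here are illustrative, and the `outputs` object is the one from the sketch above (with `output_hidden_states=True`):

```python
# Hypothetical layer-wise diagnostic probe; only the probe is trained,
# the encoder stays frozen.
import torch.nn as nn

hidden_size, num_labels = 768, 2   # bert-base dims; MRPC-style binary task
probe_layer = 9                    # illustrative: one of the layers probed in the paper

probe = nn.Linear(hidden_size, num_labels)

def diagnostic_logits(outputs):
    # [CLS] representation at the chosen intermediate layer
    layer_cls = outputs.hidden_states[probe_layer][:, 0]
    return probe(layer_cls)
```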

zphang (Collaborator, Author) commented Oct 16, 2020

This is an automatically generated comment.

As we update jiant to v2.x, jiant v1.x has been migrated to https://github.com/nyu-mll/jiant-v1-legacy. As such, we are closing all issues relating to jiant v1.x in this repository.

If this issue is still affecting you in jiant v1.x, please follow up at nyu-mll/jiant-v1-legacy#1062.

If this issue is still affecting you in jiant v2.x, reopen this issue or create a new one.

zphang closed this as completed on Oct 16, 2020