added transformers_config for passing arguments to the transformer #268

Merged
merged 25 commits on Jul 8, 2021

Conversation

KennethEnevoldsen
Contributor

Added transformers_config to allow the user to pass arguments to the transformers forward pass, most notably output_attentions.

For convenience, I used this example to test the code:

import spacy

nlp = spacy.blank("en")

# Construction via add_pipe with custom config
config = {
    "model": {
        "@architectures": "spacy-transformers.TransformerModel.v1",
        "name": "bert-base-uncased",
        "transformers_config": {"output_attentions": True},
    }
}
transformer = nlp.add_pipe("transformer", config=config)
transformer.model.initialize()


doc = nlp("This is a sentence.")

which gives you:

len(doc._.trf_data.attention)       # 12 (one entry per layer)
doc._.trf_data.attention[-1].shape  # (1, 12, 7, 7) <-- last layer's attention
len(doc._.trf_data.tensors)         # 2
doc._.trf_data.tensors[0].shape     # (1, 7, 768) <-- wordpiece embeddings
doc._.trf_data.tensors[1].shape     # (1, 768) <-- assuming this is the pooled embedding?
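
As a quick follow-on (not part of the PR itself), here is a minimal sketch of how one might collapse the last layer's attention over heads into a single wordpiece-to-wordpiece matrix, assuming output_attentions is enabled as in the config above and the tensors are NumPy arrays (CPU):

import numpy

last_layer = doc._.trf_data.attention[-1]                  # (1, n_heads, seq, seq)
avg_attention = numpy.asarray(last_layer).mean(axis=1)[0]  # average over heads -> (seq, seq)
print(avg_attention.shape)         # (7, 7) for the example sentence
print(avg_attention.sum(axis=-1))  # each row sums to ~1.0 (softmax over keys)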

Sidenote: it took me quite a while to find the default config string. It might be ideal to move it into a standalone file and load it in (see the sketch below)?
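
A minimal sketch of that idea, using thinc's Config helpers; the file name and config contents here are illustrative, not the actual spacy-transformers defaults:

from thinc.api import Config

# Illustrative default config kept in a standalone file rather than an inline string.
DEFAULT_CONFIG_STR = """
[transformer]
max_batch_items = 4096

[transformer.model]
@architectures = "spacy-transformers.TransformerModel.v1"
name = "roberta-base"
"""

Config().from_str(DEFAULT_CONFIG_STR).to_disk("transformer_defaults.cfg")  # write once
DEFAULT_CONFIG = Config().from_disk("transformer_defaults.cfg")            # load where needed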

svlandeg added the enhancement (New feature or request) label on May 3, 2021
@honnibal
Member

honnibal commented May 6, 2021

@KennethEnevoldsen Thanks for this, and sorry for the delay reviewing it. I think it'll be a really nice feature; it's just a question of getting the details around backwards compatibility right.

@KennethEnevoldsen
Contributor Author

Thanks, @honnibal, and completely understandable. This feature could potentially be integrated nicely with displacy for displaying the attention given to each word. Not entirely sure, though; it seems the jury is still out on how interpretable these attention weights are.

@svlandeg
Member

svlandeg commented Jul 5, 2021

It looks like a binary file ".DS_Store" got committed by accident...

@svlandeg
Member

svlandeg commented Jul 5, 2021

@KennethEnevoldsen : Apologies again for the late follow-up! I just did one final round of review and think we should be able to wrap this up soon :-)

@KennethEnevoldsen
Contributor Author

No problem, I will follow up on this during the weekend.

@svlandeg
Member

svlandeg commented Jul 8, 2021

Thanks again for your contribution and patience! I think this looks good - I just made an internal note about the documentation and moving v1 to legacy; we'll take care of that in a separate PR.
