Cannot load xlnet-base-cased transformer #8996

Closed
RajK853 opened this issue Aug 19, 2021 · 3 comments
Labels
bug (Bugs and behaviour differing from documentation) · feat / transformer (Feature: Transformer)

Comments

@RajK853 commented Aug 19, 2021

How to reproduce the behaviour

import spacy
from thinc.api import Config

# Tagger model config with an XLNet transformer as the tok2vec embedding layer.
default_model_config = """
[model]
@architectures = "spacy.Tagger.v1"

[model.tok2vec]
@architectures = "spacy-transformers.Tok2VecTransformer.v1"
name = "xlnet-base-cased"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}

[model.tok2vec.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96

[model.tok2vec.tokenizer_config]
use_fast = true
"""

config = Config().from_str(default_model_config)
model = spacy.registry.resolve(config)["model"]
# initialize() loads the Hugging Face checkpoint and runs a sample forward
# pass to infer the output width; this is the call that fails below.
model.initialize()
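
As a point of comparison (a sanity check added here, not part of the original report), the same config should initialize cleanly with a checkpoint whose output contains only plain tensors, e.g. bert-base-cased:

# Hypothetical comparison: the identical config with a BERT checkpoint.
# BERT's flattened output contains no nested tuples, so initialization
# is expected to succeed where the XLNet run below fails.
bert_model_config = default_model_config.replace("xlnet-base-cased", "bert-base-cased")
bert_model = spacy.registry.resolve(Config().from_str(bert_model_config))["model"]
bert_model.initialize()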

Trying to load the xlnet-base-cased transformer raises the following error.

Some weights of the model checkpoint at xlnet-base-cased were not used when initializing XLNetModel: ['lm_loss.weight', 'lm_loss.bias']
- This IS expected if you are initializing XLNetModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLNetModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_2292405/1170947513.py in <module>
      1 config = Config().from_str(default_model_config)
      2 model = spacy.registry.resolve(config)["model"]
----> 3 model.initialize()

~/.local/share/virtualenvs/mlgec-87_FmqUT/lib/python3.8/site-packages/thinc/model.py in initialize(self, X, Y)
    297             validate_fwd_input_output(self.name, self._func, X, Y)
    298         if self.init is not None:
--> 299             self.init(self, X=X, Y=Y)
    300         return self
    301 

~/.local/share/virtualenvs/mlgec-87_FmqUT/lib/python3.8/site-packages/thinc/layers/chain.py in init(model, X, Y)
     70     if X is None and Y is None:
     71         for layer in model.layers:
---> 72             layer.initialize()
     73         if model.layers[0].has_dim("nI"):
     74             model.set_dim("nI", model.layers[0].get_dim("nI"))

~/.local/share/virtualenvs/mlgec-87_FmqUT/lib/python3.8/site-packages/thinc/model.py in initialize(self, X, Y)
    297             validate_fwd_input_output(self.name, self._func, X, Y)
    298         if self.init is not None:
--> 299             self.init(self, X=X, Y=Y)
    300         return self
    301 

~/.local/share/virtualenvs/mlgec-87_FmqUT/lib/python3.8/site-packages/thinc/layers/chain.py in init(model, X, Y)
     70     if X is None and Y is None:
     71         for layer in model.layers:
---> 72             layer.initialize()
     73         if model.layers[0].has_dim("nI"):
     74             model.set_dim("nI", model.layers[0].get_dim("nI"))

~/.local/share/virtualenvs/mlgec-87_FmqUT/lib/python3.8/site-packages/thinc/model.py in initialize(self, X, Y)
    297             validate_fwd_input_output(self.name, self._func, X, Y)
    298         if self.init is not None:
--> 299             self.init(self, X=X, Y=Y)
    300         return self
    301 

~/.local/share/virtualenvs/mlgec-87_FmqUT/src/spacy-transformers/spacy_transformers/layers/transformer_model.py in init(model, X, Y)
    152     if "output_attentions" in trf_cfg and trf_cfg["output_attentions"] is True:
    153         tensors = tensors[:-1]  # remove attention
--> 154     t_i = find_last_hidden(tensors)
    155     model.set_dim("nO", tensors[t_i].shape[-1])
    156 

~/.local/share/virtualenvs/mlgec-87_FmqUT/src/spacy-transformers/spacy_transformers/util.py in find_last_hidden(tensors)
     80     """
     81     for i, tensor in reversed(list(enumerate(tensors))):
---> 82         if len(tensor.shape) == 3:
     83             return i
     84     else:

AttributeError: 'tuple' object has no attribute 'shape'
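
The immediate cause, as the traceback suggests, is that find_last_hidden walks the flattened model output and reads .shape from every entry, while XLNet's output also carries mems, a tuple of past hidden-state tensors, so one entry is a tuple rather than a tensor. A minimal sketch of a tolerant lookup (an illustration, not the actual fix from explosion/spacy-transformers#283):

def find_last_hidden(tensors) -> int:
    """Return the index of the last 3D tensor in the model output."""
    for i, tensor in reversed(list(enumerate(tensors))):
        # Skip entries that have no shape attribute, e.g. XLNet's tuple of
        # memory states, which is exactly what raises the error above.
        if hasattr(tensor, "shape") and len(tensor.shape) == 3:
            return i
    raise ValueError("No 3D output tensor found")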

Our Environment

Info about spaCy

  • spaCy version: 3.1.1
  • Platform: Linux-4.19.0-17-cloud-amd64-x86_64-with-debian-10.10
  • Python version: 3.7.3
@adrianeboyd (Contributor)

This should be fixed in an upcoming release by this PR: explosion/spacy-transformers#283

@adrianeboyd added the bug and feat / transformer labels on Aug 19, 2021
@adrianeboyd (Contributor)

Just cleaning up some older issues. This was fixed in spacy-transformers v1.1.
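
For anyone verifying an existing install, a quick version check (a sketch; assumes Python 3.8+ for importlib.metadata and that the packaging library is installed):

from importlib.metadata import version
from packaging.version import Version

# The fix shipped in spacy-transformers v1.1, so any older version is affected.
assert Version(version("spacy-transformers")) >= Version("1.1.0")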

github-actions bot commented Apr 1, 2022

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions bot locked as resolved and limited conversation to collaborators on Apr 1, 2022