
DeBERTa v2 throws "TypeError: stat: path should be string...", v1 not #10097

Closed
1 of 4 tasks
205g0 opened this issue Feb 9, 2021 · 8 comments


Comments


205g0 commented Feb 9, 2021

Environment info

  • transformers version: 4.3.1
  • Platform: Linux-5.4.0-54-generic-x86_64-with-glibc2.29
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.7.1+cpu (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Using GPU in script?: false
  • Using distributed or parallel set-up in script?: false

Who can help

@BigBird01 @patil-suraj

Information

Model I am using (DeBERTa v2):

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Create this file as index.py:
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-xlarge-v2')
model = AutoModel.from_pretrained('microsoft/deberta-xlarge-v2')

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

last_hidden_states = outputs.last_hidden_state
print(outputs)
  2. Run the file

  3. You'll get:

(venv) root@16gb:~/deberta# python3 index.py
Traceback (most recent call last):
  File "index.py", line 4, in <module>
    tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-xlarge-v2')
  File "/root/deberta/venv/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 398, in from_pretrained
    return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/deberta/venv/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1788, in from_pretrained
    return cls._from_pretrained(
  File "/root/deberta/venv/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1860, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/deberta/venv/lib/python3.8/site-packages/transformers/models/deberta/tokenization_deberta.py", line 542, in __init__
    if not os.path.isfile(vocab_file):
  File "/usr/lib/python3.8/genericpath.py", line 30, in isfile
    st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
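
The bottom frame of this traceback can be reproduced in isolation, independent of transformers: `os.path.isfile` catches `OSError` and `ValueError` but not `TypeError`, so when the tokenizer's `vocab_file` resolves to `None`, `os.stat` raises exactly this error instead of the check returning `False`. A minimal sketch:

```python
import os

# Reproduces the bottom frame of the traceback: genericpath.isfile passes
# its argument straight to os.stat, which rejects a None path with a
# TypeError rather than returning False.
try:
    os.path.isfile(None)
except TypeError as err:
    print(err)
```

Any tokenizer whose vocab file cannot be resolved from the model repo ends up on this same code path.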

I tried this with the DeBERTa v1 models and there was no error. I see the same behavior when using DebertaTokenizer and DebertaModel directly.

Expected behavior

No error.

@patil-suraj (Contributor)

Hi @205g0

Thank you for reporting this!

microsoft/deberta-xlarge-v2 uses a sentencepiece vocabulary, and sentencepiece support is not implemented for the DeBERTa tokenizer; that is the reason for this error.


205g0 commented Feb 9, 2021

Hey Suraj, thanks for the quick response and good to know!

@patil-suraj (Contributor)

@BigBird01, do you think you could add the missing tokenizer? Otherwise, I could add it. Thanks!

@LysandreJik (Member)

DeBERTa-v2 is not available in the library yet. We're working towards it with @BigBird01.

@BigBird01 (Contributor)

Thanks @205g0 for the interest in DeBERTa-v2. We are working on it with @LysandreJik; hopefully it will be available soon. You can check our PR for the progress.

@patil-suraj (Contributor)

Oh sorry, @BigBird01, I did not realize that this was a work in progress.


BigBird01 commented Feb 9, 2021

> Oh sorry, @BigBird01, I did not realize that this was a work in progress.

No worries, @patil-suraj. Thanks for your quick response. We are glad to integrate these SOTA NLU models with HF to benefit the community. :)

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
