Training Error: RuntimeError: Could not infer dtype of NoneType #39

Open
davidberenstein1957 opened this issue Jun 22, 2023 · 1 comment

davidberenstein1957 commented Jun 22, 2023

Hi, I was trying to distill a model but it resulted in an error.

import os
from fastcoref import TrainingArgs, CorefTrainer, LingMessCoref

texts = ["My sister has a dog. She loves him. Some like to play football, others like basketball.", "Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen. Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers. Paul and Bill used a teletype terminal at their high school, Lakeside, to develop their programming skills on several time-sharing computer systems."]

model = LingMessCoref()
model.predict(texts=texts[0], output_file='train_file_with_clusters.jsonlines')

args = TrainingArgs(
    output_dir='test-trainer',
    overwrite_output_dir=True,
    model_name_or_path='roberta-base',
    device="mps",
    # device='cuda:2',
    # epochs=129,
    # logging_steps=100,
    # eval_steps=100
)   # you can control other arguments such as learning head and others.

trainer = CorefTrainer(
    args=args,
    train_file='train_file_with_clusters_save.jsonlines', 
    # nlp=nlp # optional, for custom nlp class from spacy
)
trainer.train()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 21
      5 args = TrainingArgs(
      6     output_dir='test-trainer',
      7     overwrite_output_dir=True,
   (...)
     13     # eval_steps=100
     14 )   # you can control other arguments such as learning head and others.
     16 trainer = CorefTrainer(
     17     args=args,
     18     train_file='train_file_with_clusters_save.jsonlines', 
     19     # nlp=nlp # optional, for custom nlp class from spacy
     20 )
---> 21 trainer.train()
     22 # trainer.evaluate(test=True)
     24 trainer.push_to_hub('f-coref-xlm-roberta-base-ontonotes5')

File ~/Documents/programming/open-source/coref-test/.venv/lib/python3.10/site-packages/fastcoref/trainer.py:197, in CorefTrainer.train(self)
    195 batch['input_ids'] = torch.tensor(batch['input_ids'], device=self.device)
    196 batch['attention_mask'] = torch.tensor(batch['attention_mask'], device=self.device)
--> 197 batch['gold_clusters'] = torch.tensor(batch['gold_clusters'], device=self.device)
    198 if 'leftovers' in batch:
    199     batch['leftovers']['input_ids'] = torch.tensor(batch['leftovers']['input_ids'], device=self.device)

RuntimeError: Could not infer dtype of NoneType
@shon-otmazgin
Owner

Which version are you using? By the way, I haven't tested it with mps. Can you upgrade to the latest fastcoref version and run it on cpu?
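The traceback shows `batch['gold_clusters']` reaching `torch.tensor` as `None`, so one thing worth ruling out, independent of the device, is a training record that carries no cluster annotations. Below is a minimal diagnostic sketch, not part of fastcoref: the helper `find_records_without_clusters` is hypothetical, and the field name `clusters` is an assumption about the jsonlines schema (pass a different `key` if your file uses another name). Note also that the snippet in the report writes predictions to `train_file_with_clusters.jsonlines` but trains on `train_file_with_clusters_save.jsonlines`, so it may be worth checking the file that is actually passed to the trainer.

```python
import json


def find_records_without_clusters(path, key="clusters"):
    """Return 1-based line numbers of JSON-lines records whose `key` is missing or null.

    `key="clusters"` is an assumed field name, not confirmed by the thread.
    """
    bad_lines = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            # .get() returns None both when the key is absent and when it is null
            if record.get(key) is None:
                bad_lines.append(lineno)
    return bad_lines
```

Calling `find_records_without_clusters('train_file_with_clusters_save.jsonlines')` would list the lines to inspect. An empty list suggests the annotations are present and the problem lies elsewhere, for example in the installed version or the device.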
