Loss calculation error #55
Comments
Hi Jian, can you give me a small (self-contained) example showing how to get this error?
Hi Thomas! I modified the code in your usage example:

```python
import torch
from pytorch_pretrained_bert.modeling import BertForMaskedLM
from pytorch_pretrained_bert import BertTokenizer

model = BertForMaskedLM.from_pretrained('bert-base-uncased')

# Tokenize input
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokenized_text = tokenizer.tokenize(text)

# Ground-truth vocabulary indices, taken before masking
indexed_truths = tokenizer.convert_tokens_to_ids(tokenized_text)

# Mask a token that we will try to predict back with `BertForMaskedLM`
masked_index = 6
tokenized_text[masked_index] = '[MASK]'

# Convert the masked tokens to vocabulary indices
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
indexed_truths_tensor = torch.tensor([indexed_truths])

# Evaluate loss (passing masked_lm_labels makes the model return the masked LM loss)
model.eval()
masked_lm_loss = model(tokens_tensor, masked_lm_labels=indexed_truths_tensor)
print(masked_lm_loss)
```
Thank you, you are right, I fixed that on master. It will be in the next release.
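For context, the shape mismatch behind the reported error can be reproduced without BERT at all. Below is a minimal sketch with dummy tensors whose shapes mirror the script above (batch 1, sequence length 11, vocabulary 30522); it illustrates why the unflattened call fails and why flattening fixes it, and is not library code:

```python
import torch
from torch.nn import CrossEntropyLoss

loss_fct = CrossEntropyLoss()

# Dummy masked-LM outputs and labels: (batch, seq_len, vocab) and (batch, seq_len)
prediction_scores = torch.randn(1, 11, 30522)
labels = torch.randint(0, 30522, (1, 11))

# Unflattened, CrossEntropyLoss treats dim 1 (the sequence length, 11) as the class
# dimension, so it expects a target of shape (1, 30522) and rejects (1, 11);
# this is the "Expected target size (1, 30522), got torch.Size([1, 11])" error.
try:
    loss_fct(prediction_scores, labels)
except Exception as err:
    print(err)

# Flattening batch and sequence dimensions gives (N, vocab) logits and (N,) targets,
# which is the layout CrossEntropyLoss expects for a per-token loss.
loss = loss_fct(prediction_scores.view(-1, 30522), labels.view(-1))
print(loss)
```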
https://github.com/huggingface/pytorch-pretrained-BERT/blob/982339d82984466fde3b1466f657a03200aa2ffb/pytorch_pretrained_bert/modeling.py#L744
Got `ValueError: Expected target size (1, 30522), got torch.Size([1, 11])` at line 744 of `modeling.py`. I think the line should be changed to `masked_lm_loss = loss_fct(prediction_scores.view([-1, self.config.vocab_size]), masked_lm_labels.view([-1]))`.
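For reference, here is a sketch of the suggested change in isolation; the helper name is hypothetical and `ignore_index=-1` is assumed from the library's masked-LM convention rather than quoted from the fixed source, so this only illustrates the flattening:

```python
from torch.nn import CrossEntropyLoss

def masked_lm_loss_sketch(prediction_scores, masked_lm_labels, vocab_size):
    """Flatten (batch, seq_len, vocab) logits to (batch*seq_len, vocab) and
    (batch, seq_len) labels to (batch*seq_len,) before CrossEntropyLoss,
    as the proposed fix does."""
    loss_fct = CrossEntropyLoss(ignore_index=-1)  # assumption: -1 marks positions without a label
    return loss_fct(prediction_scores.view(-1, vocab_size),
                    masked_lm_labels.view(-1))
```

With the reproduction above, this turns the (1, 11, 30522) scores and (1, 11) labels into (11, 30522) and (11,), which is the pairing `CrossEntropyLoss` accepts.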