
run_qa.py script does not compute eval_loss and gives KeyError: 'eval_loss' with load_best_model_at_end #29801

Closed
ftesser opened this issue Mar 22, 2024 · 3 comments · Fixed by #29867
Labels
bug · Examples · Good Second Issue

Comments

ftesser commented Mar 22, 2024

System Info

  • transformers version: 4.39.0
  • Platform: Linux-5.15.0-100-lowlatency-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.2
  • Accelerate version: 0.27.2
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.0+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help?

The official QA training script run_qa.py raises the following error when run with --load_best_model_at_end and --metric_for_best_model "loss":

Traceback (most recent call last):
  File "/home/fabio/repos/transformers/examples/pytorch/question-answering/run_qa.py", line 716, in <module>
    main()
  File "/home/fabio/repos/transformers/examples/pytorch/question-answering/run_qa.py", line 657, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/fabio/repos/transformers/src/transformers/trainer.py", line 1780, in train
    return inner_training_loop(
  File "/home/fabio/repos/transformers/src/transformers/trainer.py", line 2213, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "/home/fabio/repos/transformers/src/transformers/trainer.py", line 2588, in _maybe_log_save_evaluate
    self._save_checkpoint(model, trial, metrics=metrics)
  File "/home/fabio/repos/transformers/src/transformers/trainer.py", line 2669, in _save_checkpoint
    metric_value = metrics[metric_to_check]
KeyError: 'eval_loss'

full_log.txt
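
For reference, here is a minimal sketch (my paraphrase, not the actual trainer.py source) of the check that fails: run_qa.py's evaluate() returns only the SQuAD metrics, so the eval_-prefixed key the trainer looks up is never present.

# Hypothetical paraphrase of the check in Trainer._save_checkpoint
metrics = {"eval_exact_match": 80.0, "eval_f1": 90.0}  # what run_qa.py's evaluate() returns: no "eval_loss"
metric_to_check = "loss"  # from --metric_for_best_model
if not metric_to_check.startswith("eval_"):
    metric_to_check = f"eval_{metric_to_check}"
metric_value = metrics[metric_to_check]  # KeyError: 'eval_loss'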

@ArthurZucker @sgugger

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

To reproduce the error quickly, you can use a distilled model and limit max_train_samples and max_eval_samples:

python examples/pytorch/question-answering/run_qa.py \
--model_name_or_path deepset/roberta-base-squad2-distilled  \
--dataset_name squad \
--do_train \
--do_eval \
--max_seq_length 384 \
--doc_stride 128 \
--max_train_samples 5 \
--max_eval_samples 2 \
--num_train_epochs 3 \
--load_best_model_at_end \
--metric_for_best_model "loss" \
--evaluation_strategy "epoch" \
--save_strategy "epoch" \
--overwrite_output_dir \
--output_dir ~/tmp/debug_squad/

I have also tested with a regular (non-distilled) model and the full dataset; the error occurs there as well.

Expected behavior

During the evaluation phase, the eval_loss should be computed, and the best model should be selected using the loss metric.
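
As a stopgap (my assumption, not something confirmed in this thread), pointing --metric_for_best_model at a metric the script does compute, e.g. --metric_for_best_model "f1" or "exact_match", avoids the crash, but it does not give you loss-based model selection.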

@amyeroberts added the Examples and bug labels Mar 22, 2024
@ArthurZucker (Collaborator) commented:

That is more of a trainer issue, cc @muellerzr and @SunMarc. Given that it is an example, and we usually don't maintain the examples, I'll set this as a Good Second Issue.

@ArthurZucker added the Good Second Issue label Mar 25, 2024
ftesser (Author) commented Mar 26, 2024

@ArthurZucker @jla524 I saw your pull request #29867, and I agree that adding a message for unsupported metrics is a great idea.

However, I don't understand why loss is not supported in the case of squad2 (#29867 (comment)): the use case is to have the loss available on the validation set so that it can be used to select the best model.

Furthermore, the loss is calculated on the training set. Why isn't it possible to calculate it on the validation set too?
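
For illustration (this is my guess at the cause, not something confirmed in the thread): the model only returns a loss when the label positions are passed to the forward call, and run_qa.py's validation features do not include them.

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Hypothetical illustration: a QA model computes a loss only when
# start/end positions (the labels) are supplied.
tok = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2-distilled")
model = AutoModelForQuestionAnswering.from_pretrained("deepset/roberta-base-squad2-distilled")

enc = tok("Who wrote it?", "It was written by Jane.", return_tensors="pt")

out = model(**enc)
print(out.loss)  # None: no labels passed, so no loss -> no eval_loss in metrics

out = model(**enc, start_positions=torch.tensor([7]), end_positions=torch.tensor([7]))
print(out.loss)  # a scalar tensor: the loss exists once labels are provided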

@ArthurZucker (Collaborator) commented:

@ftesser that is a question for evaluate! It's probably that the loss is not really a metric like f1, and it's logged elsewhere.
