#24028 seems to break the last epoch for a logging integration #24939

Closed
1 of 4 tasks
franz101 opened this issue Jul 19, 2023 · 2 comments

Comments

franz101 commented Jul 19, 2023

System Info

Hey @muellerzr,

Thanks for your lightning-fast (accelerated) reply ;) Regarding #24028, I'm currently debugging what's causing the issue.

Setup:

  • A custom callback logs embeddings; the data collator in the Trainer is wrapped to extract the ids of each sample in a batch
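
As a rough illustration of this kind of setup, here is a minimal, library-free sketch of a collator wrapper that records sample ids before delegating to the real collator. The names IdCapturingCollator, stack_columns and the "id" key are hypothetical and not taken from the dataquality code; the actual integration may work differently.

```python
from typing import Any, Callable, Dict, List


class IdCapturingCollator:
    """Wraps a collate function and records the ids of the samples in each batch."""

    def __init__(self, collate_fn: Callable[[List[Dict[str, Any]]], Any], id_key: str = "id"):
        self.collate_fn = collate_fn  # the original collator being wrapped
        self.id_key = id_key          # hypothetical name of the id column
        self.last_ids: List[Any] = []

    def __call__(self, features: List[Dict[str, Any]]) -> Any:
        # Remember which samples are in this batch, then strip the id column
        # so the wrapped collator only sees the columns it expects.
        self.last_ids = [f[self.id_key] for f in features]
        stripped = [{k: v for k, v in f.items() if k != self.id_key} for f in features]
        return self.collate_fn(stripped)


def stack_columns(features: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Stand-in for a real collator: turns a list of rows into a dict of columns."""
    return {key: [f[key] for f in features] for key in features[0]}


collator = IdCapturingCollator(stack_columns)
batch = collator([{"id": 7, "label": 0}, {"id": 9, "label": 1}])
```

With the Trainer, a wrapper like this would presumably be passed through the data_collator argument, and a callback would then read last_ids at the end of each step. That pairing only works if the collator call and the step it belongs to stay in lockstep.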

Error:

  • The wrapped data collator works fine except in the last step

How to reproduce?
See the Reproduction section below.

For now, this is the only example I can show for reproduction.
My first guess is that it's related to multiprocessing: it seems like the custom collator is not called in the last step.
I will follow up with more details or a possible solution soon.

Who can help?

@muellerzr

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

git clone https://github.com/rungalileo/dataquality.git
cd dataquality
python -m venv .venv
source .venv/bin/activate
pip install invoke
inv all
pip install --upgrade transformers
pytest tests/integrations/hf/test_text_classification_hf.py -s -k test_remove_unused_columns

Expected behavior

The test should finish, with the collator called for every step, including the last one.

franz101 commented Jul 19, 2023

Currently reproducible in this Colab example.

franz101 reopened this Jul 19, 2023

franz101 commented

Found the issue: it seems that on_step_end is called only after two data points have been collated.
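
To make that ordering concrete, here is a small pure-Python simulation. It assumes (this is not confirmed in the thread) that the data loader fetches one batch ahead of the one it hands to the training loop; lookahead and collated are hypothetical stand-ins, not Trainer or Accelerate APIs.

```python
def lookahead(batches):
    """Yield each batch only after the next one has already been fetched."""
    it = iter(batches)
    try:
        current = next(it)
    except StopIteration:
        return  # empty input: nothing to yield
    for upcoming in it:
        # The next batch is already collated before the current one is used.
        yield current
        current = upcoming
    yield current  # last batch: no further collation happens here


events = []


def collated(n_batches):
    """Stand-in for a data loader that logs every collator call."""
    for i in range(n_batches):
        events.append(f"collate {i}")
        yield i


for step in lookahead(collated(3)):
    # Stand-in for the callback's on_step_end hook.
    events.append(f"step_end {step}")
```

Under that assumption, the first step end is logged only after two collate calls, and no collate call happens before the final step end, so a callback that reads "the ids from the latest collate call" would be off by one batch and see nothing new on the last step.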
