Set pad_token in run_glue_no_trainer.py #28534 #30157

JINO-ROHIT · 2024-04-10T06:28:02Z

What does this PR do?

This PR sets the tokenizer's pad token in the run_glue_no_trainer.py script under examples.
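The diff itself is not reproduced in this thread, so the following is only a minimal sketch of the general idea, not the code that was merged. It assumes the tokenizer has no pad token and falls back to its EOS token; that fallback and the helper name `ensure_pad_token` are assumptions made for illustration. The attribute names (`pad_token`, `eos_token`, `pad_token_id`) match those on Hugging Face tokenizers and model configs, and stand-in objects are used here so the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def ensure_pad_token(tokenizer, model_config):
    """Hypothetical helper: give the tokenizer a pad token if it lacks one
    and mirror the resulting id on the model config."""
    if tokenizer.pad_token is None:
        # Assumption: reuse the EOS token for padding.
        tokenizer.pad_token = tokenizer.eos_token
        # The stand-in objects below need the id set explicitly.
        tokenizer.pad_token_id = tokenizer.eos_token_id
    if model_config.pad_token_id is None:
        # Keep the model config in sync so padded positions are recognised.
        model_config.pad_token_id = tokenizer.pad_token_id
    return tokenizer, model_config


# Stand-ins for a tokenizer and config that ship without a pad token.
tok = SimpleNamespace(
    pad_token=None, pad_token_id=None,
    eos_token="<|endoftext|>", eos_token_id=50256,
)
cfg = SimpleNamespace(pad_token_id=None)

ensure_pad_token(tok, cfg)
print(tok.pad_token, cfg.pad_token_id)
```

In a real script the same check would presumably sit right after the tokenizer and model are loaded, before any batch is padded.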

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@amyeroberts @ArthurZucker

amyeroberts

Thanks for adding - just a small suggestion

examples/pytorch/text-classification/run_glue_no_trainer.py

JINO-ROHIT · 2024-04-10T15:20:41Z

done! @amyeroberts

amyeroberts

Thanks for iterating!

For the failing tests, running make fixup and pushing the changes should resolve them.

JINO-ROHIT · 2024-04-10T16:54:10Z

Exception: Found the following copy inconsistencies:

tests/models\roc_bert\test_tokenization_roc_bert.py: copy does not match models.bert.test_tokenization_bert.BertTokenizationTest.test_is_whitespace at line 167
Run make fix-copies or python utils/check_copies.py --fix_and_overwrite to fix them.
make: *** [Makefile:38: repo-consistency] Error 1

I get this error. I tried running both commands, but it still gives the same error. How can I resolve this? @amyeroberts

amyeroberts · 2024-04-11T08:37:02Z

There's something funny going on with the formatting here: files like src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py shouldn't be modified when calling make fixup, and you shouldn't need to run make fix-copies for this PR.

In your environment, make sure you have all of the necessary formatting libraries with pip install -e .[testing]. Once that's done, I'd undo the formatting changes to the unrelated files and then try running make fixup again.
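One possible way to undo formatting changes to unrelated files is to restore each of them from the upstream branch with `git checkout <ref> -- <paths>`. The sketch below only wraps that command; the helper names `restore_command` and `restore_files` are hypothetical, and it assumes `origin/main` is the branch the PR was merged against (as the merge commits in this PR suggest) and that it is run from the repository root.

```python
import subprocess


def restore_command(paths, base_ref="origin/main"):
    """Build the git command that resets `paths` to their state at `base_ref`."""
    if not paths:
        raise ValueError("at least one path is required")
    # `--` separates the ref from the file paths.
    return ["git", "checkout", base_ref, "--", *paths]


def restore_files(paths, base_ref="origin/main"):
    """Run the restore command; raises CalledProcessError if git fails."""
    subprocess.run(restore_command(paths, base_ref), check=True)


# Example: the unrelated file mentioned above.
cmd = restore_command(
    ["src/transformers/models/bigbird_pegasus/modeling_bigbird_pegasus.py"]
)
print(" ".join(cmd))
```

After restoring the unrelated files, the remaining diff should contain only the intended change, which can then be committed before running make fixup again.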

JINO-ROHIT · 2024-04-11T09:41:48Z

@amyeroberts I've recreated my env. Can you help me undo the formatting changes to those files?

JINO-ROHIT added 3 commits April 10, 2024 11:52

adding pad token to fix huggingface#28534

b46d74d

Merge remote-tracking branch 'origin/main' into fix-28534

1f57dc7

adding pad token to fix huggingface#28534

5e68f6a

amyeroberts reviewed Apr 10, 2024

amyeroberts changed the title ~~adding pad token to fix #28534~~ Set pad_token in run_glue_no_trainer.py #28534 Apr 10, 2024

amyeroberts approved these changes Apr 10, 2024

make fixup

bc482a0

Merge remote-tracking branch 'origin/main' into fix-28534

505fbc8

JINO-ROHIT closed this Apr 13, 2024

JINO-ROHIT deleted the fix-28534 branch April 13, 2024 09:18
