Fix missing test in `torch_job` #33593

ydshieh · 2024-09-19T13:02:30Z

What does this PR do?

Currently we have

@pytest.mark.generate
class GenerationTesterMixin:

and

class Mamba2ModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin, unittest.TestCase)

(or any model test class)
plus

torch_job = CircleCIJob(
    "torch",
    docker_image=[{"image": "huggingface/transformers-torch-light"}],
    marker="not generate",
    parallelism=6,
    pytest_num_workers=8
)

in CircleCI config.

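So torch_job won't run any test marked as generate. Because the marker is applied to the GenerationTesterMixin class itself, every test in a model test class that inherits from GenerationTesterMixin carries the marker, so all of those tests are excluded from the job, not only the generation tests.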

This PR fixes it
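To make the description above concrete, here is a minimal sketch of how a class-level marker spreads to every inheriting test class. `ToyModelTest`, `marker_names`, and the test method names are hypothetical stand-ins. The second half shows one possible way to narrow the marker to individual methods; that part is an assumption for illustration and not necessarily the change made in this PR.

```python
import unittest

import pytest


@pytest.mark.generate  # class-level marker, as in the description above
class GenerationTesterMixin:
    def test_greedy_generate(self):  # hypothetical test name
        pass


class ModelTesterMixin:
    def test_forward(self):  # hypothetical test name
        pass


class ToyModelTest(ModelTesterMixin, GenerationTesterMixin, unittest.TestCase):
    """Hypothetical stand-in for a real model test class."""


def marker_names(obj):
    """Names of the pytest markers stored on a class or function via `pytestmark`."""
    return {mark.name for mark in getattr(obj, "pytestmark", [])}


# ToyModelTest inherits `pytestmark` from the mixin, so pytest treats every test
# collected from it (including test_forward) as marked `generate`.
print(marker_names(ToyModelTest))  # {'generate'}


# Assumed alternative: mark only the generation tests themselves.
class MethodMarkedGenerationMixin:
    @pytest.mark.generate
    def test_greedy_generate(self):
        pass


class ToyModelTestMethodMarked(
    ModelTesterMixin, MethodMarkedGenerationMixin, unittest.TestCase
):
    """Hypothetical test class using the method-level variant."""


print(marker_names(ToyModelTestMethodMarked))  # set()
print(marker_names(ToyModelTestMethodMarked.test_greedy_generate))  # {'generate'}
print(marker_names(ToyModelTestMethodMarked.test_forward))  # set()
```

With the class-level marker, running pytest with `-m "not generate"` deselects everything in `ToyModelTest`; with the method-level variant, only `test_greedy_generate` would be deselected.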

ydshieh · 2024-09-19T13:02:59Z

cc @ArthurZucker for reference

gante

Thank you for having a look and finding the root cause 🙏

gante · 2024-09-19T13:16:53Z

(the failing test seems related to #33533 cc @zucchini-nlp )

zucchini-nlp · 2024-09-19T13:35:25Z

Those are a bit flaky in no-cache settings. Since the weights are random, we can generate image tokens (they're not OOV anymore) and then at some point fail to get enough image embeddings. Do you think we should override those tests for VLMs so they always run with cache? @gante

imo, not a big deal, for me it never failed locally until I got to CI runs

ArthurZucker

Good catch, was missing some of them indeed!

amyeroberts

👀 Thanks for catching and fixing this!

gante · 2024-09-19T17:48:53Z

TL;DR

@ydshieh the failing test is flaky, I've retriggered the job to make the CI green. No success, I think I need to fix the flakiness first 👀 (in tests/models/pegasus/test_modeling_pegasus.py::PegasusStandaloneDecoderModelTest::test_generate_from_inputs_embeds_decoder_only)
@zucchini-nlp a short-term patch is 100% needed, I can trigger the error locally with CUDA_LAUNCH_BLOCKING=1 py.test tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationModelTest::test_sample_generate_dict_output --flake-finder --flake-runs=10. If we don't patch it, we will have many red CIs out there. I'd suggest whichever solution is the quickest to implement, since the actual fix has longer dependencies (see below) :D
EDIT: see VLM generate: tests can't generate image/video tokens #33623

@zucchini-nlp If I got it right, the error is caused by a generation-time behavior that doesn't exist in pre-trained models. This reminds me of Whisper, which has a bunch of PreTrainedConfig fields to ensure certain tokens are never generated out of position. They are PreTrainedConfig fields, and not GenerationConfig fields, because GenerationConfig didn't exist back then.

The correct long-term fix should then be to parameterize the model (as in the model class, not the tester) to have a GenerationConfig such that the bad generation behavior never happens -- in the test and outside it. This is also related to a chat I had with @Cyrilvallez today, where we identified that a model should be able to specify its own default cache class (and, because caching is a property of generate, it belongs in GenerationConfig). However, we can't define a default GenerationConfig for a model at the moment! I will add this functionality to my task list, so that we can start parameterizing a default GenerationConfig for each model, instead of relying exclusively on PreTrainedConfig to set the model's default behavior.
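The idea of guaranteeing that certain tokens are never generated can be illustrated without any library. The sketch below is a toy under stated assumptions: `mask_banned_tokens`, `sample`, the vocabulary size, and `IMAGE_TOKEN_ID` are all hypothetical, and this is not how `transformers` implements it. It sets the logits of banned ids to negative infinity before sampling, so even a model with random weights cannot emit them.

```python
import math
import random


def mask_banned_tokens(logits, banned_ids):
    """Return a copy of `logits` with the banned token ids set to -inf."""
    banned = set(banned_ids)
    return [-math.inf if i in banned else x for i, x in enumerate(logits)]


def sample(logits, rng):
    """Sample a token id from softmax(logits); -inf logits get probability 0."""
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


IMAGE_TOKEN_ID = 3  # hypothetical id of an image placeholder token
rng = random.Random(0)

# Stand-in for the output of a randomly initialized model over an 8-token vocabulary.
logits = [rng.gauss(0.0, 1.0) for _ in range(8)]

masked = mask_banned_tokens(logits, [IMAGE_TOKEN_ID])
samples = [sample(masked, rng) for _ in range(1000)]
print(IMAGE_TOKEN_ID in samples)  # False
```

A model-level default generation configuration, as proposed above, would be a natural place to declare such a list of banned ids once, so that both tests and real usage pick it up.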

gante · 2024-09-19T18:29:10Z

@ydshieh #33602 should fix one of the tests frequently flaking in the CI runs (tests/models/pegasus/test_modeling_pegasus.py::PegasusStandaloneDecoderModelTest::test_generate_from_inputs_embeds_decoder_only)

ydshieh · 2024-09-20T09:30:19Z

OK, I will rebase once #33602 is merged.

Should I also wait for a fix for

tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationModelTest::test_sample_generate_dict_output

?

gante · 2024-09-20T13:24:14Z

> Should I also wait for a fix for tests/models/video_llava/test_modeling_video_llava.py::VideoLlavaForConditionalGenerationModelTest::test_sample_generate_dict_output

@ydshieh working on it now (cc @zucchini-nlp, who is off for the next few days 🤗 )

gante · 2024-09-20T14:00:25Z

@ydshieh #33623 :D

gante · 2024-09-20T14:43:53Z

@ydshieh rebasing now should get rid of the red CI 🙏

ydshieh requested review from gante and amyeroberts September 19, 2024 13:02

gante approved these changes Sep 19, 2024

ArthurZucker approved these changes Sep 19, 2024

amyeroberts reviewed Sep 19, 2024

amyeroberts approved these changes Sep 19, 2024

gante mentioned this pull request Sep 19, 2024

Generate: remove flakyness in test_generate_from_inputs_embeds_decoder_only #33602

Merged

fix missing tests

e4d4429

ydshieh force-pushed the fix_decorate branch from 3accbdb to e4d4429 Compare September 20, 2024 15:05

ydshieh merged commit 31caf0b into main Sep 20, 2024
19 checks passed

ydshieh deleted the fix_decorate branch September 20, 2024 15:16

