
Make (TF) CI faster (test only a random subset of model classes) #24592

Merged · 3 commits · Jun 30, 2023

Conversation

@ydshieh (Collaborator) commented Jun 30, 2023

What does this PR do?

Daily CI currently takes 22h30m to run. @Rocketknight1 might have a way to bring it back down to 19-20 hours.

For some tests, let's test only a (random) subset of the model classes 🙏 .

Here is the timing of some very slow tests currently:

398.44s call     tests/models/bert/test_modeling_tf_bert.py::TFBertModelTest::test_xla_fit
275.59s call     tests/models/bert/test_modeling_tf_bert.py::TFBertModelTest::test_saved_model_creation_extended
217.84s call     tests/models/bert/test_modeling_tf_bert.py::TFBertModelTest::test_compile_tf_model
106.25s call     tests/models/bert/test_tokenization_bert_tf.py::BertTokenizationTest::test_saved_model
77.69s call     tests/models/bert/test_modeling_tf_bert.py::TFBertModelTest::test_onnx_runtime_optimize

and

352.31s call     tests/models/bart/test_modeling_tf_bart.py::TFBartModelTest::test_saved_model_creation_extended
272.56s call     tests/models/bart/test_modeling_tf_bart.py::TFBartModelTest::test_compile_tf_model
270.84s call     tests/models/bart/test_modeling_tf_bart.py::TFBartModelTest::test_xla_fit
132.59s call     tests/models/bart/test_modeling_tf_bart.py::TFBartModelTest::test_onnx_runtime_optimize
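The helper this PR introduces, `get_random_model_classes`, is referenced later in the thread but its code is not shown here. A minimal sketch of what such a random-subset helper could look like, assuming the convention that the first entry of `model_classes` is the base model (the name and signature below are illustrative, not the PR's actual code):

```python
import random

def get_random_model_classes(model_classes, num_classes=2):
    """Keep the base model (first entry) and sample the remaining
    classes at random, so head models still get occasional coverage."""
    # Illustrative sketch only; the real helper in the PR may differ.
    if len(model_classes) <= num_classes:
        return list(model_classes)
    head_classes = random.sample(list(model_classes[1:]), num_classes - 1)
    return [model_classes[0]] + head_classes
```

With four model classes and `num_classes=2`, this always tests the base model plus one randomly chosen model with a head.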

@ydshieh ydshieh changed the title Make (TF) CI faster (by testing only a subset of model classes) Make (TF) CI faster (test only a random subset of model classes) Jun 30, 2023
@ydshieh ydshieh requested review from Rocketknight1 and sgugger June 30, 2023 10:34
@HuggingFaceDocBuilderDev commented Jun 30, 2023

The documentation is not available anymore as the PR was closed or merged.

@sgugger (Collaborator) left a comment

Let's not take a random subset but the first two then. To test the base model and a model with head.

@Rocketknight1 (Member) commented Jun 30, 2023

Some of the very slow tests (like test_saved_model_creation_extended and test_xla_fit) only apply to a few models anyway - they're in test_modeling_tf_core.py, so they shouldn't have a big effect on the total test runtime. I might have a couple of ideas for speeding up test_compile_tf_model, though!

@ydshieh (Collaborator, Author) commented Jun 30, 2023

Let's not take a random subset but the first two then. To test the base model and a model with head.

Would it be ok to take the first one (base model) + a random other one with head?

@Rocketknight1 (Member) commented Jun 30, 2023

Also, I looked a bit closer at this PR and I'm actually a bit scared of some of the changes - in particular, test_pt_tf_model_equivalence is one of the most important tests and picks up lots of implementation problems in TF ports, so I don't want to reduce its coverage!

@ydshieh (Collaborator, Author) commented Jun 30, 2023

@Rocketknight1

But that test is not changed, i.e. it doesn't use get_random_model_classes introduced here. Nothing to fear 😆

@sgugger (Collaborator) commented Jun 30, 2023

Would it be ok to take the first one (base model) + a random other one with head?

I don't like randomness in tests as it makes them flaky.

@ydshieh (Collaborator, Author) commented Jun 30, 2023

Well, in this situation, I do prefer to keep a random head model:

  • We are reducing the number of model classes being tested because of the slow runtime. If we keep a fixed set of model classes, we are likely to miss failures in certain model heads. (And for the tests involved in this PR, all model classes currently pass; if not, probably just one or two fail.)

  • Only slow tests are involved --> no flakiness shown on CircleCI.

    • Sorry, I was wrong about this. But I can change it to apply only to slow tests.

WDYT if I make the changes only to slow tests?

@sgugger (Collaborator) commented Jun 30, 2023

I very much doubt we will have a failure on a model with head and not the others. With the randomness in the test, you won't be able to reproduce easily (and I don't see the test even printing the model class that failed) so I'd keep things reproducible. This is also on TensorFlow which has very low usage, so I don't think it's worth spending too much time over-engineering something.
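For contrast, the deterministic selection sgugger argues for (the base model plus the first model with a head, so any failure is reproducible) could be sketched as follows; the helper name is hypothetical and does not come from the PR:

```python
def get_first_model_classes(model_classes, num_classes=2):
    # Hypothetical helper: take a fixed, deterministic subset so any CI
    # failure can be reproduced exactly. By convention, entry 0 is the
    # base model and entry 1 is the first model with a task head.
    return list(model_classes[:num_classes])
```

Because the subset never changes between runs, a failing test always names the same model class, which addresses the reproducibility concern raised above.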

@ydshieh (Collaborator, Author) commented Jun 30, 2023

OKOK

@ydshieh (Collaborator, Author) commented Jun 30, 2023

@Rocketknight1 OK for you?

@Rocketknight1 (Member) left a comment

Yeah, I'm happy with this!

@ydshieh ydshieh merged commit 3441ad7 into main Jun 30, 2023
@ydshieh ydshieh deleted the save_our_poor_ci branch June 30, 2023 14:54